bartosz-antosik / vscode-spellright

Multilingual, Offline and Lightweight Spellchecker for Visual Studio Code

Support Pandoc / R Markdown / Quarto special features #588

Open allefeld opened 2 months ago

allefeld commented 2 months ago

Documents in Pandoc Markdown, including R Markdown and Quarto, have a few special features which currently interfere with spell checking. It would be great if the extension could be adapted to handle these features.

As a workaround, I defined the following "ignore" regular expressions:

    "spellright.ignoreRegExpsByClass": {
        "markdown": [
            "/\\\\(?:begin|end){.*?}/g",
            "/\\\\[a-zA-Z]*\\(?/g",
            "/^---\\n[^]*?\\n---$/gm"
        ],
        "quarto": [
            "/\\\\(?:begin|end){.*?}/g",
            "/\\\\[a-zA-Z]*\\(?/g",
            "/^---\\n[^]*?\\n(?:---)$/gm"
        ]
    }

But according to the README such expressions "may have serious impact on performance", so maybe a built-in solution would be better?
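To see what each of these expressions matches on its own, here is a minimal sketch that runs them against a small sample document in plain JavaScript. It does not use the extension; `parseRegExpLiteral`, `matchesPerPattern` and the sample text are hypothetical names made up for illustration, and how Spell Right applies the matches internally is not shown here.

```javascript
// Hypothetical helper (not part of Spell Right): convert a "/pattern/flags"
// string, as written in settings.json, into a RegExp object.
function parseRegExpLiteral(literal) {
  const lastSlash = literal.lastIndexOf("/");
  return new RegExp(literal.slice(1, lastSlash), literal.slice(lastSlash + 1));
}

// The three expressions from the settings above. JavaScript string escaping
// matches JSON escaping here, so they can be copied verbatim.
const ignorePatterns = [
  "/\\\\(?:begin|end){.*?}/g", // \begin{...} and \end{...}
  "/\\\\[a-zA-Z]*\\(?/g",      // \command, optionally followed by "("
  "/^---\\n[^]*?\\n---$/gm",   // YAML front matter between --- lines
];

// Made-up sample: a YAML header followed by a line containing raw LaTeX.
const sampleDocument = [
  "---",
  "title: Demo",
  "---",
  "",
  "Let \\begin{equation} y = \\sin(x) \\end{equation} hold.",
].join("\n");

// For each pattern, list every substring it matches in the text.
function matchesPerPattern(text, patterns) {
  return patterns.map(
    (literal) => text.match(parseRegExpLiteral(literal)) || []
  );
}

const results = matchesPerPattern(sampleDocument, ignorePatterns);
ignorePatterns.forEach((literal, i) => console.log(literal, "->", results[i]));
```

On this sample, the second expression also matches `\begin` and `\end`, so it overlaps with the first one. The first is still needed to cover the `{equation}` argument.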

For some reason I don't understand, I had to include \\(? in the second regular expression, because otherwise e.g. \sin in \sin(x) is not ignored. Even more strangely, in that case the spellchecker complains not about \sin but about ____ (four underscore characters).
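As a quick check of the regular expression in isolation, the following sketch compares the second expression with and without the trailing \(? on the string \sin(x). It only exercises JavaScript's regex engine, and the variable names are made up for illustration. Both variants match the \sin command, which suggests the difference seen in the editor comes from how the matched text is processed afterwards rather than from the pattern failing to match. That reading is an assumption, not something confirmed from the extension's code.

```javascript
// Second ignore expression as used in the settings (with optional "(").
const withParen = /\\[a-zA-Z]*\(?/g;
// Same expression without the trailing \(? part.
const withoutParen = /\\[a-zA-Z]*/g;

const sample = "y = \\sin(x)";

// Return all matches of a global regex, or an empty array if there are none.
function allMatches(regex, text) {
  return text.match(regex) || [];
}

console.log(allMatches(withParen, sample));    // [ '\\sin(' ]
console.log(allMatches(withoutParen, sample)); // [ '\\sin' ]
```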

connortwiegand commented 1 month ago

For an internal solution, it looks like the main files to modify would be lib/parsers/markdown.js and lib/doctype.js. Two ideas come to mind. One is to treat Quarto YAML headers as "comments", which may require adding comment support to the Markdown parser. The other is to add the suggested regexps as additional delimiters of a "code" block. I'm unsure whether that would impact performance significantly (my guess is that it would not).

allefeld commented 1 month ago

I've been using these regexps for a while now, and I did not notice decreased performance. I still think this would be a good improvement, but basically I'm fine for now, so it might be enough to mention my regexps (or better versions) in the documentation.