microsoft / RTVS

R Tools for Visual Studio.
MIT License

Spell Checker support for .Rmd files #3727

Open mkoohafkan opened 7 years ago

mkoohafkan commented 7 years ago

It would be great to have spell-checker support for .Rmd files. VSSpellChecker is a flexible spell-checking extension that currently interfaces with a wide variety of extensions, including MarkdownEditor. Perhaps RTVS could provide the necessary interface for this extension? See the discussion in VSSpellChecker Issue 125 for additional details.

MikhailArkhipov commented 7 years ago

I replied there. Not sure what extension they need, as the RMD classifier is a standard VS classifier, basically the same as the HTML one since I wrote both.

mkoohafkan commented 7 years ago

The discussion seems to have stalled. Right now it is possible to force Rmd files to be opened by the MarkdownEditor extension, which brings in spell checker support as defined by the standard Markdown classifier. This won't be a good solution once Rmd live preview is implemented in RTVS, though.

MikhailArkhipov commented 7 years ago

This is something the extension author needs to handle. All classifiers in VS are based on the same principle. However, I tried R, SQL and C++, and the spell checker does not work there either. From the code it looks like:

a) For some file types the extension parses the code itself, for example HTML (Html Agility Pack) or XML. This is probably because the underlying classifiers do not provide NaturalLanguage classification (text inside tags is just black).
b) Some files, like text, are treated as natural language completely.
c) In other languages it appears to rely on the existence of specific classification spans, such as comments or natural language spans, i.e. PredefinedClassificationTypeNames.NaturalLanguage.
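The three cases above can be sketched roughly as follows. This is an illustrative Python model, not the extension's actual code: the `Span` type, the extension sets, the helper names, and the classification strings are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed classification names, modeled on the ones mentioned in this thread.
NATURAL_LANGUAGE = "natural language"
COMMENT = "comment"

# Hypothetical groupings of file types for cases (a) and (b).
SELF_PARSED = {".html", ".xml"}
PLAIN_TEXT = {".txt"}


@dataclass(frozen=True)
class Span:
    start: int           # inclusive offset into the text
    end: int             # exclusive offset into the text
    classification: str  # name assigned by the editor's classifier


def text_outside_tags(text):
    """Case (a): crude stand-in for a real markup parser.

    Returns (start, end) ranges of text that is not inside <...> tags.
    """
    ranges, start, in_tag = [], 0, False
    for i, ch in enumerate(text):
        if ch == "<" and not in_tag:
            if i > start:
                ranges.append((start, i))
            in_tag = True
        elif ch == ">" and in_tag:
            in_tag = False
            start = i + 1
    if not in_tag and start < len(text):
        ranges.append((start, len(text)))
    return ranges


def ranges_to_check(extension, text, spans):
    """Pick the character ranges a spell checker would look at."""
    ext = extension.lower()
    if ext in SELF_PARSED:
        # (a) the checker parses the markup itself
        return text_outside_tags(text)
    if ext in PLAIN_TEXT:
        # (b) the whole file is natural language
        return [(0, len(text))] if text else []
    # (c) rely on what the classifier marked as comments or natural language
    return [
        (s.start, s.end)
        for s in spans
        if s.classification in (NATURAL_LANGUAGE, COMMENT)
    ]


if __name__ == "__main__":
    code = "x <- 1 # comput the mean"
    spans = [Span(0, 1, "identifier"), Span(7, 24, "comment")]
    print(ranges_to_check(".R", code, spans))
```

Under this model, a file type that falls into case (c) only gets checked where its classifier emits comment or natural language spans, so anything classified differently is never seen by the checker.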

I am not aware of VS editors providing special interfaces to the extension.

RMD, R, RD, HTML, CSS, LESS, SCSS and some other VS classifiers rely on the same code I wrote some years ago.

It doesn't appear to handle markdown correctly either. This is probably because bold, italic, etc. spans are not actually marked as natural language.
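To illustrate the suspected cause, here is a small Python sketch that splits words into those a span-based checker would see and those it would skip. The classification name "markdown bold" and the span layout are hypothetical; real markdown classifiers may use different names.

```python
import re

NATURAL_LANGUAGE = "natural language"  # assumed classification name


def words_by_visibility(text, spans):
    """Return (checked, skipped) word lists.

    `spans` is a list of (start, end, classification) tuples; only words in
    spans classified as natural language count as checked.
    """
    checked, skipped = [], []
    for start, end, classification in spans:
        words = re.findall(r"[A-Za-z']+", text[start:end])
        if classification == NATURAL_LANGUAGE:
            checked.extend(words)
        else:
            skipped.extend(words)
    return checked, skipped


if __name__ == "__main__":
    text = "Some **importnt** text"
    spans = [
        (0, 5, NATURAL_LANGUAGE),
        (5, 17, "markdown bold"),  # hypothetical classification name
        (17, 22, NATURAL_LANGUAGE),
    ]
    print(words_by_visibility(text, spans))
```

In this example the misspelled word sits inside the bold span, so it ends up in the skipped list and would never be flagged.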


A long, long time ago I provided my own spell checker (no longer supported). It also supported outputting errors to the task list and right-click correction... Unfortunately, I didn't have much time to maintain it.

mkoohafkan commented 6 years ago

@MikhailArkhipov that looks like it was a pretty slick spell checker... Is the source code available? Maybe it could be turned into a community effort to support spell checking for Rmd, Rnw, and perhaps even Roxygen text.

MikhailArkhipov commented 6 years ago

It would probably have little value today. It relied on the old HTML editor interfaces (C++/COM), and that code base is only used in the classic ASP.NET designer in VS (aka Web Forms). Razor etc. have a new C# codebase I wrote back for VS 2012. One could learn how to use the modern VS HTML parser by looking at Mads' Web Essentials: https://github.com/madskristensen/WebEssentials2015. Then just use the .NET language packs or (requires Office) use the Office API to spell check; Office supports many more languages.