Open jeertmans opened 1 year ago
Thanks for opening an issue, this is very interesting. Splitting text and markup should be fairly easy to achieve just with the typst-syntax crate. Like you, I'm not sure whether it makes sense to integrate it directly into the compiler, but it could be cool. Let me take a look at the links you posted and think a bit about this.
Thanks for your quick response! I am quite new to Typst, so I don't really know what the best implementation solutions are for your tool. However, I really love the idea behind your tool (especially because you use Rust ^^'), and I'd love to help if I can.
Someone modified LTeX a bit to be compatible with Typst. That way you get spelling and grammar correction directly inside VS Code. It still shows a bunch of false positives, but it's very helpful overall. I'd really love to see a proper version of this for Typst.
I wrote https://github.com/antonWetzel/typst-languagetool, because I am lost without spellcheck. At the moment only a local LanguageTool server is used, because I use it this way. The result can be human readable or consumed by a VSCodium/VSCode problemMatcher.
If there is interest, I can improve the tool for general use.
I don't like the dependency on a third party or a local server (with Java), but I could not find a good open source (+Rust) spellchecker.
Nice @antonWetzel! Do you have a small movie / GIF that showcases how it works?
typst-lt check document.typ #check one time
typst-lt watch document.typ #watch for changes (can be folder)
... --language=en-US #change language
... --rules=typstls.json #use rules file
https://github.com/typst/webapp-issues/assets/59712243/afc93296-28c3-4f10-b6bb-e1084930e22f
https://github.com/typst/webapp-issues/assets/59712243/fa8cf120-0dc4-4296-af36-dfbcbb97fa0d
Very nice, thanks! Does it automatically handle large files?
LT limits the text length for a given check request. Ultimately, you can solve that by splitting the text into chunks.
I tried to add this feature to LTRS, but I recognize this might not be perfect atm.
I split on a parbreak after at least 10,000 letters. The local server has no limit (I think). Longer files crashed on conversion to an HTTP request, because the full text (annotated JSON) is encoded in the request URL.
Yes indeed, but the free online service is limited to 1500 characters (or words), if I remember correctly. 10,000 letters seems quite a lot to be honest, and splitting into multiple processes / threads might help improve performance.
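The chunking approach discussed above (cut at a paragraph break once a chunk holds at least some minimum number of characters) could be sketched roughly as follows. This is not the code from typst-languagetool or LanguageTool-Rust: the function name `split_on_parbreaks` is hypothetical, and a blank line in plain text stands in for Typst's parbreak node as a simplifying assumption.

```rust
/// Split `text` into chunks, cutting only at paragraph breaks (blank lines)
/// once the current chunk holds at least `min_chars` characters.
/// Hypothetical helper: a blank line stands in for a Typst parbreak.
fn split_on_parbreaks(text: &str, min_chars: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    let mut count = 0usize;

    for paragraph in text.split("\n\n") {
        // Restore the paragraph break inside a chunk that is still growing.
        if !current.is_empty() {
            current.push_str("\n\n");
            count += 2;
        }
        current.push_str(paragraph);
        count += paragraph.chars().count();

        // Close the chunk as soon as it is long enough.
        if count >= min_chars {
            chunks.push(std::mem::take(&mut current));
            count = 0;
        }
    }
    // Whatever is left becomes the final (possibly short) chunk.
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

fn main() {
    let text = "First paragraph.\n\nSecond one.\n\nThird.";
    for (i, chunk) in split_on_parbreaks(text, 20).iter().enumerate() {
        println!("chunk {i}: {chunk:?}");
    }
}
```

Each chunk can then be sent as its own check request, possibly in parallel. A real tool would also need to record the offset of each chunk so that reported error positions can be mapped back to the original file.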
Description
Hello, this is quite in line with typst/webapp-issues#7, but I would like to take it a step further and suggest integrating LanguageTool (LT) using LanguageTool-Rust. I am myself the author of this crate, and I have wanted for a long time to integrate LT with LaTeX, or an equivalent tool.
Context
One very annoying thing with LaTeX is its complexity when trying to write external tools (like linters, grammar checkers, formatters, and so on), mainly because it is hard to extract text content from a given TeX file. With Typst, because it is written in a more modern way, I guess that external tool integration should be easier, especially for tools written in Rust too.
How
LanguageTool has a very nice feature which is checking markup text (from http-api):
In LanguageTool-Rust, the implementation looks as follows:
https://github.com/jeertmans/languagetool-rust/blob/04eea4cdb9c7cde70e553b262cc51dd558fc6aa6/src/lib/check.rs#L124-L138
The idea would then be to provide some function, e.g., extract_markup, that would read a Typst file and return an appropriate data structure that escapes all non-text characters using the markup field.
Note about LanguageTool-Rust
Even if the LanguageTool-Rust crate is in quite a good state and tested against a variety of cases, it may not be perfect, and I am open to making modifications if that can help to integrate LT into Typst.
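To make the extract_markup idea from the How section a bit more concrete, here is a minimal, standard-library-only sketch. Everything in it is an assumption for illustration: the `Annotation` enum is not the type used by LanguageTool-Rust, and this toy `extract_markup` only treats `*` and `_` as markup, whereas a real version would walk the syntax tree from a Typst parser. The only part taken from LT itself is the JSON shape, a list of `text` and `markup` entries under `annotation`, which is what the HTTP API's `data` parameter accepts.

```rust
/// One piece of the document: either checkable text or markup to skip.
/// Hypothetical type, not the one defined in LanguageTool-Rust.
#[derive(Debug, PartialEq)]
enum Annotation {
    Text(String),
    Markup(String),
}

/// Toy `extract_markup`: treats `*` and `_` as markup, everything else as
/// text. A real version would walk a Typst syntax tree instead.
fn extract_markup(source: &str) -> Vec<Annotation> {
    let mut out = Vec::new();
    let mut text = String::new();
    for c in source.chars() {
        if c == '*' || c == '_' {
            // Flush the text collected so far before recording the markup.
            if !text.is_empty() {
                out.push(Annotation::Text(std::mem::take(&mut text)));
            }
            out.push(Annotation::Markup(c.to_string()));
        } else {
            text.push(c);
        }
    }
    if !text.is_empty() {
        out.push(Annotation::Text(text));
    }
    out
}

/// Minimal JSON string escaping; other control characters are not handled.
fn json_escape(s: &str) -> String {
    let mut escaped = String::new();
    for c in s.chars() {
        match c {
            '"' => escaped.push_str("\\\""),
            '\\' => escaped.push_str("\\\\"),
            '\n' => escaped.push_str("\\n"),
            '\r' => escaped.push_str("\\r"),
            '\t' => escaped.push_str("\\t"),
            _ => escaped.push(c),
        }
    }
    escaped
}

/// Serialize the annotations into the `{"annotation": [...]}` JSON shape.
fn to_data_json(annotations: &[Annotation]) -> String {
    let items: Vec<String> = annotations
        .iter()
        .map(|a| match a {
            Annotation::Text(t) => format!("{{\"text\":\"{}\"}}", json_escape(t)),
            Annotation::Markup(m) => format!("{{\"markup\":\"{}\"}}", json_escape(m)),
        })
        .collect();
    format!("{{\"annotation\":[{}]}}", items.join(","))
}

fn main() {
    let annotations = extract_markup("A *bold* claim");
    println!("{}", to_data_json(&annotations));
}
```

Keeping the markup in the request, rather than stripping it, means the positions LT reports still line up with the original source, which is what makes the annotated form attractive for an editor integration.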
Integration with CLI
If possible, it would be nice to provide CLI commands, like typst check grammar, that return an annotated text with the grammar errors found by LT.
Summary
I think that spell checking tools are getting so good that it's very interesting to provide easy integration for them. LT is open source and widely used, so I think it is a good candidate.
The implementations I proposed above are just ideas, and I am open to criticism or other suggestions :-)
Use Case
Actually, this feature would not be limited to the Web App, but I don't know if this should be integrated directly into the compiler.
The basic use case is that humans make mistakes, so why not use tools to help reduce them? LT also provides some rewrite suggestions, which is nice.