Myriad-Dreamin / tinymist

Tinymist [ˈtaɪni mɪst] is an integrated language service for Typst [taɪpst].
https://myriad-dreamin.github.io/tinymist
Apache License 2.0

Incorrect report of `incorrect delimiter` error when unicode characters are used beforehand #602

Open Iorvethe opened 2 months ago

Iorvethe commented 2 months ago

Describe the bug: This is best understood with a screencast, as shown here, using the following file content.

$𝒪$ _text_ text

The steps to reproduce are:

  1. Open the file
  2. Move the delimiter to the end
  3. Undo and redo the delimiter move several times
  4. Error pops up

Another way to trigger it is to replace the 𝒪 with an O, as shown here.

Package/Software version:

tinymist extension version: v0.11.1. Get it by tinymist --version in terminal.

tinymist
Build Timestamp:     1980-01-01T00:00:00.000000000Z
Build Git Describe:  VERGEN_IDEMPOTENT_OUTPUT
Commit SHA:          VERGEN_IDEMPOTENT_OUTPUT
Commit Date:         None
Commit Branch:       None
Cargo Target Triple: x86_64-unknown-linux-gnu
Typst Version:       0.11.1

tinymist -V

tinymist 0.11.20

Iorvethe commented 2 months ago

I think it has something to do with the indices: when working on a bigger document and making similar edits, the index errors seem to accumulate, up to the point where the parser misses certain tags or headings entirely (this is seen, for example, when searching the document symbols with tinymist).

Myriad-Dreamin commented 2 months ago

I can't tell what is happening from the screencast, because it looks so weird. But I will look into it myself.

Iorvethe commented 2 months ago

Thanks for looking into it! And sorry for the mediocre quality of the screencasts: it appears that not everything I see in my terminal shows up in the casts (the underline for the errors, specifically), and I realize that the chain of actions is not very clear…

I think that the second case is perhaps easier to reproduce, and is essentially the same issue. So, with the following file:

$𝒪$

The steps are to:

  1. Open the file
  2. Delete the inside of the equation
  3. Type any other character
  4. Tinymist reports an unclosed delimiter at the position of the first $

I hope this is easier to reproduce, and feel free to ask me for any other information.
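
One hypothetical way the steps above could end in an unclosed delimiter, sketched in Python. This is an assumption and not a confirmed cause: nothing in this thread establishes which offset units Helix or tinymist actually use. The only firm fact relied on is that 𝒪 (U+1D4AA) takes two UTF-16 code units but is a single code point, and that LSP positions are counted in UTF-16 code units by default. The helpers `utf16_to_index` and `apply_edit` are made up for this illustration.

```python
def utf16_to_index(text: str, units: int) -> int:
    """Convert a UTF-16 code-unit offset into a Python (code-point) index."""
    seen = 0
    for i, ch in enumerate(text):
        if seen >= units:
            return i
        # Characters above U+FFFF need a surrogate pair, i.e. two units.
        seen += 2 if ord(ch) > 0xFFFF else 1
    return len(text)


def apply_edit(text: str, start: int, end: int, new: str, *, units: str) -> str:
    """Replace [start, end) with `new`; offsets are interpreted in `units`."""
    if units == "utf16":
        start = utf16_to_index(text, start)
        end = utf16_to_index(text, end)
    return text[:start] + new + text[end:]


doc = "$\U0001D4AA$"  # the file content: $𝒪$

# Step 2: delete the inside of the equation.
# Assumed edit range [1, 3), expressed in UTF-16 code units.
client = apply_edit(doc, 1, 3, "", units="utf16")      # editor's view
server = apply_edit(doc, 1, 3, "", units="codepoint")  # hypothetical misreading

# Step 3: type any other character at offset 1.
client = apply_edit(client, 1, 1, "x", units="utf16")
server = apply_edit(server, 1, 1, "x", units="codepoint")

print(repr(client))  # '$x$'
print(repr(server))  # '$x'  (closing delimiter lost)
```

In this sketch the side that misreads the range also deletes the closing `$`, so its copy of the document ends up as `$x` and an unclosed delimiter at the first `$` would be the expected diagnostic. Replacing 𝒪 with a plain O makes both interpretations agree.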

Edit: on second thought, maybe the issue is with the editor. For your information, I use Helix.

Myriad-Dreamin commented 1 month ago

I have checked it several times but I cannot reproduce it in VS Code. I also cannot understand why diagnostics appear and disappear just from moving the cursor in Helix.

Iorvethe commented 1 month ago

Thanks for spending time on it! It’s a bummer that it can’t be triggered in VS Code… As for the diagnostics in Helix, I think that they are always present, but shown on-screen only when the cursor is on a position for which there is a diagnostic. In this case, it’s on the first delimiter, but something weird happens when an additional $ is added at the end: the diagnostic moves all the way over to the additional $. See the video below.

report.webm

Here are a few observations that may be relevant:

  1. Deleting the character, to trigger the unwanted diagnostic, and entering it again doesn’t remove the diagnostic.
  2. However, restarting the LSP (or the editor) removes it.
  3. Similarly, deleting the whole delimited section (emphasis, equation, etc.) also removes the diagnostic.
  4. There isn’t a similar problem with other markup LSPs (marksman, texlab), which is why I initially believed that it was an issue with tinymist and not with Helix.
  5. It seems to be triggered by specific subsets of the Unicode standard. For example, symbols from this subset trigger it consistently. From limited testing, it appears that all blocks from Unicode 3.1 are problematic, but older ones, such as those from Unicode 1.1, seem alright.

Observations 1 and 2 may hint at an issue with the incremental update? However, I have knowledge of neither Helix nor Tinymist, so I’ll refrain from further wild guessing. But, if I can be of any help, please let me know!
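
For what it’s worth, observation 5 can be made a bit more concrete. Unicode 3.1 was the first version to assign characters outside the Basic Multilingual Plane (above U+FFFF), and those are exactly the characters whose length differs depending on whether one counts code points, UTF-16 code units, or UTF-8 bytes. The small Python check below only shows that difference; whether it is actually related to the bug is my assumption, and the `widths` helper is just a name picked for this illustration.

```python
def widths(ch: str) -> dict:
    """Count how many units one character occupies under each scheme."""
    return {
        "code_points": len(ch),
        "utf16_units": len(ch.encode("utf-16-le")) // 2,
        "utf8_bytes": len(ch.encode("utf-8")),
    }


# Plain "O" versus the script O (U+1D4AA) from the reproduction file.
for ch in ["O", "\U0001D4AA"]:
    print(f"U+{ord(ch):04X}", widths(ch))
```

The plain O counts as 1 under all three schemes, while U+1D4AA counts as 1 code point, 2 UTF-16 code units, and 4 UTF-8 bytes, so any two components that disagree on the unit would compute different columns after such a character.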