BetterThanTomorrow / calva

Clojure & ClojureScript Interactive Programming for VS Code
https://marketplace.visualstudio.com/items?itemName=betterthantomorrow.calva

Scanner test error #1974

Open bpringe opened 1 year ago

bpringe commented 1 year ago

The CI run where the failure happened: https://app.circleci.com/pipelines/github/BetterThanTomorrow/calva/5816/workflows/f07dd6cc-9ae0-4911-b1e6-8a32ae66eee2/jobs/26931

image

There was also another failure where the expected output appears to equal the actual output, but I suspect there's some unseen difference. Maybe a change to how the test input is generated is warranted? I don't know whether it's easy to find out what went wrong there, though, in order to make that change.

image

PEZ commented 1 year ago

Thanks!

It should be pretty easy to reproduce this. What I tend to do is create a non-property test with the input that the generator has found. Just looking at such a test can help in determining if it is the generator that should be tweaked or if we should handle the input. It is most often the latter.
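A minimal sketch of such a non-property test, pinned to the input the generator found. Everything here is an illustrative assumption: `toyTokenize` is a hypothetical stand-in and not Calva's scanner API, and the round-trip check is just one invariant a pinned test could assert. Since it is unclear in this thread whether the counterexample holds one or two backslashes before the invisible character, both variants are listed, with the invisible character written as an explicit `\u2028` escape so it survives copy and paste.

```typescript
// Hypothetical stand-in for the function under test; this is NOT Calva's scanner API.
function toyTokenize(input: string): string[] {
  const tokens: string[] = [];
  // A backslash plus any one code unit, or any single code unit as a fallback.
  // `[\s\S]` is used instead of `.` so that line terminators are matched too.
  const tokenPattern = /\\[\s\S]|[\s\S]/g;
  let match: RegExpExecArray | null;
  while ((match = tokenPattern.exec(input)) !== null) {
    tokens.push(match[0]);
  }
  return tokens;
}

// Pinned inputs (assumption: one or two backslashes followed by U+2028).
const counterexamples: string[] = [
  "\\\u2028",   // one backslash, then U+2028
  "\\\\\u2028", // two backslashes, then U+2028
];

// Example-based regression test: every pinned input must round-trip,
// i.e. concatenating the tokens gives back the original text.
function regressionTestCounterexamples(): void {
  for (const input of counterexamples) {
    const tokens = toyTokenize(input);
    if (tokens.join("") !== input) {
      throw new Error("tokens do not round-trip input " + JSON.stringify(input));
    }
  }
}

regressionTestCounterexamples();
```

Reading a test like this next to the property test makes it easier to judge whether the generator produced an unreasonable input or the code under test should handle it.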

In this case it is the "tokenizes literal unicode characters" test that exposes an issue with some input:

  1) Scanner
       simple
         tokenizes literal characters
           tokenizes literal unicode characters:
     Property failed after 1 tests
{ seed: -2050306983, path: "0:13", endOnFailure: true }
Counterexample: ["\\
"]
Shrunk 1 time(s)
Got error: Error: expect(received).toBe(expected) // Object.is equality

We have both the seed and the counterexample. In this case there is some funny character at the end here that is not visible to the eye in the output, but if I copy it and paste it in the terminal it is pasted looking like a space character:

image

But it looks like this when pasted here:

bash-3.2$ echo -n '\\
' | od -h
0000000      5c5c    80e2    00a8
0000005
bash-3.2$

In contrast, here it is with just two backslashes:

image
bash-3.2$ echo -n '\\' | od -h
0000000      5c5c
0000002
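The two `od -h` dumps can be read mechanically. `od -h` prints 16-bit words in hexadecimal, and on a little-endian machine the low byte of each word is the first byte in the input, so `5c5c 80e2 00a8` with a length of 5 corresponds to the bytes `5c 5c e2 80 a8`. The following sketch (helper names are made up for illustration, and little-endian word order is assumed) reverses that and decodes the bytes as UTF-8, which identifies the invisible character as U+2028 LINE SEPARATOR:

```typescript
// Convert `od -h` words (16-bit, little-endian, hex) back into bytes.
function odWordsToBytes(words: string[], byteCount: number): number[] {
  const bytes: number[] = [];
  for (const word of words) {
    const value = parseInt(word, 16);
    bytes.push(value & 0xff);        // low byte is first in the input
    bytes.push((value >> 8) & 0xff); // high byte is second
  }
  return bytes.slice(0, byteCount);  // drop the padding byte of an odd-length input
}

// Minimal UTF-8 decoder for 1- to 3-byte sequences (enough for this input).
function decodeUtf8CodePoints(bytes: number[]): number[] {
  const codePoints: number[] = [];
  let i = 0;
  while (i < bytes.length) {
    const b = bytes[i];
    if (b < 0x80) {
      codePoints.push(b);
      i += 1;
    } else if ((b & 0xe0) === 0xc0) {
      codePoints.push(((b & 0x1f) << 6) | (bytes[i + 1] & 0x3f));
      i += 2;
    } else if ((b & 0xf0) === 0xe0) {
      codePoints.push(((b & 0x0f) << 12) | ((bytes[i + 1] & 0x3f) << 6) | (bytes[i + 2] & 0x3f));
      i += 3;
    } else {
      throw new Error("sequence longer than 3 bytes is not handled in this sketch");
    }
  }
  return codePoints;
}

// Format a code point as U+XXXX.
function formatCodePoint(codePoint: number): string {
  return "U+" + ("0000" + codePoint.toString(16).toUpperCase()).slice(-4);
}

const dumpedBytes = odWordsToBytes(["5c5c", "80e2", "00a8"], 5);
const dumpedCodePoints = decodeUtf8CodePoints(dumpedBytes).map(formatCodePoint);
console.log(dumpedCodePoints.join(" ")); // U+005C U+005C U+2028
```

So the pasted input is two backslashes followed by U+2028, while the second dump contains only the two backslashes.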

My guess is that if we pasted this counterexample into a Clojure file, Calva might lock up completely, or just start to behave really funny around it. We have a catch-all in the scanner that should handle this, but it obviously fails when the character is escaped with one (or maybe it takes two) backslashes. This is bad of course, but these days VS Code helps by highlighting funny Unicode characters, so the user should at least get a clue.
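One possible explanation, offered as a hypothesis and not confirmed for Calva's scanner: in JavaScript regular expressions, `.` does not match line terminators unless the `s` (dotAll) flag is set, and U+2028 LINE SEPARATOR counts as a line terminator. A pattern that relies on `.` to mean "any character" after a backslash would therefore skip this input. The sketch below uses made-up patterns, not the scanner's real ones, to show the difference:

```typescript
// Hypothetical patterns for illustration; these are NOT taken from Calva's scanner.
const escapedLineSeparator = "\\" + "\u2028"; // one backslash followed by U+2028

// `.` excludes line terminators (\n, \r, U+2028, U+2029) by default.
const dotPattern = /^\\./;
// `[\s\S]` matches any single code unit, line terminators included.
const anyCharPattern = /^\\[\s\S]/;

const dotMatches = dotPattern.test(escapedLineSeparator);         // false
const anyCharMatches = anyCharPattern.test(escapedLineSeparator); // true

console.log(dotMatches, anyCharMatches); // false true
```

If a similar pattern turns out to be involved, using `[\s\S]` or the `s` flag in the catch-all would be one way to cover such characters; whether that is the right fix depends on how the scanner is supposed to treat them.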