lexical-lsp / lexical

Lexical is a next-generation Elixir language server

Fixing crash when dealing with unicode files #672

Closed scohen closed 6 months ago

scohen commented 6 months ago

The change to binread introduced a fairly complicated Unicode bug that caused message decoding to partially fail. We were previously using read, which would occasionally return unencoded data, and we patched that with a utf8 -> latin1 conversion. That conversion fails now that we actually read the raw bytes rather than going through Elixir's UTF-8-friendly IO.read functions.

This change removes the encoding step, since we're now reading the raw bytes, as we should.
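
To illustrate why a utf8 -> latin1 step can break once the input is already raw bytes, here is a minimal sketch. The message body is hypothetical, and the use of :unicode.characters_to_binary/3 is an assumption, since the thread does not show the exact conversion call Lexical used.

```elixir
# Hypothetical message body (not taken from the Lexical code base).
# \u00E9 is "é" and \u4E2D is "中".
body = "{\"text\":\"\u00E9\u4E2D\"}"

# A byte-oriented read sees more units than a character-oriented read
# whenever the text contains non-ASCII characters.
byte_count = byte_size(body)
char_count = String.length(body)

# Assumed conversion step: re-encode the UTF-8 bytes as latin1.
# latin1 cannot represent "中", so this returns an error tuple
# rather than a binary.
converted = :unicode.characters_to_binary(body, :utf8, :latin1)

IO.inspect({byte_count, char_count}, label: "bytes vs. characters")
IO.inspect(converted, label: "utf8 -> latin1")
```

Any code that expects a binary back from the conversion would fail on the error tuple, which is consistent with the partial decoding failures described above. Skipping the conversion and passing the raw bytes through avoids the problem.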

scohen commented 6 months ago

@scottming can you take a look at this? You deal with Unicode a lot more than I do, and I want to make sure things are still good.

FYI, on main, opening ast_test.exs would cause my server to crash.

scottming commented 6 months ago

I ran some tests with Simplified Chinese text, and it seems to work very well, behaving much like the previous release.