astral-sh / ruff-vscode

A Visual Studio Code extension with support for the Ruff linter.

Lint/Format/Fix failed - error: Ruff crashed, with lines starting with non-Latin characters in jupyter notebook #369

Closed lethefrost closed 7 months ago

lethefrost commented 7 months ago

In a Jupyter notebook, Ruff keeps crashing whenever a line in a code cell starts with a non-Latin character. The same characters in the middle or at the end of a line don't trigger this bug.

To be specific, I have tested Chinese, Japanese, Korean, and Hindi, and they all trigger the crash. I have also tested French, Spanish, and Italian (with accented characters), as well as German, Arabic, and Russian, and none of them crash. (I believe Arabic, Russian, and German are not Latin languages though. There must be some bug against Asian characters 😂😂)

Lines that start with other languages mostly occur in docstrings, but some people also name variables in their native language, so this is a fairly common thing to expect in a notebook.

The error message is as follows:

thread 'main' panicked at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/core/src/str/mod.rs:660:13:
byte index 2 is not a char boundary; it is inside '<the buggy char>' (bytes 0..3) of `<the buggy line>`
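
For context, this kind of panic is raised when Rust string slicing is asked to cut a string in the middle of a multi-byte UTF-8 character. The Python sketch below illustrates that boundary rule and may explain why only some of the scripts tested above crash. `is_char_boundary` and `first_char_width` are hypothetical helpers written for illustration (they are not Ruff's code), and the sample words are assumptions chosen to match the languages mentioned.

```python
def is_char_boundary(data: bytes, index: int) -> bool:
    """Return True if `index` falls between two UTF-8 encoded characters."""
    if index == 0 or index == len(data):
        return True
    if index < 0 or index > len(data):
        return False
    # UTF-8 continuation bytes have the form 0b10xxxxxx; any other byte
    # starts a new character, so the index is a valid boundary.
    return (data[index] & 0xC0) != 0x80


def first_char_width(text: str) -> int:
    """Number of UTF-8 bytes used by the first character of `text`."""
    return len(text[0].encode("utf-8"))


# Illustrative sample words (assumed, not taken from the report).
samples = {
    "Chinese": "测试",
    "Japanese": "テスト",
    "Korean": "테스트",
    "Hindi": "परीक्षण",
    "French": "été",
    "Russian": "тест",
    "Arabic": "اختبار",
}

for name, text in samples.items():
    data = text.encode("utf-8")
    # Print the byte width of the first character and whether byte
    # index 2 (the index from the panic message) is a valid boundary.
    print(name, first_char_width(text), is_char_boundary(data, 2))
```

In this sketch, the first character of each Chinese, Japanese, Korean, and Hindi sample takes three bytes, so byte index 2 lands inside it. The French, Russian, and Arabic samples start with a two-byte character, so index 2 is a valid boundary. That matches the pattern in the report, though it is only an inference from the error message.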

Here is a test example with some Chinese characters for your convenience to reproduce this bug:

def test():
    """测试 <- this line starts with " so the Chinese character doesn't trigger the bug
    测试 <- this line starts with 测 and should cause Ruff to crash
    """
    pass
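
To make the reproduction easier to share, the cell above can be written into a minimal notebook file. This is only a sketch using the standard library: the file name `repro.ipynb` and the helper `build_notebook` are hypothetical, and the notebook skeleton assumes the nbformat 4.4 layout (which does not require cell ids).

```python
import json
from pathlib import Path

# The code cell from the report; the second docstring line starts with 测.
CELL_SOURCE = '''def test():
    """测试 <- this line starts with " so the Chinese character doesn't trigger the bug
    测试 <- this line starts with 测 and should cause Ruff to crash
    """
    pass
'''


def build_notebook(source: str) -> dict:
    """Build a single-cell notebook as a plain dict (assumed nbformat 4.4)."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "outputs": [],
                # Notebooks store cell source as a list of lines.
                "source": source.splitlines(keepends=True),
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


if __name__ == "__main__":
    path = Path("repro.ipynb")  # hypothetical output file name
    notebook = build_notebook(CELL_SOURCE)
    # ensure_ascii=False keeps the Chinese characters readable in the file.
    path.write_text(
        json.dumps(notebook, ensure_ascii=False, indent=1), encoding="utf-8"
    )
    print(f"wrote {path}")
```

Opening the generated file in VS Code with the Ruff extension enabled should exercise the same path as this report, assuming an affected version is installed.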

I am working on macOS 14.2, VS Code 1.85.1, Python 3.12.0, jupyter_core 5.5.0, and Ruff v2023.58.0.

Thank you for your efforts!

konstin commented 7 months ago

I think this is the same bug that was fixed in https://github.com/astral-sh/ruff/pull/9146

lethefrost commented 7 months ago

I think this is the same bug that was fixed in astral-sh/ruff#9146

Thanks!