Closed tsdh closed 1 year ago
No special configuration needed.
The point here is that you give me a means to reproduce the error. Give me instructions from Emacs -Q
, an example rust file. etc. It should be a minimal reproducible example, a way that I can reproduce your error.
Since this has started happening relatively recently to you, it might be related to the recent change for #860, which is f77518711507810b779d87190d0ca0183fc02e10. Can you try reverting that change and checking again?
@tsdh, as a workaround you can try to switch to clangd's extension of utf-8 offsets using eglot-x. Also, it seems this extension found its way into LSP 3.17 as client capability general.positionEncodings, so maybe we should implement this feature in Eglot proper. What do you think, @joaotavora?
Recipe for repro:
β― cd tmp/
β― git clone -b bar https://git.sr.ht/~tsdh/swayr
Cloning into 'swayr'...
remote: Enumerating objects: 1565, done.
remote: Counting objects: 100% (224/224), done.
remote: Compressing objects: 100% (218/218), done.
remote: Total 1565 (delta 135), reused 0 (delta 0), pack-reused 1341
Receiving objects: 100% (1565/1565), 1.45 MiB | 1.55 MiB/s, done.
Resolving deltas: 100% (1090/1090), done.
β― emacs -Q -f package-initialize -L .emacs.d/elpa/eglot-20220406.1459/ -l eglot.el swayr/swayr/src/shared/fmt.rs
Then M-x eglot
and select rust-analyzer
for the applicable language servers.
I got the issue after just moving a bit around and going to an empty line and typing fn <TAB>
.
Wrt. #860: I'm not using company-mode (but corfu
) but as shown in the recipe, it also happens with emacs -Q
. I've reverted that change anyhow so that the function reads
(defun eglot-lsp-abiding-column (&optional lbp)
"Calculate current COLUMN as defined by the LSP spec.
LBP defaults to `line-beginning-position'."
(/ (- (length (encode-coding-region (or lbp (line-beginning-position))
(point) 'utf-16 t))
2)
2))
again but I still get the freeze with above recipe. Now it seems to be even easier to get into that state. I just had to move point a bit.
@tsdh, as a workaround you can try to switch to clangd's extension
@nemethf , but Tassilo is using rust-analyzer, so how can that help?
Also, it seems this extension found its way into LSP 3.17 as client capability general.positionEncodings, so maybe we should implement this feature in Eglot proper. What do you think, @joaotavora?
I think that's a nice idea, but make a PR for it. It should in theory be identical to move-to-column, right?
Recipe for repro:
Thanks, @tsdh , that's exacly what was needed. I'm still missing rust-analyzer
, but that's OK.
@tsdh, as a workaround you can try to switch to clangd's extension
@nemethf , but Tassilo is using rust-analyzer, so how can that help?
Rust-analyzer also implemented this clangd extension. It's a bit surprising and a good sign that there's life outside of the single-company-controlled LSP specification.
Also, it seems this extension found its way into LSP 3.17 as client capability general.positionEncodings, so maybe we should implement this feature in Eglot proper. What do you think, @joaotavora?
I think that's a nice idea, but make a PR for it. It should in theory be identical to move-to-column, right?
Almost:
(defun eglot-move-to-column (column)
"Move to COLUMN without closely following the LSP spec."
;; We cannot use `move-to-column' here, because it moves to *visual*
;; columns, which can be different from LSP columns in case of
;; `whitespace-mode', `prettify-symbols-mode', etc. (github#296,
;; github#297)
Rust-analyzer also implemented this clangd extension. It's a bit surprising and a good sign that there's life outside of the single-company-controlled LSP specification.
OK, then move ahead with the PR!
Almost:
Right right :smile:
Anyway, I wonder if you @nemethf, can reproduce @tsdh problem according to his repro. Since you use rust_analyzer
:-)
Anyway, I wonder if you @nemethf, can reproduce @tsdh problem according to his repro.
Open swayr/swayr/src/shared/fmt.rs
and then M-: (eglot--lsp-position-to-point '(:line 140 :character 37) nil) RET
. Or probably it is enough the save the following into a text file:
assert_eq!(format("{:.4}", "π°π΄ππΆ", "β¦β¦"), "π°π΄ππΆ");
Eval this by C-x C-e:
(eglot--lsp-position-to-point '(:line 0 :character 37) nil)
Also I faintly remember that you had an email exchange with Eli, who suggested there's an easier way to implement eglot-move-to-lsp-abiding-column
. But I'm not sure.
Also I faintly remember that you had an email exchange with Eli, who suggested there's an easier way to implement eglot-move-to-lsp-abiding-column. But I'm not sure.
You're right. But that's an optimization, it doesn't really change the non-terminating algorithm, in principle.
Yeah, @nemethf spotted the problem. I've added a message to the do
of the loop
in eglot-move-to-lsp-abiding-column
. These are the values of (point)
, (current-column)
, and diff
which always repeat.
P:3733, CC:35, D:-1
P:3732, CC:34, D:1
The chars 3732 and 3733 are the ππΆ in "π°π΄ππΆ"
.
@tsdh if you're debugging, an interesting value to have is what this expression evaluates to:
(length (encode-coding-region (line-beginning-position) (point) 'utf-16 t))
...when point is at 3733 and 3732. It
I suspect it never evaluates to 37, at it should (at least on one of the positions).
@joaotavora, it evaluates 74 and 78. Or 66, 70, 74, and 78 when I do it on every glyph of π°π΄ππΆ.
BTW, I just noticed that my test in that file doesn't compile. After fixing it, the problem is gone, most probably because now there's no warning more for that position. I didn't even know that cargo build
succeeds if there are compile errors in test functions which themselves are contained in normal source files. Very interesting...
@joaotavora, it evaluates 74 and 78. Or 66, 70, 74, and 78 when I do it on every glyph of π°π΄ππΆ.
OK. But what I wanted to know is what position LSP requested that Emacs move to. So you should be able to find that out by inspecting the logs at the time you're doing the moving.
My suspicion is the rust_analyzer
is sending invalid numbers of UTF-16 columns (which is what it is supposed to do). The code points in that funky "swan" are probably each two UTF code units wide. So in a with 8 code points "swanπ°π΄ππΆ", LSP :character
positions 5, 7, 9 and 11 are invalid.
Nevertheless, Eglot should be robust enough to detect this.
I think that's the right event:
[server-notification] Fri Apr 8 22:41:59 2022:
(:jsonrpc "2.0" :method "textDocument/publishDiagnostics" :params
(:uri "file:///home/horn/tmp/swayr/swayr/src/shared/fmt.rs" :diagnostics
[(:range
(:start
(:line 138 :character 32)
:end
(:line 138 :character 38))
:severity 1 :code "E0308" :codeDescription
(:href "https://doc.rust-lang.org/error-index.html#E0308")
:source "rustc" :message "mismatched types\nexpected enum `FmtArg`, found `&str`")
(:range
(:start
(:line 139 :character 32)
:end
(:line 139 :character 38))
:severity 1 :code "E0308" :codeDescription
(:href "https://doc.rust-lang.org/error-index.html#E0308")
:source "rustc" :message "mismatched types\nexpected enum `FmtArg`, found `&str`")
(:range
(:start
(:line 140 :character 31)
:end
(:line 140 :character 37))
:severity 1 :code "E0308" :codeDescription
(:href "https://doc.rust-lang.org/error-index.html#E0308")
:source "rustc" :message "mismatched types\nexpected enum `FmtArg`, found `&str`")
(:range
(:start
(:line 142 :character 31)
:end
(:line 142 :character 37))
:severity 1 :code "E0308" :codeDescription
(:href "https://doc.rust-lang.org/error-index.html#E0308")
:source "rustc" :message "mismatched types\nexpected enum `FmtArg`, found `&str`")
(:range
(:start
(:line 143 :character 31)
:end
(:line 143 :character 37))
:severity 1 :code "E0308" :codeDescription
(:href "https://doc.rust-lang.org/error-index.html#E0308")
:source "rustc" :message "mismatched types\nexpected enum `FmtArg`, found `&str`")
(:range
(:start
(:line 144 :character 31)
:end
(:line 144 :character 44))
:severity 1 :code "E0308" :codeDescription
(:href "https://doc.rust-lang.org/error-index.html#E0308")
:source "rustc" :message "mismatched types\nexpected enum `FmtArg`, found `&str`")
(:range
(:start
(:line 145 :character 31)
:end
(:line 145 :character 44))
:severity 1 :code "E0308" :codeDescription
(:href "https://doc.rust-lang.org/error-index.html#E0308")
:source "rustc" :message "mismatched types\nexpected enum `FmtArg`, found `&str`")
(:range
(:start
(:line 146 :character 31)
:end
(:line 146 :character 37))
:severity 1 :code "E0308" :codeDescription
(:href "https://doc.rust-lang.org/error-index.html#E0308")
:source "rustc" :message "mismatched types\nexpected enum `FmtArg`, found `&str`")]
:version 9))
The string "swan"
starts in line 140 (LSP seems to count lines from 0) and begins at char 31 and ends at 37. So that seems correct. Well, of course I'm looking at the utf-8 encoded file.
Here's a C++ file being analyzed with clangd
swab a = 1; π°π΄ππΆπ°π΄ππΆ b;
It has 22 code points. But clangd, when reporting these diagnostics for it, correctly reports columns by counting UTF-16 code units:
(:jsonrpc "2.0" :method "textDocument/publishDiagnostics" :params
(:diagnostics
[(:code "unknown_typename" :message "Unknown type name 'swab' (fix available)" :range
(:end
(:character 4 :line 0)
:start
(:character 0 :line 0))
:severity 1 :source "clang")
(:code "unknown_typename" :message "Unknown type name 'π°π΄ππΆπ°π΄ππΆ'" :range
(:end
(:character 28 :line 0)
:start
(:character 12 :line 0))
:severity 1 :source "clang")]
:uri "file:///home/capitaomorte/Source/Cpp/thingy/stoopid.cpp" :version 2))
This means that the second wrong bit of that C++ program is located at between LSP "characters" 12 and 28 (!), whereas visually, glyph-wise, it is between 12 and 16 obviously. Eglot has to do this conversion, of course, and with Clangd's behaviour it succeeds, because Clangd correctly calculates this things.
What I would like to know is if rust-analyzer is also correctly calculating these things. If Clangd were to give a position of character e.g. of 13, 14, 15, then Eglot would be stuck as in your example.
From your diagnostics, it is still hard to pinpoint exactly what rust-analyzer is talking about.
So maybe I would ask you to make a very simple file such as this one to dispell doubts if its doing the right thing.
I don't know any Rust, unfortunately. Is that C++ program something rust-analyzer can analyze and provide diagnostics for?
rust-analyzer can only analyze rust but a very minimal rust file would be
fn main() {
let x: u32 = "π°π΄ππΆ";
}
where I can easily reproduce the issue. The event is:
[server-notification] Fri Apr 8 23:37:41 2022:
(:jsonrpc "2.0" :method "textDocument/publishDiagnostics" :params
(:uri "file:///home/horn/tmp/foo-rs/src/main.rs" :diagnostics
[(:range
(:start
(:line 1 :character 17)
:end
(:line 1 :character 23))
:severity 1 :code "E0308" :codeDescription
(:href "https://doc.rust-lang.org/error-index.html#E0308")
:source "rustc" :message "mismatched types\nexpected `u32`, found `&str`" :relatedInformation
[(:location
(:uri "file:///home/horn/tmp/foo-rs/src/main.rs" :range
(:start
(:line 1 :character 11)
:end
(:line 1 :character 14)))
:message "expected due to this")])
(:range
(:start
(:line 1 :character 11)
:end
(:line 1 :character 14))
:severity 4 :code "E0308" :codeDescription
(:href "https://doc.rust-lang.org/error-index.html#E0308")
:source "rustc" :message "expected due to this" :relatedInformation
[(:location
(:uri "file:///home/horn/tmp/foo-rs/src/main.rs" :range
(:start
(:line 1 :character 17)
:end
(:line 1 :character 23)))
:message "original diagnostic")])]
:version 0))
Again, columns 17-23 in the second line is accurate (the "π°π΄ππΆ"
) and what column-number-mode
also shows.
Again, columns 17-23 in the second line is accurate
Accurate to the common mortal sure but LSP-incorrect! It should be 2+8 =10 "LSP characters" or "UTF16code units" long, not 6! So it really seems that rust-analyzer is in the wrong here.
Again, I concede that Eglot should probably be robust, but for now I suggest you tweak eglot-move-to-column-function
and eglot-current-column-function
:
eglot-move-to-column-function is a variable defined in `eglot.el'.
Its value is `eglot-move-to-lsp-abiding-column'
Function to move to a column reported by the LSP server.
According to the standard, LSP column/character offsets are based
on a count of UTF-16 code units, not actual visual columns. So
when LSP says position 3 of a line containing just "aXbc",
where X is a multi-byte character, it actually means `b', not
`c'. However, many servers don't follow the spec this closely.
For buffers managed by fully LSP-compliant servers, this should
be set to `eglot-move-to-lsp-abiding-column' (the default), and
`eglot-move-to-column' for all others.
This variable may be risky if used as a file-local variable.
@nemethf Since you are using rust-analyzer, too, can you help me write a bug report for my minimal example?
fn main() {
let x: u32 = "π°π΄ππΆ";
}
What would be the right UTF-16-based positions? What does it actually report? Can I show the mismatch somehow without eglot/emacs, i.e., on the command line?
With the above in the example, I get this:
foo-rs on ξ main [?] is π¦ v0.1.0 via π¦ v1.60.0
β― rust-analyzer diagnostics .
processing crate: foo_rs, module: /home/horn/tmp/foo-rs/src/main.rs
Diagnostic { code: DiagnosticCode("type-mismatch"), message: "expected u32, found &str", range: 29..47, severity: Error, unused: false, experimental: true, fixes: None }
diagnostic scan complete
[ERROR rust_analyzer] Unexpected error: diagnostic error detected
diagnostic error detected
Is that range: 29..47
bogus, @joaotavora? In my utf-8 encoded file, position 29 would be the space before the double quote of the swan string and 47 is the last char in the file; I also saved that file as utf-16
and the positions are the same. (Maybe the positions have to be shifted one char to the right because emacs starts at position 1 and LSP with position 0? Then it would be the starting double-quote and EOF.)
@nemethf Since you are using rust-analyzer, too, can you help me write a bug report for my minimal example?
I hope this is enough: https://github.com/rust-analyzer/rust-analyzer/issues/11945
Is that range: 29..47 bogus, @joaotavora?
Hard to tell, I don't know what debugging output you're showing me or what that 29..47
means, since that is not an LSP message (Remember, I asked you for the LSP message,not rust-analyzer
's debugging output. You can get to the LSP messages via eglot-events-buffer
).
Anyway that file only has 39 code points in emacs (point-max
returns 39). When we're talking about column numbers the middle line has 24 visual columns but, as I explained earlier, to mark a diagnostic encompassing the "π°π΄ππΆ"
, then the LSP server should state that the diagnostics starts on LSP :character
17 and ends in LSP :character
27 of that Line.
Note that I'm not an expert in Unicode encoding schemes, but I think the encoding in which you save the file has no bearing on this whatsoever: we're only looking to translate to translate LSP's idea of columns into Emacs's ideas of columns. Despite using the :character
property name LSP's idea of columns is based on UTF16 code units -- not code points, which are roughly equivalent to characters.
I hope this is enough: https://github.com/rust-analyzer/rust-analyzer/issues/11945
More than fine, and the latest in the issues, seems to confirm this is indeed a server bug.
Again, Eglot could/should be robust to this. Not sure how, though, but it doesn't seem impossible. Have to think about it.
Rust-analyzer has fixed the underlying issue.
@joaotavora, should we close this issue? Or do you intend to work on it to make Eglot more robust in this regard?
Closing. I've been using rust-analyzer
a bit, and not seeing any of this. Eglot has tests in place, I'd say it's fair to say the issue is gone.
Closing. I've been using
rust-analyzer
a bit, and not seeing any of this. Eglot has tests in place, I'd say it's fair to say the issue is gone.
I'm not against closing it. In fact, we knew the original problem was long gone. However, this issue remained open because it wasn't clear whether you wanted to make Eglot more robust against similar issues. As you wrote:
Again, Eglot could/should be robust to this. Not sure how, though, but it doesn't seem impossible. Have to think about it.
Since maybe a week (although I'm not hacking rust with eglot daily, so it might be longer), I frequently get into a state where emacs consumes 100% of one CPU. I waited patiently for minutes but it won't get back to normal unless I press and hold
C-g
for a very long time. And oftentimes it suffices to move point a bit in my rust code buffer to get into that state again.Using
debug-on-quit
I found out that emacs is always ineglot-move-to-lsp-abiding-column
when that happens. Theloop
doesn't look like its termination criterion(zerop diff)
can always be satisfied, at least it's not obvious. I don't know, but maybe you should check at least the return value of(move-to-column column)
. If the current line has less characters thancolumn
, it will move as far as possible and return the column number where it stopped. So at least you cannot be sure that you are at thecolumn
you've specified.LSP transcript - M-x eglot-events-buffer (mandatory unless Emacs inoperable)
It's big, so not inline: eglot-events.txt
I also have the contents of the output buffer: eglot-output.txt
Backtrace (mandatory, unless no error message seen or heard):
Backtrace gathered with
debug-on-quit
then waiting a minute after emacs froze and utilized one core with 100% before hammeringC-g
:Minimal configuration (mandatory)
No special configuration needed.