`rust-syntax-propertize` assumes text already has the properties it is responsible for placing

jimblandy commented 1 year ago

The `rust-syntax-propertize` function is responsible for setting `syntax-table` text properties on a given region of text. However, it fails to do its job properly, because it uses `syntax-ppss`, which assumes that `syntax-table` properties have already been set.

The `syntax-ppss` function distributed with Emacs calls `syntax-propertize-function` to place `syntax-table` properties as needed. That way, when `rust-match-angle-brackets` is enabled, `parse-partial-sexp` can distinguish between `<` and `>` characters surrounding generic parameters and those that are comparison operators or part of `->` or `=>` tokens. When they enclose generic parameters, `syntax-ppss` treats `<` and `>` as opening and closing "parentheses", in the sense used by the Emacs syntax table code.

However, in rust-mode buffers, `syntax-propertize-function` is `rust-syntax-propertize`, which ends up calling `syntax-ppss` itself. Unbounded recursion is prevented by `syntax-propertize` temporarily binding `syntax-propertize--done` to `most-positive-fixnum`, but this also means that the nested call does no propertization: it sees the whole buffer as already done.
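The effect of that guard can be sketched with a toy model. This is Python, not Emacs code, and every name in it (`Buffer`, `lookahead_propertize`, `seen_unmarked`) is hypothetical. It assumes a propertize callback that asks for the parse state at the end of its region before marking anything; whether `rust-syntax-propertize` does exactly this is not established in this report.

```python
import sys


class Buffer:
    """Toy stand-in for a buffer; positions are plain integers."""

    def __init__(self, length):
        self.length = length
        self.done = 0            # plays the role of syntax-propertize--done
        self.marked = set()      # positions that have been propertized
        self.seen_unmarked = []  # what nested parses saw as unpropertized


def syntax_propertize(buf, pos, fn):
    """Propertize up to pos, unless it is (or appears to be) already done."""
    if buf.done >= pos:
        return  # also taken by nested calls while the guard is bound
    start, end = buf.done, min(pos, buf.length)
    saved = buf.done
    buf.done = sys.maxsize  # the recursion guard described above
    try:
        fn(buf, start, end)
    finally:
        buf.done = saved
    buf.done = end


def syntax_ppss(buf, pos, fn):
    """Return the positions before pos that a parse would see unpropertized."""
    syntax_propertize(buf, pos, fn)
    return [i for i in range(pos) if i not in buf.marked]


def lookahead_propertize(buf, start, end):
    """Hypothetical callback: parses up to end before marking anything."""
    buf.seen_unmarked.append(syntax_ppss(buf, end, lookahead_propertize))
    buf.marked.update(range(start, end))


buf = Buffer(10)
outer = syntax_ppss(buf, 10, lookahead_propertize)
```

In this model the nested `syntax_ppss` call parses positions that have no marks yet, while the outermost call sees a fully marked buffer. The guard stops the recursion, but it cannot make the nested parse correct.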

I don't understand exactly how this happens yet, but I have verified that `parse-partial-sexp` is being called from `rust-syntax-propertize` on regions of text whose `->` tokens have not been marked with `syntax-table` text properties.

jimblandy commented 1 year ago

I forgot to mention the symptom: after using `fill-paragraph` on a comment, indentation is incorrect. Hitting TAB on a top-level comment line results in a ten-space indentation, when it should be zero. This is because `syntax-ppss` is returning garbage results for this location (a paren nesting depth of -44, for example).
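As a rough illustration of how a negative nesting depth can arise, here is a toy Python scan, not Emacs code. It assumes that every `<` opens and every `>` closes a level unless the position has been marked as punctuation. The helper names and the sample source are hypothetical.

```python
def paren_depth(text, punctuation=frozenset()):
    """Net angle-bracket depth, skipping positions marked as punctuation."""
    depth = 0
    for i, ch in enumerate(text):
        if i in punctuation:
            continue  # marked: not a bracket
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
    return depth


def arrow_positions(text):
    """Positions of the '>' in '->' and '=>' tokens."""
    return frozenset(
        i for i in range(1, len(text))
        if text[i] == ">" and text[i - 1] in "-="
    )


src = "fn a() -> Vec<u8> { vec![] }\nfn b() -> u8 { 0 }\n"

unmarked_depth = paren_depth(src)                      # arrows counted as closers
marked_depth = paren_depth(src, arrow_positions(src))  # arrows skipped
```

Each unmarked `->` lowers the depth by one, so a file with many function signatures can drive the count far below zero, which is consistent with the kind of value reported above.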

jimblandy commented 1 year ago

This bug is hard to trigger because it depends on `rust-macro-scope` first polluting the `syntax-ppss-wide` cache with incorrect information. It will only do that if the position that `syntax-ppss` is asked to analyze is at least `syntax-ppss-max-span` characters past the last position requested. The decision to populate `syntax-ppss-wide` also depends on whether the new position is more than twice the average length of all the scans performed by `syntax-begin-function` so far in this Emacs session. (The code in syntax.el seems quite overconfident.)

rust-lang/rust-mode #465