RedPRL / asai

🩺 A library for compiler diagnostics
https://ocaml.org/p/asai
Apache License 2.0
35 stars 2 forks source link

📍 Generalize `Range.eof` to arbitrary positions #161

Closed favonia closed 3 weeks ago

favonia commented 1 month ago

The idea is to generalize EOF (end of file) to an arbitrary position in a file. This includes the recognition of these special positions:

  1. The EOF (end of file)
  2. The EOL (end of line), BOL (beginning of line) and BOF (beginning of file)

The plan is to change the semantics of "empty ranges" (with the same starting and ending positions). Previously, an empty range is a no-op, highlighting nothing. With this proposal, an empty range will mark one single position.

To implement this, there will be several API changes:

  1. Range.{eof,view} can/will be deprecated. Maybe Range.of_pos or Range.single (?) can replace Range.eof.
  2. The terminal backend should be updated to handle all special cases EOL, BOL, and BOF. (It's unclear how to highlight positions in the middle.)
  3. The flattening function needs to be rewritten to avoid the generation of spurious empty ranges. They were harmless earlier because they highlight nothing in the current semantics.

Design question: how to highlight a position in the terminal output? This may eventually lead to a solution to #131.

Tagging @mikeshulman for possible feedback.

mikeshulman commented 1 month ago

Seems like a good idea, but I don't have much to contribute. Why is it easier to highlight EOL and BOL than positions in the middle?

favonia commented 1 month ago

@mikeshulman Mostly because we don't have to come up with a good name (e.g., EOL, BOL, ...). It's equally difficult at the technical level.

mikeshulman commented 1 month ago

Haha, okay. Maybe just POS.

favonia commented 1 month ago

@mikeshulman Question: is it important for your applications to display "EOF" as "EOF"? I realized it's far more easier to uniformly display these point markers as "POS1" or some generic names. However, I wonder if this will sacrifice usability.

Some of the forms I'm currently considering:

let x◂POS₀▸ = 1 in let x = 1 + x◂POS₁▸ // without delimiters
‹let x◂POS₀▸ = 1› in let ‹x› = «1 + ‹x›»◂POS₁▸ // with delimiters
‹let x◂POS₀▸ = 1›₀ in let ‹x›₁ = «1 + ‹x›₂»◂POS₁▸ // with numbered delimiters
mikeshulman commented 1 month ago

I do think "EOF" is clearer. In my case (and, I suspect, in others) EOF is the only zero-width position possible -- it indicates a parse failure when something was expected but nothing was found; whereas in any other situation something would be found, and if it's the wrong thing, the error would be reported on the something. So simply marking it with "POS" would not indicate to the user what's actually going on.