jgm / typst-hs

Haskell library for parsing and evaluating typst

Add more tracking of source position to the AST #27

Open dccsillag opened 10 months ago

dccsillag commented 10 months ago

I'm working on a tool that consumes Typst ASTs (ideally typed) and this package looks amazing and exactly what I was looking for. Thank you for making it available!

Currently, it looks like SourcePos is only used in the Code constructor of the Markup type (in Typst.Syntax). It would be extremely useful if it were more widely available, so that one can tell which part of the input each AST node corresponds to.

Unless there is some non-obvious reason why we cannot do this, here are two proposals for how we could go about it:

  1. We could add a SourcePos field to essentially every AST-related constructor (seems like this could be a bit annoying); or
  2. We could add some special constructors that wrap an AST node with a source position. For example, we would add to the Markup type:

    data Markup = ... | Tick SourcePos Markup

    (This is very much inspired by what GHC does internally; in fact, I borrowed the name Tick from there. Perhaps MarkupSourcePos would be better here...)

    This is nice, because people who don't care about source position can just ignore this constructor. Also, if this causes too much of an increase in memory usage, we can add an argument to parseTypst for whether to emit these Tick constructors.

    Finally, there would be the matter of whether to keep the single already existing SourcePos field or to just reconstruct it via these Ticks. I don't have any strong opinions on this, but I imagine that a better awareness of how this API is currently used in Pandoc may be helpful in making this judgement :smile:.
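
To make proposal 2 concrete, here is a minimal sketch of a wrapper constructor and of how a consumer can either ignore it or collect the positions it carries. The `SourcePos` and `Markup` types below are toy stand-ins, not the real definitions from Typst.Syntax, and the helpers `stripTicks` and `positions` are hypothetical names introduced only for illustration.

```haskell
-- Toy stand-in for a source position (ASSUMPTION: not the type typst-hs uses).
data SourcePos = SourcePos
  { posFile :: String
  , posLine :: Int
  , posColumn :: Int
  }
  deriving (Show, Eq)

-- Toy stand-in for Markup with only two ordinary constructors,
-- plus the proposed Tick wrapper.
data Markup
  = Text String
  | Emph [Markup]
  | Tick SourcePos Markup -- proposed: annotates the wrapped node with its position
  deriving (Show, Eq)

-- Hypothetical helper: remove every Tick, for consumers that do not
-- care about source positions.
stripTicks :: Markup -> Markup
stripTicks (Tick _ m) = stripTicks m
stripTicks (Emph ms) = Emph (map stripTicks ms)
stripTicks m@(Text _) = m

-- Hypothetical helper: collect all recorded positions in document order.
positions :: Markup -> [SourcePos]
positions (Tick p m) = p : positions m
positions (Emph ms) = concatMap positions ms
positions (Text _) = []
```

With this shape, code that pattern-matches on the ordinary constructors can call `stripTicks` first and otherwise stay unchanged, while position-aware tools read the `Tick` nodes directly.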

jgm commented 10 months ago

I have no objections if you want to play around with this in a fork and let me know how it goes. Then the performance implications could be measured.

EDIT: approach 2 seems more lightweight.

dccsillag commented 10 months ago

Sure, I'll let you know when I've got something!

dccsillag commented 10 months ago

Okay, I've got an initial implementation at https://github.com/dccsillag/typst-hs/tree/more-sourcepos. I'm just struggling a bit now with the tests -- there are 1026 of them! Is there some sort of snapshot testing implemented that I'm missing?

EDIT: Ah, I now see I missed test.sh. If I understand it correctly, it will just add $t.rev files next to the Typst files ($t) with the generated output? I'm not entirely sure I get how I'm meant to use it...

jgm commented 10 months ago

The current test framework is not very ergonomic.

But for regression tests, what I do is add e.g. typ/regression/issues25.typ (which is just some typst code) and then run the tests with make test TESTARGS=--accept. I then inspect out/regression/issues25.out (generated by the test) and make sure it looks right. Note that some special code is added to your typst input to match what is expected by the typst test suite, from which most of these tests were originally taken.

jgm commented 10 months ago

Of course, for your purposes it's probably better just to add some tests directly to the Haskell test runner (test/Main.hs) -- these could enable an option that adds the Tick elements, and test that they are there. Then you could leave the rest of the tests as they are.
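
A rough sketch of what such an opt-in test could check is given below. Everything in it is an assumption made for illustration: `ParseOpts`, `emitTicks`, and `parseToy` are hypothetical names, the types are toy stand-ins for the real ones, and the toy parser only wraps each input line in a node. It is not the typst-hs parser or its test runner.

```haskell
-- Toy position: line and column only (ASSUMPTION: not the real SourcePos).
data SourcePos = SourcePos Int Int
  deriving (Show, Eq)

-- Toy AST with the proposed Tick wrapper.
data Markup
  = Text String
  | Tick SourcePos Markup
  deriving (Show, Eq)

-- Hypothetical option record; the flag controls whether Ticks are emitted.
newtype ParseOpts = ParseOpts {emitTicks :: Bool}

-- Toy parser: one Text node per input line, wrapped in a Tick that
-- records the line number when the option is enabled.
parseToy :: ParseOpts -> String -> [Markup]
parseToy opts src = zipWith node [1 ..] (lines src)
  where
    node n l
      | emitTicks opts = Tick (SourcePos n 1) (Text l)
      | otherwise = Text l

isTick :: Markup -> Bool
isTick (Tick _ _) = True
isTick _ = False

-- The two properties a dedicated test could assert.
ticksPresentWhenEnabled :: Bool
ticksPresentWhenEnabled = all isTick (parseToy (ParseOpts True) "a\nb")

ticksAbsentWhenDisabled :: Bool
ticksAbsentWhenDisabled = not (any isTick (parseToy (ParseOpts False) "a\nb"))
```

Because the flag is off in every other test, the existing expected outputs would not need to change.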