cpitclaudel / alectryon

A collection of tools for writing technical documents that mix Coq code and prose.
MIT License

New role ghref for pointing to files on github #56

Closed: gares closed this 2 years ago

gares commented 3 years ago

I find myself adding links to things that are not coqdoc-generated HTML files, so I came up with this hack.

In my first reST comment I do:

.. role:: elpi-api(ghref)
   :pattern: ^(% \[$name|% $name|pred $name)

.. role:: lib(elpi-api)
   :src: https://github.com/LPCIC/coq-elpi/blob/master/elpi/coq-lib.elpi

.. role:: builtin(elpi-api)
   :src: https://github.com/LPCIC/coq-elpi/blob/master/coq-builtin.elpi

.. role:: elpi-type(ghref)
   :pattern: ^kind $name

.. role:: type(elpi-type)
   :src: https://github.com/LPCIC/coq-elpi/blob/master/coq-builtin.elpi

and then I can easily put links to my (builtin or defined) APIs or data type declarations.

The first one `Elpi Command hello.` sets the current program to hello.
Since it is declared as a `Command`, some code is loaded automatically:

* APIs (e.g. :builtin:`coq.say`) and data types (e.g. Coq :type:`term` s) are loaded from
  `coq-builtin.elpi <https://github.com/LPCIC/coq-elpi/blob/master/coq-builtin.elpi>`_
* some utilities, like :lib:`copy` or `whd1`, are loaded from
  `elpi-command-template.elpi <https://github.com/LPCIC/coq-elpi/blob/master/elpi/elpi-command-template.elpi>`_

The pattern is used to locate the line (in the raw file) so that the href can point to $file#L$line; GitHub then highlights that line.
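
In rough pseudo-Python, the resolution step is something like this (a simplified sketch, not the exact code in this PR; the raw-URL rewrite is an assumption):

    # Sketch: substitute the role target into the pattern, scan the raw file for
    # the first matching line, and link to $file#L$line.
    import re
    import urllib.request

    def locate(src_blob_url, pattern, name):
        # Assumption: a github.com/<org>/<repo>/blob/<ref>/<path> URL maps to its
        # raw.githubusercontent.com counterpart by dropping the "/blob" segment.
        raw_url = (src_blob_url
                   .replace("github.com", "raw.githubusercontent.com")
                   .replace("/blob/", "/"))
        text = urllib.request.urlopen(raw_url).read().decode("utf-8")
        regex = re.compile(pattern.replace("$name", re.escape(name)))
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                return "{}#L{}".format(src_blob_url, lineno)
        return src_blob_url  # no match: fall back to the plain file link

    # locate("https://github.com/LPCIC/coq-elpi/blob/master/coq-builtin.elpi",
    #        r"^(% \[$name|% $name|pred $name)", "coq.say")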

Todo:

Feedback is welcome.

Disclaimer: the last time I wrote some Python it was version 1 (yes, there were no objects), so my code is likely to suck.

cpitclaudel commented 3 years ago

The code looks good. Some high-level questions:

gares commented 3 years ago

Thanks, I updated the code and implemented permalinks.

> Do you use plain docutils or do you use Sphinx?

I'd like my tutorials to stay .v files, so I don't know ;-) I just call alectryon... I'm pretty clueless about doc tools, and I'm happy if I can stay ignorant on this matter ;-)

> The cache isn't persisted, so this will re-fetch the GitHub page every time the document is built, right? We'd need to be a bit careful (for example, alectryon.el runs docutils in the background while you edit a file in Emacs, so it's not ideal if it connects to the internet every time it runs).

Sure, but I have no clue how to make it persistent. Any pointers? The cache now contains the permalinks, so we can check for outdated entries.

> Did you consider using Chrome's new "link to text" feature? (Firefox doesn't implement it yet, but they're planning to at some point.) For example, here's a link to "coq.say": https://github.com/LPCIC/coq-elpi/blob/master/coq-builtin.elpi#:~:text=coq.say

This looks like a very natural feature; I googled for it but could not find it. I still see a few advantages in doing things this way:

What you propose makes a lot of sense for non-GH links, IMO.

gares commented 3 years ago

@cpitclaudel I'd like some hints on the cache thing, and any other comments on this PR, so that I can finish it.

cpitclaudel commented 3 years ago

Yep, will get to it. It took me a while because I was working on the Pygments stuff first ^^

cpitclaudel commented 3 years ago

Just had a new look; the code looks reasonable (though an API that involves eval is a bit scary).

On the cache: Sphinx has decent native support for caching; for a more general solution, you could just open a file and update it every time you try to resolve one of these links. If that's too bad for performance (opening and writing a file for every such link), then you could use a docutils Transform: delay most of the work of resolving links, then resolve them all at once (and update the cache in one pass). Not sure if it's worth the effort.
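
For concreteness, the file-based option could be as simple as something like this (a minimal sketch, not code from Alectryon or from this PR; the file name is made up):

    import json
    import os

    CACHE_PATH = "ghref-cache.json"  # hypothetical cache file next to the document

    def cached_resolve(key, resolve):
        """Return resolve(key), memoizing the result in a small JSON file."""
        cache = {}
        if os.path.exists(CACHE_PATH):
            with open(CACHE_PATH) as f:
                cache = json.load(f)
        if key not in cache:
            cache[key] = resolve(key)  # e.g. fetch the file and compute the permalink
            with open(CACHE_PATH, "w") as f:
                json.dump(cache, f, indent=2, sort_keys=True)
        return cache[key]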

I'm expecting this code to evolve as the use case gets refined, and I'm not entirely comfortable with running arbitrary code every time Alectryon compiles a document, so I'm not sure about merging this as-is. For your use case, would it be enough if Alectryon had a --plugin flag that you could pass a Python file to? That's easy to add, and custom plugins are a pretty common use case (I originally thought people would use custom drivers built on top of Alectryon for things like that, but a --plugin option might be nicer?).

gares commented 3 years ago

Hum, --plugin is fine, but I'd prefer to clean it up if it can eventually be merged. You tell me.

About eval: I think a replace expression is more than enough; I just don't know the best way to pass something like s/this/that/.

About the cache: I can write to a file. I was just wondering whether alectryon already had a cache (invalidation) mechanism to hook into.

gares commented 3 years ago

I should have addressed all issues. The cache is still to be done.

cpitclaudel commented 3 years ago

> Hum, --plugin is fine, but I'd prefer to clean it up if it can eventually be merged. You tell me.

I'd prefer to keep this external for now, and if it catches on we can merge it. Let me know if that works for you.

> About eval: I think a replace expression is more than enough; I just don't know the best way to pass something like s/this/that/.

Probably as an argument with that s/…/…/ syntax, parsed with a regexp.
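
Something along these lines, for example (just a sketch of one possible way to parse it, not code from this PR):

    import re

    def parse_subst(option):
        """Parse an s/pattern/replacement/ argument; '/' may be escaped as '\\/'."""
        m = re.fullmatch(r"s/((?:[^/\\]|\\.)*)/((?:[^/\\]|\\.)*)/", option)
        if not m:
            raise ValueError("expected s/pattern/replacement/, got: " + option)
        return m.group(1), m.group(2)

    # parse_subst("s/this/that/")  ->  ('this', 'that')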

> About the cache: I can write to a file. I was just wondering whether alectryon already had a cache (invalidation) mechanism to hook into.

There is, but it only keeps track of the Coq parts, not of anything related to docutils. Sphinx has an extensible caching mechanism, but it doesn't work with plain docutils. I think your best bet is a separate cache file.

gares commented 3 years ago

> I'd prefer to keep this external for now, and if it catches on we can merge it. Let me know if that works for you.

OK, thanks for your reviews, by the way.

I guess the --plugin flag will let me both register one extra role and monkey-patch the Pygments style, right?

gares commented 3 years ago

> Probably as an argument with that s/…/…/ syntax, parsed with a regexp.

I already changed that into two options, the second one defaulting to the empty string, which is what I currently need.

gares commented 3 years ago

> I'd prefer to keep this external for now, and if it catches on we can merge it. Let me know if that works for you.

I guess the --plugin flag will let me both register one extra role and monkey-patch the Pygments style, right?

Please have a look at the last commit to see what I mean.

cpitclaudel commented 3 years ago

OK, I've looked into nested parsers. I pushed an example in recipes/custom_driver.py, and I've improved the documentation for custom drivers. Can you have a look? I tried to make the example as close as possible to your use case.

I think that's the most flexible solution: with a plugin there's always the risk that things get imported in the wrong order, whereas with a custom driver your code is guaranteed to run first. So I think that's the right approach, but if, after looking at the new documentation and example, you think a --plugin flag would be much better, I can add that too.
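
The shape of such a driver is roughly the following (a minimal sketch only; `resolve_to_github_line` is a hypothetical helper standing in for the ghref logic, the `alectryon.cli.main` import is assumed, and recipes/custom_driver.py remains the authoritative example):

    from docutils import nodes
    from docutils.parsers.rst import roles

    def ghref_role(name, rawtext, text, lineno, inliner, options={}, content=[]):
        # Hypothetical resolution step: turn `text` into a URL ending in #L<line>.
        uri = resolve_to_github_line(text)  # hypothetical helper, defined elsewhere
        return [nodes.reference(rawtext, text, refuri=uri)], []

    roles.register_canonical_role("ghref", ghref_role)

    if __name__ == "__main__":
        from alectryon.cli import main  # assumed command-line entry point
        main()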

gares commented 3 years ago

Thanks, I'll look at it more carefully, but after a quick look I still don't get how you go back to the host language from, say, CLexer. For example, in https://github.com/cpitclaudel/alectryon/blob/95a31c4232e71047a64956976622969df9226381/recipes/custom_highlighting.v#L54 you nest C inside Gallina, but not Gallina inside C.

gares commented 3 years ago

As a reference, see the second code block here: https://lpcic.github.io/coq-elpi/tutorial_coq_elpi_HOAS.html#quotations-and-antiquotations

cpitclaudel commented 3 years ago

> you nest C inside Gallina, but not Gallina inside C.

Would it not work to mark the Gallina-inside-ELPI tokens as "other"? IOW, you don't get back into Gallina from ELPI; instead, you mark all Gallina beforehand. Would that work?

gares commented 2 years ago

Now I recall what my problem was. In order to use `using`, you need to find the end of the embedded language. You do:

            ('(C)(:[{][{])(.+?)([}][}])',
             bygroups(token.Keyword, token.Operator, using(CLexer), token.Operator)),

but this does not work for me, since C:{{ bla {{ more }} extra }} would run CLexer on bla and more but not on extra. So, unless Python regexes can count, I need to use the Pygments stack machine, I believe.
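
Concretely, I mean something like this toy lexer (made-up names, not the coq-elpi lexer): each `{{` pushes a state and each `}}` pops it, so the nesting depth lives on the lexer's state stack rather than in a single regex.

    from pygments.lexer import RegexLexer
    from pygments.token import Name, Punctuation, Text

    class NestedBracesLexer(RegexLexer):
        """Toy lexer: text inside (possibly nested) {{ ... }} is tagged as Name."""
        name = "nested-braces-sketch"
        tokens = {
            "root": [
                (r"[{][{]", Punctuation, "embedded"),  # enter the embedded language
                (r"[^{]+", Text),                      # host-language text
                (r"[{]", Text),                        # a lone brace
            ],
            "embedded": [
                (r"[{][{]", Punctuation, "#push"),     # nested opener: one level deeper
                (r"[}][}]", Punctuation, "#pop"),      # closer: back up one level
                (r"[^{}]+", Name),                     # embedded-language text
                (r"[{}]", Name),                       # a lone brace inside
            ],
        }

    # On "C:{{ bla {{ more }} extra }}", 'extra' stays in the embedded state,
    # which is exactly what the single-regex 'using' rule above gets wrong.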

cpitclaudel commented 2 years ago

Thanks, you're right, and sorry for being a bit slow at this.

I think your approach is likely to be the best there is. What you could do, for sure, is a three-step process: first use a simple lexer that tags each bit of code as Coq or ELPI, and then use two nested DelegatingLexers to highlight each of them separately. But that would be neither much simpler nor cleaner, and worse, it's arguably a bit less correct, because each individual lexer sees a truncated version of the source: if you have `Definition a := lp:{{ xyz {{ Some n }} abc }}`, the Coq lexer will see `Definition a := Some n .` and the ELPI lexer will see `xyz abc`, and neither of those is "right". In corner cases that might even confuse the lexers entirely; for example, in `Definition b := lp:X.` the Coq lexer would see the invalid sentence `Definition b := .`.

So your approach is good, I think. I will leave specific comments inline.

cpitclaudel commented 2 years ago

Yup, just went through everything. I think your approach works nicely, and I can't think of anything much smoother.

gares commented 2 years ago

Thanks for all the help; I was able to move to a custom driver: https://github.com/LPCIC/coq-elpi/pull/283. Looking forward to a release including --pygments-style, so that I can drop the ugly pinning.

cpitclaudel commented 2 years ago

Woohoo! Thanks so much for your patience and for helping me understand the issue(s) :) The new tutorials look great.