sphinx-contrib / matlabdomain

A Sphinx extension for documenting Matlab code

Use textmate grammar instead of pygments #244

Open watermarkhu opened 4 months ago

watermarkhu commented 4 months ago

Hi @joeced, great work on maintaining this repo.

A year ago, I wanted to contribute support for argument blocks. However, I found the logic in mat_types.py, which is based on the Pygments tokens, very hard to work with and a bit unstable.

Following MathWorks' support for VSCode, I started working on a Python parser based on TextMate grammars, which are used for syntax highlighting in VSCode. MathWorks is now also maintaining the MATLAB grammar.

The package is available at https://github.com/watermarkhu/textmate-grammar-python. If you are interested, I think this can be a good replacement for the currently in-house parsing of matlabdomain. The benefit of using TextMate grammar is that 1) due to its nested nature, the output is already a syntax tree and 2) parsing is now officially supported by MathWorks and the contributors of the VSCode extension.
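To illustrate the first point, here is a minimal sketch of why a flat token stream is awkward to work with: finding where a function ends means counting block keywords by hand. The `(kind, text)` tuples, `BLOCK_OPENERS`, and `find_function_end` are hypothetical names made up for this example; they are not the actual Pygments output or matlabdomain code.

```python
# Hypothetical flat token stream: (kind, text) pairs, NOT real Pygments output.
FLAT_TOKENS = [
    ("keyword", "function"), ("name", "foo"),
    ("keyword", "if"), ("name", "x"),
    ("keyword", "end"),   # closes the `if`
    ("keyword", "end"),   # closes the `function`
    ("keyword", "function"), ("name", "bar"),
    ("keyword", "end"),   # closes the second `function`
]

# Assumed subset of MATLAB keywords that open a block closed by `end`.
BLOCK_OPENERS = {"function", "if", "for", "while", "switch", "try"}


def find_function_end(tokens, start):
    """Return the index of the `end` that closes the block opened at `start`."""
    depth = 0
    for i in range(start, len(tokens)):
        kind, text = tokens[i]
        if kind != "keyword":
            continue
        if text in BLOCK_OPENERS:
            depth += 1
        elif text == "end":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced block")


first_end = find_function_end(FLAT_TOKENS, 0)
second_end = find_function_end(FLAT_TOKENS, 6)
```

With a nested output, the extent of each function is simply the extent of its node, so this kind of bookkeeping is not needed. The sketch also ignores that MATLAB uses `end` inside indexing expressions, which makes the flat approach even more fragile.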

On a different topic, due to some requirements, I will need to have an auto-documenter that is compatible with markdown docstrings. To this end, I've already started work on a new extension that is dependent on the myst-parser and based on autodoc2. I would love to get in touch with you to understand the matlabdomain better to see what I can re-use.

joeced commented 4 months ago

Hi @watermarkhu

This looks really interesting. At the moment I have started tackling #44 and #222, and the Pygments token output is just a mess to parse. I'll give it a shot and see if it can replace Pygments and then improve the functionality.

Regarding starting up a new auto-documenter, I can only tell how this domain was started. The original author basically built the documenter directly upon autodoc for Python. This gave them a good start and basis. However, the code is not the easiest to work with in my opinion. We still run into features that need to be reimplemented, for instance #180. Even after maintaining the package for many years now, I still struggle with the Sphinx internals of Documenter and Directives 😵.

A different approach for autodoc is done in https://github.com/mozilla/sphinx-js. I hope this helps.

watermarkhu commented 4 months ago

Good to hear!

I'm currently mostly struggling with setting up roles in a new domain in order to make cross-referencing possible eventually. Can we possibly set up a call?

joeced commented 4 months ago

I tried textmate-grammar-python and the tokenization looks way nicer (see https://github.com/sphinx-contrib/matlabdomain/issues/222#issuecomment-1991587364). It definitely makes it easier to deduce whether something is a method definition in an abstract class. Further, I really like the nested dictionaries, where I can just skip the body of a function once I have collected what I need. It will require a lot of rewriting, but I'm quite tempted by it.
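As a rough sketch of that "skip the body" idea, assuming a nested dictionary shaped like the hypothetical `TREE` below: the keys `token`, `content`, and `children`, the token names, and the helper `collect_function_names` are all invented for illustration and are not the library's documented schema.

```python
# Hypothetical nested parse tree; keys and token names are assumptions.
TREE = {
    "token": "class",
    "children": [
        {"token": "name", "content": "Shape"},
        {"token": "function", "children": [
            {"token": "name", "content": "area"},
            {"token": "body", "children": [
                # A nested function inside the body; it is skipped below.
                {"token": "function", "children": [
                    {"token": "name", "content": "helper"},
                ]},
            ]},
        ]},
        {"token": "function", "children": [
            {"token": "name", "content": "perimeter"},
            {"token": "body", "children": []},
        ]},
    ],
}


def collect_function_names(node, names=None):
    """Collect top-level function names without descending into bodies."""
    if names is None:
        names = []
    if node["token"] == "function":
        # Take the name from the direct children, then stop: the body
        # is never visited, so nested definitions are ignored.
        for child in node.get("children", []):
            if child["token"] == "name":
                names.append(child["content"])
        return names
    for child in node.get("children", []):
        collect_function_names(child, names)
    return names


method_names = collect_function_names(TREE)
```

Returning as soon as a function node has been handled is what makes the skip cheap: no token counting is needed to find where the body ends.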

We can set up a call, but be warned: I am by no means an expert in cross-referencing. You can contact me at jorgen at cederberg dot be.

joeced commented 4 months ago

@watermarkhu Two comments on https://github.com/watermarkhu/textmate-grammar-python:

Do you want me to add them as issues?

watermarkhu commented 4 months ago

Good to see! Adding the issues would be great.

Let's discuss 3.9 support on the PR that you submitted.

apozharski commented 4 days ago

Hello, not to step on any toes here, but I would like to know if this effort has stalled (understandably, time is always a valuable commodity)? The MATLAB library I maintain is currently going through a major documentation pass, and to that end I have allocated some time to working on tooling. As such, I think this would be a good place to start, as it will help in closing #52, #54, #212, and #222.

Those four issues are currently my target to get done (perhaps in one fell swoop along with this one), as they would be very useful for our documentation. I have started an attempt to get classdef parsing working here, and it seems like it should be doable to replace the current parsing code with something (at least marginally) better. Let me know if I have misread the situation.