Open choldgraf opened 4 years ago
Thanks @choldgraf. CCing @najuzilu who is also interested in this project.
The pandoc filters providing python support sound like the way to go here. I like the idea of building a robust converter, and it seems to make sense to write one using `pandoc`.
@chrisjsewell do you think this would be the best technology pathway for this tool based on your experience?
From reading the documentation for lua filters, they support items like block elements, which will be essential for this tool.
It looks like the python pandoc filters approach is based on building `json` filters, which may be more limiting than the `lua` option.
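For illustration, here is a minimal sketch of what a pandoc `json` filter does, using only the Python standard library. The `walk` and `shout` helpers and the sample document are hypothetical and written just for this example; they are not part of pandoc or any filter library.

```python
import json


def walk(node, action):
    """Return a copy of the tree with `action` applied to every element.

    An element is a dict with a "t" (type) key. If `action` returns a
    dict, it replaces the element; if it returns None, the element is kept.
    """
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            replaced = action(node)
            if replaced is not None:
                node = replaced
        return {key: walk(value, action) for key, value in node.items()}
    return node


def shout(elem):
    """Hypothetical action: upper-case the text of every Str element."""
    if elem["t"] == "Str":
        return {"t": "Str", "c": elem["c"].upper()}
    return None


# Hand-written sample document in pandoc's JSON layout (assumed for the demo).
doc = {
    "pandoc-api-version": [1, 17, 5, 1],
    "meta": {},
    "blocks": [
        {
            "t": "Para",
            "c": [
                {"t": "Str", "c": "hello"},
                {"t": "Space"},
                {"t": "Str", "c": "world"},
            ],
        }
    ],
}

result = walk(doc, shout)
print(json.dumps(result["blocks"]))
```

When run as an actual filter, pandoc pipes the document JSON to the script's stdin and reads the transformed JSON back from stdout, so the sample `doc` would be replaced by `json.load(sys.stdin)`.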
Yeh but use https://github.com/sergiocorreia/panflute
Ah thanks @chrisjsewell that is the tool I couldn't remember when trying to document this issue. Thanks!
You can use https://github.com/chrisjsewell/ipypublish/tree/develop/ipypublish/filters_pandoc as an example
@najuzilu I think this is probably the best technology choice for implementing this tool.
Let's put some cycles into using `panflute` and focus on this over the next couple of weeks. See http://scorreia.com/software/panflute/guide.html# along with the examples above from @chrisjsewell.
/ cc'ing @tarleb for visibility on this thread too.
:wave: Hi, thanks for the ping! I'd be happy to be part of this.
I agree that panflute is an excellent choice to get going. My hope would be to get MyST support into the main pandoc library some day – having a working filter would make that a lot easier, as we'd only have to translate that into Haskell.
How can I help?
Thanks @arfon @tarleb -- I am currently working through some docs to get up to speed, having never worked with the pandoc `AST`.
We will need:
I am working through:
Panflute
https://github.com/sergiocorreia/panflute http://scorreia.com/software/panflute/index.html
RST:
Myst:
Pandoc:
In my discussions with @chrisjsewell this past week he gave some really helpful pointers. The test suite for the ipypublish project contains a lot of panflute filters and some testing infrastructure which we can use: https://github.com/chrisjsewell/ipypublish/tree/develop/ipypublish/filters_pandoc
I guess the first point to understand is the execution flow:
I will put an update together next week re: submission of a working branch with some infrastructure in place to run a collection of filters etc.
One part of this approach I don't have my head wrapped around is how to update pandoc at the `parser` level. Sphinx provides a lot of directives, such as `.. code-block::`, that include a lot of configuration for showing code blocks. From what I have read, `pandoc` implements base `docutils` rst.
Using the following test string:
s2 = """
.. code-block:: python3
    :linenos:
    :emphasize-lines: 1
    :name: test-block

    import pandas as pd
"""
From `pandoc` I am getting the following `json` representation of this snippet:
OrderedDict([('pandoc-api-version', (1, 17, 5, 1)), ('meta', OrderedDict()), ('blocks', [OrderedDict([('t', 'BlockQuote'), ('c', [OrderedDict([('t', 'CodeBlock'), ('c', [['', ['sourceCode', 'python3'], []], 'import pandas as pd'])])])])])])
While it picks up on `CodeBlock`, it doesn't seem to pass through the directive options and config. Any ideas?
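As a quick check that is independent of panflute, the attributes can be read straight from the JSON structure shown above. This is a minimal sketch using only the standard library; `find_code_blocks` is a hypothetical helper, and the `ast` literal is the structure above retyped as plain dicts and lists.

```python
# The AST reported above, rewritten as plain dicts and lists.
ast = {
    "pandoc-api-version": [1, 17, 5, 1],
    "meta": {},
    "blocks": [
        {
            "t": "BlockQuote",
            "c": [
                {
                    "t": "CodeBlock",
                    "c": [["", ["sourceCode", "python3"], []], "import pandas as pd"],
                }
            ],
        }
    ],
}


def find_code_blocks(node):
    """Yield (identifier, classes, key-value attributes, text) per CodeBlock."""
    if isinstance(node, list):
        for item in node:
            yield from find_code_blocks(item)
    elif isinstance(node, dict):
        if node.get("t") == "CodeBlock":
            # CodeBlock content is [[identifier, classes, key-value pairs], text].
            (identifier, classes, keyvals), text = node["c"]
            yield identifier, classes, dict(keyvals), text
        else:
            for value in node.values():
                yield from find_code_blocks(value)


found = list(find_code_blocks(ast))
for identifier, classes, attributes, text in found:
    print(repr(identifier), classes, attributes, repr(text))
```

In this structure the key-value list is empty, so `:linenos:` and `:emphasize-lines:` are already missing in the JSON that pandoc emits.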
This is from the pandoc documentation here:
When pandoc is used with -t markdown to create a Markdown document, a YAML metadata block will be produced only if the -s/--standalone option is used. All of the metadata will appear in a single block at the beginning of the document.
I tried it and it only works if the options begin and end with `---` or `...`.
True, pandoc currently doesn't pass the block attributes on. Would you raise an issue for this on the pandoc issue tracker?
@tarleb just checking -- do you know for sure that pandoc doesn't pass on the block attributes? I was just wondering if this is a panflute internal issue (not fetching information from `pandoc`)? Thanks.
Yes, I'm pretty sure. You can check by running pandoc on the code block and asking it to return its internal representation:
pandoc --from=rst --to=native << EOF
.. code-block:: python3
   :linenos:
   :emphasize-lines: 1
   :name: test-block

   import pandas as pd
EOF
This will give `[CodeBlock ("test-block",["python3"],[]) "import pandas as pd"]`, which is evidence that the parser throws that info away. In fact, if we check the source code, we see that only the `number-lines` field is retained; all others are discarded. I'm not sure why it was written that way, but John MacFarlane (the author) will be able to tell us more.
oh neat. thanks @tarleb -- good to know.
Has anyone had time to implement ideas from this thread? The README for this project doesn't say anything about using the filters; it just suggests `rst_to_md.sh`, which converts to vanilla Markdown and recommends manual or semantically-unaware translation. Is `rst2myst/filters/` in some working state? Or is there a more current recommendation for converting legacy rST documentation?
Hi @jedbrown yes I consider this essentially a deprecated project, replaced by https://github.com/executablebooks/rst-to-myst
Oh, lovely. That basically works for me, though `.. dropdown` from sphinx-panels is still put inside `eval-rst`. Maybe this repository can be removed, since there are still some pointers to it and it'll come up first if looking for `rst2myst`, which is the command name in the new repo.
Cheers, yeh I just need to make a few final updates, then I can remove the "in-development" status, link to it in the myst-parser/jupyter book documentation, and then will also look at archiving this repo.
> though `.. dropdown` from sphinx-panels is still put inside `eval-rst`.
I can look at improving the default, but the advanced usage section of the readme also describes how to provide conversion configuration for "non-standard" directives.
I saw that part and thought that given the `-e sphinx_panels` arguments, it would be able to handle it like an admonition.
.. admonition:: Subject of admonition

   Some body text

.. dropdown:: Subject of the dropdown

   Some body text.
but
$ rst2myst parse -s -e sphinx_panels -f test.rst
:::{admonition} Subject of admonition
Some body text
:::
```{eval-rst}
.. dropdown:: Subject of the dropdown

   Some body text.
```
There also seems to be unnecessary whitespace in the standard processing of admonitions, where the above could have produced:
:::{admonition} Subject of admonition
Some body text
:::

:::{dropdown} Subject of the dropdown
Some body text.
:::
I can move this to the correct repository if it isn't a usage mistake. BTW, I got what I wanted (modulo excessive whitespace) by creating a `directives.yml` with
```yml
sphinx_panels.dropdown.DropdownDirective: argument_content_colon
```
I wonder if there should be a `--verbose` mode that warns about all the directives that don't have an associated rule. It'd save time in noticing them and tracking down the correct fully qualified directive name.
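Until something like that exists, a rough stand-in is to scan the converted output for directives that ended up inside `eval-rst` blocks. This is only a sketch under stated assumptions: `unconverted_directives` is a hypothetical helper, not part of `rst2myst`, and it handles only backtick fences and simple directive names.

```python
import re

# Opening fence of an eval-rst block: three or more backticks, then {eval-rst}.
FENCE_OPEN = re.compile(r"^(`{3,})\{eval-rst\}\s*$")
# A simple rST directive line such as ".. dropdown:: Title".
DIRECTIVE = re.compile(r"^\s*\.\.\s+([\w-]+)::")


def unconverted_directives(myst_text):
    """Return names of rST directives found inside eval-rst fences."""
    names = []
    open_fence = None  # backtick run of the fence we are inside, if any
    for line in myst_text.splitlines():
        if open_fence is None:
            match = FENCE_OPEN.match(line)
            if match:
                open_fence = match.group(1)
        elif line.strip() == open_fence:
            open_fence = None  # closing fence reached
        else:
            match = DIRECTIVE.match(line)
            if match:
                names.append(match.group(1))
    return names


# Sample modelled on the output above; the fence is built from a variable
# only to avoid writing a literal backtick fence inside this example.
TICKS = "`" * 3
sample = "\n".join(
    [
        ":::{admonition} Subject of admonition",
        "Some body text",
        ":::",
        "",
        TICKS + "{eval-rst}",
        ".. dropdown:: Subject of the dropdown",
        "",
        "   Some body text.",
        TICKS,
    ]
)

leftover = unconverted_directives(sample)
print(leftover)
```

This reports only the short directive name; finding the fully qualified class name needed for `directives.yml` would still be a manual step.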
I recently spoke with @arfon who mentioned that he knew of some interest in building support for MyST markdown in pandoc. I wanted to mention it here in case this would make it easier for people to port from rST to MyST. @mmcky @AakashGfude and @chrisjsewell in particular may be interested in this