executablebooks / rst-to-myst

Convert ReStructuredText to MyST Markdown
https://rst-to-myst.readthedocs.io
MIT License
59 stars 10 forks source link

Errors when translating `rst substitutions` to MyST-flavoured markdown #33

Open abravalheri opened 2 years ago

abravalheri commented 2 years ago

Describe the problem

The way rst2myst translates rst substitutions to Markdown does not seem to work properly. So far 2 inconsistencies have been identified:

  1. rst2myst relies on MyST ability to use Jinja2 for substitutions. However Jinja2 requires a valid identifier (i.e., valid Python variable names), which is a much more restrictive condition, and therefore not compatible with rst (e.g. rst accepts strings with spaces)

  2. rst2myst fails on translating links with substitutions (e.g. |ref|_), which are a very common use case when writing rst (e.g. for creating links with bold, italic or monospaced text elements)

This issue was first identified in pyscaffold/pyscaffoldext-markdown#25. A simplified description of the problem and steps to reproduce it can be found in this gist.

Link to your repository or website

https://gist.github.com/abravalheri/a9cdfa09f2809182fccca659b413da4a

Steps to reproduce

The following is a simplified example to reproduce the problem:

  1. Assuming I have the following (example) in ReStructuredText:

    #. To avoid any problems with your installed Python packages, consider using |virtual environments|_
    
    #. If everything works fine, push your local branch to |the repository service| with::
    
        git push -u origin my-feature
    
    #. Go to the web page of your fork and click |contribute button| to send your changes for review.
    
    .. |virtual environments| replace:: ``virtualenv``
    .. |the repository service| replace:: GitHub
    .. |contribute button| replace:: "Create pull request"
    .. _virtual environments: https://virtualenv.pypa.io/en/stable/
  2. When I try to convert it using the CLI (rst2myst stream original.rst > converted.md), I obtain the following MyST-flavoured markdown:

    ---
    substitutions:
      contribute button: '"Create pull request"'
      the repository service: GitHub
      virtual environments: '`virtualenv`'
    ---
    
    1. To avoid any problems with your installed Python packages, consider using [virtual environments]
    
    2. If everything works fine, push your local branch to {{ the repository service }} with:
    

    git push -u origin my-feature

    
    3. Go to the web page of your fork and click {{ contribute button }} to send your changes for review.
    

As we can see, converted markdown contains invalid Jinja2 syntax, e.g. {{ the repository service }}, and the original link with substitution was converted to a simple link, without a substitution.

The version of Python you're using

3.8.0

Your operating system

Ubuntu 18.04.5 LTS

Versions of your packages

Direct installation:

myst-parser==0.15.2
rst-to-myst[sphinx]==0.3.2
Sphinx==4.1.2

The remaining packages where installed as dependencies of those 3 in an empty virtual environment created with:

$ virtualenv -p py38 .venv
$ source .venv/bin/activate
$ pip install sphinx myst-parser 'rst-to-myst[sphinx]'

Expected output

Two different behaviours could be expected from rst2myst:

  1. The "rst substitution key" could be pre-processed to produce a valid Python identifier and in turn, that identifier could be used in Jinja2. This include replacing or removing invalid characters, e.g. |the repository service| could be translated to {{ the_repository_service }}. This approach in shown in expected_option1 bellow.
  2. The "rst substitution key" could be nested inside a dict value with an arbitrary name (e.g. __), e.g. |the repository service| could be translated to {{ __["the repository service"] }}. This approach in shown in expected_option2 bellow

Regarding links with substitutions, CommonMark's full reference links in combination with Jinja2 seem to be a good alternative, e.g. |ref|_ could be translated to [{{ ref }}][ref]. This approach is shown in both expected_option1 and expected_option2.

expected_option1

---
substitutions:
  contribute_button: '"Create pull request"'
  the_repository_service: GitHub
  virtual_environments: '`virtualenv`'
---

1. To avoid any problems with your installed Python packages, consider using [{{ virtual_environments }}][virtual environments]

2. If everything works fine, push your local branch to {{ the_repository_service }} with:

   `git push -u origin my-feature`

3. Go to the web page of your fork and click {{ contribute_button }} to send your changes for review.

[virtual environments]: https://virtualenv.pypa.io/en/stable/

expected_option2

---
substitutions:
  "__":
    contribute button: '"Create pull request"'
    the repository service: GitHub
    virtual environments: '`virtualenv`'
---

1. To avoid any problems with your installed Python packages, consider using [{{ __["virtual environments"] }}][virtual environments]

2. If everything works fine, push your local branch to {{ __["the repository service"] }} with:

   `git push -u origin my-feature`

3. Go to the web page of your fork and click {{ __["contribute button"] }} to send your changes for review.

[virtual environments]: https://virtualenv.pypa.io/en/stable/

In expected_option2, instead of using a nested dict __ inside of substitutions, the substititions dict itself could also be directly assigned to __ in the Jinja2 template context, this would allow users using any substitution key directly if they are valid identifiers and invalid identifiers via the __ helper.


(code blocks around git push ... in expected_option1|2 omitted due to GitHub's inability of dealing with nested code blocks)

welcome[bot] commented 2 years ago

Thanks for opening your first issue here! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out EBP's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
If your issue is a feature request, others may react to it, to raise its prominence (see Feature Voting).
Welcome to the EBP community! :tada:

chrisjsewell commented 2 years ago

Heya, yeh I have to admit, I hadn't noticed the "links with substitutions" syntax before, and never used it in my own work.

Your proposed solutions both seem viable, thanks. Maybe I would lean towards expected_option1, as you end up with less verbose Markdown. But probably I would err towards replacing space with __ rather than _, just to make it slightly less likely there would be key clashes.

I can't say I have time right now to do this myself 😬 but would certainly welcome any PRs

Also interested to have a look at pyscaffold!

abravalheri commented 2 years ago

Thank you very much @chrisjsewell.

I am more or less in the same boat here in terms of time for contributing 😝, but in the future if I manage to get my hands on rst-to-myst, do you have any tips from where to start for this issue? (I am not familiar with the codebase...)


I have been using PyScaffold for years in a row now and I am really satisfied on how it helps me to creates Python packages (so much that I became one of the maintainers).

Lately I have been trying to integrate rst-to-myst so we can provide a out-of-the-box experience for people that want to write documentation in Markdown. The idea is to have an extension that translate our default .rst templates to MyST-markdown.

Let me know if you want to some information about PyScaffold, I am more than happy to chat about it 😄