miyuchina / mistletoe

A fast, extensible and spec-compliant Markdown parser in pure Python.
MIT License

Keep list item indentation in MarkdownRenderer #213

Closed: masalim2 closed this issue 5 months ago

masalim2 commented 6 months ago

Hello! I'm wondering if it's possible to adjust the MarkdownRenderer's output spacing of indented list items to preserve four spaces of indentation in nested lists.

Using mistletoe==1.3.0 on the following input file:

1. A
    - B (4 leading spaces)

I obtain the following output from running mistletoe test.md --renderer mistletoe.markdown_renderer.MarkdownRenderer:

1. A
   - B (3 leading spaces!)

It seems like the 3 space indentation level on the nested list is coming from the prepend attribute, which is set such that continuation lines are aligned with the content of the previous line. This sort of makes sense, given that the inner list is really a child of the first list item.
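
To make the arithmetic concrete, here is a minimal sketch of where the 3 comes from. It is not mistletoe code: `content_column` is a hypothetical helper that assumes the content starts a fixed number of spaces after the leader.

```python
def content_column(leader: str, spaces_after_leader: int = 1) -> int:
    """Column where a list item's content starts (hypothetical helper).

    Continuation lines, including a nested list, are aligned to this column.
    """
    return len(leader) + spaces_after_leader


# "1." plus one space puts the content, and so the nested list, at column 3.
print(content_column("1."))      # 3
# A bullet leader is shorter, so its children are indented by only 2.
print(content_column("-"))       # 2
# Padding with an extra space after the leader moves the column to 4.
print(content_column("1.", 2))   # 4
```

So the nested list lands at 4 spaces only when the leader width plus the spaces after it happen to add up to 4.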

However, I'd like the nested list item to render with the original 4 spaces of indentation, because I'm using mistletoe to preprocess some markdown for a downstream processor (mkdocs/python-markdown) which is fairly rigid in wanting 4 spaces or a tab for any block-level elements nested in a list.

I'm new to the codebase: poking around the MarkdownRenderer class and the render_list_item() implementation, it isn't obvious how one could introduce an option like this. It seems a bit tricky to recognize that when the child of a ListItem is another List, the prepend logic should preserve the indentation of the leader rather than treating it as a continuation line.

Of course one workaround is to carefully format your lists so that the prepend spacing ends up being 4:

1.  A (content is padded out with an extra space)
    - B (so the indented leader ends up with 4 spaces!)

But I am working with a boatload of legacy markdown docs and would appreciate a solution that was a bit more robust to spacing between the leader and content. Any ideas for quick monkeypatches or other workarounds would be greatly appreciated! Thanks so much for the wonderful open source project!
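
One possible stopgap, sketched here as an assumption rather than anything mistletoe provides, is to post-process the rendered text and move each nesting level to a multiple of 4 spaces. The `reindent_lists` function and its regular expression are hypothetical. The sketch assumes space-only indentation and output where nested content is aligned with the parent item's content, and it does not special-case fenced code blocks, thematic breaks such as `* * *`, or empty list items.

```python
import re

# Optional indent, a bullet or ordered-list leader, and the spaces after it.
_ITEM = re.compile(r"^( *)([-*+]|\d{1,9}[.)])( +)")


def reindent_lists(text: str, width: int = 4) -> str:
    """Re-indent list content so each nesting level uses `width` spaces."""
    stack: list[int] = []  # original content columns of the open list items
    out = []
    for line in text.split("\n"):
        if not line.strip():
            out.append(line)  # blank lines neither open nor close an item
            continue
        indent = len(line) - len(line.lstrip(" "))
        # Close every item whose content column lies to the right of this line.
        while stack and stack[-1] > indent:
            stack.pop()
        old_base = stack[-1] if stack else 0
        new_base = width * len(stack)
        # Keep any extra offset the line had relative to its enclosing item.
        out.append(" " * (new_base + indent - old_base) + line[indent:])
        match = _ITEM.match(line)
        if match:
            # The matched text ends exactly where the item's content begins.
            stack.append(len(match.group(0)))
    return "\n".join(out)


print(reindent_lists("1. A\n   - B"))
```

For the example from this issue, the nested `- B` moves from 3 to 4 leading spaces. Continuation paragraphs inside an item are shifted the same way, which matches the 4-space expectation described above for python-markdown.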

pbodnar commented 5 months ago

Hi @masalim2, thanks for the report/request. The possibility of preserving the indentation was already partly (though only briefly) discussed in #197 (which was about preserving spaces after the leader).

I agree it might be a little bit tricky to keep the original indentation. We should probably also look at indentation in general, not just within lists. Currently, the MarkdownRenderer first renders the nested contents and then indents the resulting lines uniformly (in prefix_lines()), like here:

    def render_list_item(
        self, token: block_token.ListItem, max_line_length: int
    ) -> Iterable[str]:
        indentation = len(token.leader) + 1 if self.normalize_whitespace else token.prepend - token.indentation
        max_child_line_length = (
            max_line_length - indentation if max_line_length else None
        )
        lines = self.blocks_to_lines(
            token.children, max_line_length=max_child_line_length
        )
        return self.prefix_lines(
            list(lines) or [""],
            token.leader + " " * (indentation - len(token.leader)),
            " " * indentation
        )
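
The effect of this uniform prefixing can be reproduced without mistletoe. The sketch below uses `prefix_lines_sketch`, a simplified, hypothetical stand-in for `prefix_lines()` (the real method may behave differently in details), and feeds it the already-rendered child lines of the item from this issue.

```python
from typing import Iterable, Iterator


def prefix_lines_sketch(
    lines: Iterable[str], first_prefix: str, rest_prefix: str
) -> Iterator[str]:
    """Prefix the first line with the leader and every later line with spaces."""
    prefix = first_prefix
    for line in lines:
        # Blank lines get no trailing whitespace from the prefix.
        yield prefix + line if line else prefix.rstrip()
        prefix = rest_prefix


leader = "1."
indentation = len(leader) + 1  # 3, mirroring the normalize_whitespace branch
child_lines = ["A", "- B"]     # the item's paragraph, then the rendered nested list

rendered = list(
    prefix_lines_sketch(
        child_lines,
        leader + " " * (indentation - len(leader)),
        " " * indentation,
    )
)
print("\n".join(rendered))
```

Because the nested list has already been rendered to plain lines by the time it is prefixed, it is treated like any other continuation line and ends up at 3 spaces.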

Without digging deeper, I cannot currently come up with a "quick monkeypatch". But maybe @anderskaplan, the main author of the MarkdownRenderer, could come up with something? :)

nijel commented 5 months ago

https://github.com/miyuchina/mistletoe/pull/215 attempts to address this (discovered via https://github.com/translate/translate/issues/5237).

nijel commented 3 months ago

@pbodnar Any plans on releasing 1.4.0? Can I help with that somehow? I'd like to see this fixed in a released version.

pbodnar commented 3 months ago

@nijel, I'd like to release 1.4.0 soon. If not this weekend, then the next one. I just need to fix one build configuration issue and update the benchmarks before releasing it...

pbodnar commented 2 months ago

Released. :)