facelessuser / pymdown-extensions

Extensions for Python Markdown
https://facelessuser.github.io/pymdown-extensions/
Other
949 stars 253 forks

"striphtml" does remove the excerpt in blogs #2427

Closed weyCC81 closed 1 month ago

weyCC81 commented 1 month ago

Description

Currently, the extension "pymdownx.striphtml" affects the Blog module: the overview (/blog/index.html) shows the full content instead of just a snippet. Can this be changed?

The following line should not be removed/skipped in .md files:

<!-- more -->

Output without active striphtml (snippet is shown + link):

# EN
<nav class="md-post__action">
   <a href="2024/09/02/[name of post].html">Read more</a>
</nav>
# DE
<nav class="md-post__action">
   <a href="2024/09/02/[name of post].html">Weiterlesen</a>
</nav>

Output with active striphtml (the whole article is shown instead):

<nav> is missing and more content is shown on the main page than necessary/planned

Source: https://squidfunk.github.io/mkdocs-material/tutorials/blogs/basic/?h=blog

Minimal Reproduction

  1. Enable the "blog" plugin in mkdocs.yaml
  2. Add <!-- more --> to the .md file of a blog post; only the introduction before it should then be shown in the overview
  3. Build the site (mkdocs serve or mkdocs build)
  4. Verify the output (only the snippet/cover text should be shown before opening the full blog post/article)

Version(s) & System Info

facelessuser commented 1 month ago

import markdown

MD = """
<!-- more -->

# EN
<nav class="md-post__action">
   <a href="2024/09/02/[name of post].html">Read more</a>
</nav>
# DE
<nav class="md-post__action">
   <a href="2024/09/02/[name of post].html">Weiterlesen</a>
</nav>
"""

print(markdown.markdown(MD, extensions=['pymdownx.striphtml']))

<h1>EN</h1>
<nav class="md-post__action">
   <a href="2024/09/02/[name of post].html">Read more</a>
</nav>
<h1>DE</h1>
<nav class="md-post__action">
   <a href="2024/09/02/[name of post].html">Weiterlesen</a>
</nav>

As you can see above, things seem to be working just fine.

If you'd like me to investigate further, please provide a minimal, reproducible example, not a link to the MkDocs Material source. Material provides instructions as to how to create such a minimal, reproducible example, please use it and provide it here.

@gir-bot add S: more-info-needed

weyCC81 commented 1 month ago

I see the explanation was not clear enough...

Behavior with the Blog module and "striphtml" disabled: [image]

Current behavior with the Blog module and "striphtml" enabled: [image]

In summary, I would like <!-- more --> to be exempt from stripping, while all other <!-- something --> comments should still be removed.

PS: I have also updated my first post

facelessuser commented 1 month ago

You've filed this as a bug, but you seem to be suggesting new functionality.

I think you may have misunderstood the intentions of StripHTML. It is meant to provide simple strip functionality as described in the documentation. It does not have the intelligence to parse a given comment and know that only the next HTML section afterward is supposed to be stripped. Such functionality does not exist and is not planned.

If you are interested in such functionality, you may consider multi-language plugins for MkDocs or construct your own MkDocs plugin that can walk the HTML node structure and provide advanced control over which nodes you'd like to remove.

weyCC81 commented 1 month ago

Correct, this could also be an Option instead of a native integration, as the Blog module is not always enabled in mkdocs-material.

Option | Type | Default | Description
strip_more_command_in_blog | bool | True | The string <!-- more --> is excluded from stripping

Then I am probably wrong; I thought that if <!-- more --> were not stripped, this command could work again in the blog module...

> It does not have the intelligence to parse a given comment and know that only the next HTML section afterward is supposed to be stripped. Such functionality does not exist and is not planned.

I believe this is already integrated in mkdocs-material, but it is not working because too much is getting stripped.
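
To illustrate the suspected interaction, here is a minimal sketch in plain Python. It assumes the excerpt is produced by cutting the rendered HTML at the first <!-- more --> separator; that is an assumption for illustration and is not confirmed anywhere in this thread. The helper names strip_all_comments and make_excerpt are hypothetical and belong to neither project.

```python
import re

SEPARATOR = "<!-- more -->"  # the separator discussed in this thread


def strip_all_comments(html: str) -> str:
    """Remove every HTML comment (a stand-in for blanket comment stripping)."""
    return re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)


def make_excerpt(html: str, separator: str = SEPARATOR) -> str:
    """Hypothetical excerpt logic: keep everything before the first separator."""
    return html.split(separator, 1)[0]


rendered = "<p>Intro</p>\n<!-- more -->\n<p>Rest of the post</p>"

# Separator still present: only the introduction is kept.
print(make_excerpt(rendered))

# Separator stripped first: there is nothing to split on, so the whole post is kept.
print(make_excerpt(strip_all_comments(rendered)))
```

If the assumption holds, the order of operations is the whole problem: once the comment is gone, nothing is left to mark where the excerpt ends.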

facelessuser commented 1 month ago

This request is hyper-specific to MkDocs Material. Keep in mind these are Python Markdown extensions, not MkDocs extensions, and definitely not MkDocs Material extensions. All extensions are meant to be generic, for use anywhere. If we were to do anything like this, it would need to be generic, like letting the user define specific comments to ignore.

In general, I would argue if you have comments that have importance, comment stripping may not be something you want enabled.
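
For readers who want the generic behavior described above in their own tooling, here is a rough sketch of comment stripping with a user-defined ignore list. It is not part of pymdownx.striphtml; the function strip_comments_except and its keep parameter are hypothetical names, and the regular expression only handles plain, well-formed comments.

```python
import re

# Matches a plain HTML comment and captures its inner text.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def strip_comments_except(html: str, keep: set[str]) -> str:
    """Remove HTML comments unless their trimmed text is listed in `keep`."""

    def replace(match: re.Match) -> str:
        # Keep the comment verbatim if it is on the ignore list, else drop it.
        return match.group(0) if match.group(1).strip() in keep else ""

    return COMMENT_RE.sub(replace, html)


html = "<p>Intro</p><!-- more --><!-- draft note --><p>Rest</p>"
result = strip_comments_except(html, keep={"more"})
print(result)
```

Something like this could run as a post-processing step over the generated HTML, which keeps the rule outside the Markdown extension itself.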

facelessuser commented 1 month ago

Currently, we have no plans to enhance the comment stripping logic with advanced user-created rules. Our current advice is to set strip_comments to False if you'd like to keep the other stripping features enabled but leave comments untouched.

weyCC81 commented 1 month ago

Thanks for your responses and for outlining the scope of your project. I also find it problematic that they use <!-- more --> as a command within the blog module. Nonetheless, I'd like to strip comments before publishing blog posts, and it seems that using this module is the simplest solution. I've attached a draft that might help others facing the same issue with MkDocs Material extensions and this module. Thank you for your contributions, both past and future!

striphtml.zip