facelessuser / pyspelling

Spell checker automation tool
https://facelessuser.github.io/pyspelling/
MIT License

pyspelling markdown filter is not processing <details> & <summary> tag blocks in GitHub markdown files. #192

Open jrzyshr opened 3 months ago

jrzyshr commented 3 months ago

I recently started adding collapsible sections to the Markdown files in my repo to hide certain sections as per the GitHub docs: https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/organizing-information-with-collapsed-sections

Markdown sample from the GitHub docs link above:

<details>
<summary>Tips for collapsed sections</summary>

### You can add a header

You can add text within a collapsed section. 

You can add an image or a code block, too.

```ruby
   puts "Hello World"
```

</details>

This led to an issue where GitHub Pages did not render the HTML properly: everything in the Markdown file after the first details tag was rendered as raw, unformatted Markdown. I was able to fix this based on an answer deep in another repo's issue: https://github.com/gettalong/kramdown/issues/155#issuecomment-1024896918

While GitHub Pages was failing to render, I also observed that pyspelling appeared to be evaluating all text beyond the first details tag in the Markdown file for spelling issues. The fix above resolved the rendering issue for GitHub Pages, but it has not resolved the pyspelling issue.

I understand that "pyspelling.filters.markdown" converts Markdown to HTML, then passes the HTML content off to pyspelling.  I suspect the "pyspelling.filters.markdown" filter is also not processing the Markdown content after the first details tag in a Markdown file.

Here is a specific example from my repo where PySpelling is not working properly: https://github.com/microsoft/WhatTheHack/blob/master/015-Serverless/Student/Challenge-01.md#setup-local-workstation

Any suggestions on how to resolve this issue?

facelessuser commented 3 months ago

It is unclear to me how you've configured pyspelling or its related filters. Please provide a minimal, reproducible example.

jrzyshr commented 3 months ago

Here's the pyspelling config for our repo: https://github.com/microsoft/WhatTheHack/blob/master/.github/workflows/spell-check/spellcheck.yml

jrzyshr commented 3 months ago

Here's a minimal Markdown sample that causes pyspelling to flag errors where there shouldn't be any:

If you want to set up the developer environment on your local workstation, expand the section below and follow the requirements listed.
<details markdown=1>
<summary markdown="span"><strong>Click to expand/collapse Hidable Section</strong></summary>

Some content
`code snippet foobar`
[link](https://url.notarealword.com)
More content
</details>

In the sample Markdown above, pyspelling flags "notarealword" from the URL and "foobar" from the code snippet. I assume this is because the pyspelling Markdown filter is not rendering the content after the details tag to HTML, so the pre, code, and a (href) tags never appear in the HTML output that is sent to the spell checker, which then treats their contents like any other text.

facelessuser commented 3 months ago

So, the current Markdown filter uses Python Markdown. Python Markdown will not process Markdown content inside a block HTML tag unless the `md_in_html` extension is included. I assume that was your intention, since you've added `markdown=1`, but I do not see that extension in your config; therefore, neither code blocks nor links are rendered inside `<details>`.
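For readers landing here with the same problem, a minimal sketch of what enabling the extension could look like. This is not the reporter's actual config: the task name and source file are placeholders, and the `ignores` list is an assumption about how code is being excluded. It relies only on the Markdown filter's `markdown_extensions` option and the HTML filter's `comments` and `ignores` options.

```yaml
# Hypothetical pyspelling configuration; the task name and source file are placeholders.
matrix:
- name: markdown
  sources:
  - README.md
  pipeline:
  # Render Markdown to HTML first. md_in_html is the built-in Python Markdown
  # extension that parses Markdown inside block HTML tags marked markdown=1.
  - pyspelling.filters.markdown:
      markdown_extensions:
      - md_in_html
  # Then extract text from the HTML, skipping code. Attribute values such as
  # href are not spell checked unless they are listed under `attributes`.
  - pyspelling.filters.html:
      comments: false
      ignores:
      - code
      - pre
```

With the extension enabled, the inline code and the link in the sample above should be rendered as `code` and `a` elements, so "foobar" falls under the ignored `code` selector and "notarealword" sits in an unchecked `href` attribute.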

jrzyshr commented 3 months ago

Thank YOU! That was the fix! I guess I was only looking at your extensions (https://facelessuser.github.io/pymdown-extensions/extensions/), not the extensions for Python Markdown themselves.

I'm happy to close the issue, but I'm not sure whether it's worth calling this out for others who may stumble upon the same scenario.

facelessuser commented 3 months ago

It wouldn't hurt to mention in the filter documentation that people can use the built-in markdown extensions or others.
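As an illustration of that point, a hedged sketch of a `markdown_extensions` list that mixes Python Markdown's built-in extensions with a third-party one. The task name, source file, and choice of extensions are assumptions made purely for illustration; `pymdownx.superfences` comes from the pymdown-extensions package, which has to be installed separately.

```yaml
# Illustrative only; the task name, source file, and choice of extensions are assumptions.
matrix:
- name: markdown
  sources:
  - README.md
  pipeline:
  - pyspelling.filters.markdown:
      markdown_extensions:
      # Built-in Python Markdown extensions, referenced by full module path.
      - markdown.extensions.md_in_html
      - markdown.extensions.tables
      # Third-party extension from pymdown-extensions (installed separately).
      - pymdownx.superfences
  - pyspelling.filters.html:
      comments: false
```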