zopieux / py-gfm

Github-Flavored Markdown for Python-Markdown.
https://pypi.org/project/py-gfm/
BSD 3-Clause "New" or "Revised" License

py-gfm creates "invalid" code markup. <code></code> is missing. #7

Closed: santiagobasulto closed this 4 years ago

santiagobasulto commented 8 years ago

According to the HTML5 spec, the correct way to mark up code blocks is with the <code> tag. Markdown respects that, as you can see in this example:

import markdown

source = """
```python
print("Hello World")
```"""
markdown.markdown(source)
# <p><code>python\nprint("Hello World")</code></p>

But py-gfm doesn't. Example using the same source:

markdown.markdown(source, extensions=['gfm'])

Produces this output:

<div class="highlight"><pre><span></span><span class="k">print</span><span class="p">(</span><span class="s2">&quot;Hello World&quot;</span><span class="p">)</span>\n</pre></div>

zopieux commented 7 years ago

Please see this (oldish and still open) pull request from Pygments. The built-in codehilite extension uses Pygments when available, and Pygments' default HTML formatter adds no <code> tag; according to the linked PR, this is for CSS theming reasons.

There is actually this Pygments stub (search for _wrap_code, this doc has zero anchors 😠) to add a <code> block to the HTML formatter. Alas, codehilite gives no control over the instantiation of the Pygments formatter, so it's not possible to provide this kind of custom subclass.
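
For reference, here is a minimal sketch of what such a subclass might look like when Pygments is used directly. The class name `CodeWrappingHtmlFormatter` and the helper `_add_code_tag` are made up for illustration and are not the stub from the Pygments docs; the `*args` is an assumption meant to cope with older releases that appear to pass an extra `outfile` argument to `wrap()`.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import PythonLexer


class CodeWrappingHtmlFormatter(HtmlFormatter):
    """Hypothetical subclass: emit <pre><code>...</code></pre> instead of a bare <pre>."""

    def wrap(self, source, *args):
        # Assumption: *args absorbs the `outfile` argument that older
        # Pygments releases pass to wrap(); newer ones pass only `source`.
        return super().wrap(self._add_code_tag(source), *args)

    def _add_code_tag(self, source):
        # `source` yields (flag, html) pairs; a flag of 0 marks wrapper
        # markup rather than a line of highlighted code.
        yield 0, "<code>"
        yield from source
        yield 0, "</code>"


html = highlight('print("Hello World")', PythonLexer(), CodeWrappingHtmlFormatter())
print(html)
```

Standalone this works, but since codehilite builds its own formatter, there is no hook to pass such a class in.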

As a result, this behavior actually has nothing to do with py-gfm (the snippet below uses no py-gfm code):

>>> import markdown
>>> from markdown.extensions.codehilite import CodeHiliteExtension

>>> source = """
...     #!python
...     print("Hello World")
... """

>>> markdown.markdown(source, extensions=[])
'<pre><code>#!python\nprint("Hello World")\n</code></pre>'

>>> markdown.markdown(source, extensions=[CodeHiliteExtension(use_pygments=False)])
'<pre class="codehilite"><code class="language-python linenums">print(&quot;Hello World&quot;)</code></pre>'

>>> markdown.markdown(source, extensions=[CodeHiliteExtension()])
# no <code>!
'<table class="codehilitetable"><tr><td class="linenos"><div class="linenodiv"><pre>1</pre></div></td><td class="code"><div class="codehilite"><pre><span></span><span class="k">print</span><span class="p">(</span><span class="s2">&quot;Hello World&quot;</span><span class="p">)</span>\n</pre></div>\n</td></tr></table>'

I am not sure what to do about that. Adding an option to toggle <code> would mean I have to ship a custom version of codehilite, which is overkill. For the moment I prefer to keep this behavior, which is pygments' default. Further feedback & ideas are of course welcome, ping @santiagobasulto.
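
To compare outputs like the ones above without eyeballing them, a small stdlib-only sketch along these lines could help. It only reports whether every <pre> element in a rendered string contains a <code> element; `CodeInPreChecker` and `pre_blocks_have_code` are hypothetical names, not part of Markdown, Pygments or py-gfm.

```python
from html.parser import HTMLParser


class CodeInPreChecker(HTMLParser):
    """Count <pre> elements and how many of them lack a nested <code>."""

    def __init__(self):
        super().__init__()
        self.pre_depth = 0         # number of <pre> elements currently open
        self.seen_code = False     # whether the current <pre> contains <code>
        self.pre_count = 0         # total <pre> elements seen
        self.pre_without_code = 0  # <pre> elements closed without a <code>

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.pre_depth += 1
            self.pre_count += 1
            self.seen_code = False
        elif tag == "code" and self.pre_depth:
            self.seen_code = True

    def handle_endtag(self, tag):
        if tag == "pre" and self.pre_depth:
            self.pre_depth -= 1
            if not self.seen_code:
                self.pre_without_code += 1


def pre_blocks_have_code(html):
    """Return True if there is at least one <pre> and each one nests a <code>."""
    checker = CodeInPreChecker()
    checker.feed(html)
    checker.close()
    return checker.pre_count > 0 and checker.pre_without_code == 0


plain = '<pre><code>#!python\nprint("Hello World")\n</code></pre>'
highlighted = '<div class="codehilite"><pre><span></span><span class="k">print</span></pre></div>'
print(pre_blocks_have_code(plain), pre_blocks_have_code(highlighted))
```

Note that the line-number table produced by codehilite also contains a <pre> for the numbers themselves, so this check is stricter than the HTML5 recommendation strictly needs.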

stuaxo commented 5 years ago

Hi, I had a look at the linked ticket and it looks like this may have been fixed in pygments - https://bitbucket.org/birkenfeld/pygments-main/pull-requests/438/add-a-tag-to-preformatted-code-blocks-in/diff

so may be worth revisiting this ticket.
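
A quick way to try this outside of Markdown, assuming the change corresponds to the `wrapcode` option that the Pygments docs list for `HtmlFormatter` (marked as new in 2.4); whether that option is exactly what the linked pull request introduced is my assumption:

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import PythonLexer

# wrapcode=True asks the stock HTML formatter to nest <code> inside <pre>.
formatter = HtmlFormatter(wrapcode=True)
html = highlight('print("Hello World")', PythonLexer(), formatter)
print(html)
```

This only shows Pygments on its own; whether codehilite or py-gfm passes the option through is a separate question.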

Cheers S

zopieux commented 4 years ago

I believe this is fixed in the just-released 1.0.0 revision.