trentm / python-markdown2

markdown2: A fast and complete implementation of Markdown in Python

Conversion Issue with Single Line Breaks in Tables #557

Closed: syntaxsurge closed this issue 8 months ago

syntaxsurge commented 8 months ago

Hello markdown2 team,

I've encountered an issue with the conversion of Markdown tables to HTML, particularly when a table follows the preceding text after only a single line break, with no blank line in between. Below are some sample inputs to illustrate the problem:

Sample Input 1:

import markdown2

markdown_outline = """
OpenAI's Growth Trajectory:
| Version | Parameters | Abilities                       |
|---------|------------|---------------------------------|
| GPT     | 117M       | Basic understanding of language |
| GPT-2   | 1.5B       | More nuanced language processing|
| GPT-3   | 175B       | Highly advanced AI capabilities |
"""

html_outline = markdown2.markdown(
    markdown_outline,
    extras=['tables', 'footnotes', 'markdown-in-html', 'cuddled-lists']
)

print(html_outline)

Sample Input 2:

import markdown2

markdown_outline = """
<p>Here's a look at some <strong>real-world achievements</strong> of OpenAI models:
| OpenAI Model Version | Superpower                                           | Real-world Application                                 |
|----------------------|------------------------------------------------------|--------------------------------------------------------|
| GPT-3                | Human-like text generation                           | Advanced chatbots, Copywriting tools, Personal assistants |
| DALL-E               | Image creation from text descriptions                | Branding materials, Concept art, Product design          |
| Codex                | Understanding and generating computer code           | Developer tools, Education platforms, Automating tasks    |
| GPT-2                | Text generation and translation                      | Content creation, Language translation services           |</p>
"""

html_outline = markdown2.markdown(
    markdown_outline,
    extras=['tables', 'footnotes', 'markdown-in-html', 'cuddled-lists']
)

print(html_outline)

Sample Input 3:

import markdown2

markdown_outline = """
<p><strong>OpenAI's Growth Trajectory:</strong>
| Version | Parameters | Abilities                       |
|---------|------------|---------------------------------|
| GPT     | 117M       | Basic understanding of language |
| GPT-2   | 1.5B       | More nuanced language processing|
| GPT-3   | 175B       | Highly advanced AI capabilities |</p>
"""

html_outline = markdown2.markdown(
    markdown_outline,
    extras=['tables', 'footnotes', 'markdown-in-html', 'cuddled-lists']
)

print(html_outline)

In each of these samples, the table does not seem to be converted correctly when it follows the preceding text after a single line break, whether that text is plain Markdown or sits inside a <p> block. Could you please look into this issue?

Thanks for your assistance!

Crozzers commented 8 months ago

Did some digging. It looks like this was intentional behaviour in the original implementation, which appears to be based on GFM. Given that GFM now allows tables cuddled to the previous paragraph, I don't see why we shouldn't allow this behaviour.
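
For markdown2 versions that still require a blank line before a table, one possible workaround is to pre-process the input so that a cuddled table is separated from the paragraph above it. The following is a minimal sketch under that assumption; `separate_cuddled_tables` and `_DELIM_ROW` are hypothetical names, not part of markdown2, and the sketch only targets the plain-text case (Sample Input 1), not tables wrapped inside a `<p>` block.

```python
import re

# Hypothetical helper, NOT part of markdown2.
# Matches a table delimiter row such as |---|:---:|---| (dashes, optional colons, pipes).
_DELIM_ROW = re.compile(r'^\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?\s*$')


def separate_cuddled_tables(text: str) -> str:
    """Insert a blank line before any table header row that directly
    follows a non-blank line."""
    lines = text.split("\n")
    out = []
    for i, line in enumerate(lines):
        # A header row contains a pipe and is followed by a delimiter row.
        is_header = (
            "|" in line
            and i + 1 < len(lines)
            and "|" in lines[i + 1]
            and _DELIM_ROW.match(lines[i + 1]) is not None
        )
        # Only add a separator when the previous emitted line is non-blank.
        if is_header and out and out[-1].strip():
            out.append("")
        out.append(line)
    return "\n".join(out)
```

The result can then be passed to `markdown2.markdown(..., extras=['tables'])` as in the samples above. Input that already has a blank line before the table is returned unchanged.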