Alir3z4 / html2text

Convert HTML to Markdown-formatted text.
alir3z4.github.io/html2text/
GNU General Public License v3.0
1.76k stars 270 forks

colspan is entirely ignored, which results in misformatted tables #367

Open johnkw opened 3 years ago

johnkw commented 3 years ago

For example, this test:

echo '<table><tr><th colspan="2">A</th><th>B</th></tr><tr><td>1</td><td>2</td><td>3</td></tr></table>' | html2text --pad-tables

The output is currently mangled, with the wrong number of columns in the header:

A   | B
----|----
1   | 2  | 3

Today I learned: Markdown, bizarrely, has no defined way of dealing with colspan or rowspan. The correct syntax is therefore undefined, but adding a blank dummy cell for each extra column a cell spans seems to be the best option at the moment, so the output should be:

 A |   | B 
---|---|---
 1 | 2 | 3 

Or, as rendered by GitHub's implementation of Markdown:

A B
1 2 3
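
To make the blank dummy cell idea concrete, here is a minimal standalone sketch. It is not html2text's code and does not touch its internals; it only uses the standard library's html.parser, and the names TableRows and to_markdown are made up for illustration. It assumes a single simple table with no nesting and no rowspan.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Hypothetical helper: collect rows, padding colspan with blank cells."""

    def __init__(self):
        super().__init__()
        self.rows = []    # finished rows, each a list of cell strings
        self.span = None  # colspan of the cell being read, None outside a cell
        self.text = []    # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            value = dict(attrs).get("colspan") or "1"
            # Fall back to 1 for missing, empty, or non-numeric values.
            self.span = int(value) if value.isdigit() else 1
            self.text = []

    def handle_data(self, data):
        if self.span is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.span is not None:
            if self.rows:
                cell = "".join(self.text).strip()
                # One real cell followed by (colspan - 1) blank dummy cells.
                blanks = [""] * (max(self.span, 1) - 1)
                self.rows[-1].extend([cell] + blanks)
            self.span = None


def to_markdown(html):
    """Render the collected rows as a padded Markdown table."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    rows = [row for row in parser.rows if row]
    if not rows:
        return ""
    # Pad short rows so every row has the same number of columns.
    width = max(len(row) for row in rows)
    rows = [row + [""] * (width - len(row)) for row in rows]
    # Each column is as wide as its longest cell, and at least 1.
    col_w = [max(1, max(len(row[i]) for row in rows)) for i in range(width)]

    def fmt(row):
        return "|".join(" " + c.ljust(w) + " " for c, w in zip(row, col_w))

    sep = "|".join("-" * (w + 2) for w in col_w)
    return "\n".join([fmt(rows[0]), sep] + [fmt(row) for row in rows[1:]])


if __name__ == "__main__":
    print(to_markdown(
        '<table><tr><th colspan="2">A</th><th>B</th></tr>'
        '<tr><td>1</td><td>2</td><td>3</td></tr></table>'
    ))
```

Run on the test input from this issue, the sketch yields the three-column table proposed above, with an empty second header cell. Handling rowspan would need extra bookkeeping across rows and is left out here.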