I use html2text with the --pad-tables flag in my mailcap to read HTML email. Occasionally, html2text will fail when attempting to process an unholy nested table mess.
For example, the following HTML demonstrates the problem:
Attempting to process this with the --pad-tables flag results in an IndexError:
Traceback (most recent call last):
  File "/usr/bin/html2text", line 33, in <module>
    sys.exit(load_entry_point('html2text==2020.1.16', 'console_scripts', 'html2text')())
  File "/usr/lib/python3.9/site-packages/html2text/cli.py", line 306, in main
    sys.stdout.write(h.handle(html))
  File "/usr/lib/python3.9/site-packages/html2text/__init__.py", line 146, in handle
    return pad_tables_in_text(markdown)
  File "/usr/lib/python3.9/site-packages/html2text/utils.py", line 273, in pad_tables_in_text
    table = reformat_table(table_buffer, right_margin)
  File "/usr/lib/python3.9/site-packages/html2text/utils.py", line 223, in reformat_table
    max_width = [len(x.rstrip()) + right_margin for x in lines[0].split("|")]
IndexError: list index out of range
It works fine without --pad-tables. If html2text cannot work out the padding, I would prefer that it fall back to rendering as if --pad-tables had not been given.
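To illustrate the fallback I have in mind, here is a rough sketch of a wrapper on the calling side, not a patch to html2text itself. The names handle_with_fallback and make_html2text are made up for this example. It assumes the failure always surfaces as an IndexError from handle(), as in the traceback above, and that the HTML2Text object's pad_tables attribute corresponds to the --pad-tables flag.

```python
def handle_with_fallback(html, make_converter):
    """Render with table padding first; retry without it on IndexError.

    make_converter is a hypothetical factory: it takes a bool (pad tables
    or not) and returns an object with a handle(html) method.
    """
    try:
        return make_converter(True).handle(html)
    except IndexError:
        # Padding failed (e.g. on deeply nested tables), so render the
        # same input again as if --pad-tables had not been given.
        return make_converter(False).handle(html)


def make_html2text(pad_tables):
    """Build a fresh html2text converter (assumes html2text is installed)."""
    import html2text  # imported lazily so the wrapper has no hard dependency

    converter = html2text.HTML2Text()
    converter.pad_tables = pad_tables
    return converter
```

With html2text installed, handle_with_fallback(html, make_html2text) would stand in for the plain h.handle(html) call. A new converter is built for each attempt so that no state from the failed run carries over into the retry.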