mailpile / moggie

A free & open modern, fast email client with user-friendly encryption and privacy features
https://mailpile.is

Add table support to Moggie's HTML-to-text conversion #5

Open BjarniRunar opened 1 year ago

BjarniRunar commented 1 year ago

Moggie's HTML-to-text conversion in moggie.security.html.HTMLToTextCleaner is used for generating most plain-text e-mail views. By default it is preferred over the text parts included in the e-mails themselves, simply because so many systems generate broken or incomplete text parts these days.

The class does a decent job generating readable text from HTML input, including links, images and tags such as pre, blockquote and ul - but the structure implied by a table is currently ignored.

This means we are losing some important information from e-mails sent by financial institutions, travel itineraries, and probably some others. (This isn't just layout for marketing messages!)

So we should support tables!

Currently the code inherits a depth-first algorithm from HTMLCleaner, which converts each tag into text in the rerender_tag method. This needs to change: the depth-first code will need to buffer the table contents (including tables within tables) and postpone processing them until the size and structure of the table are known. Only then can we allocate a width to each table column and render the cells side-by-side in the plain text.
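To make the idea concrete, here is a rough sketch of the buffer-then-render approach. It is not based on Moggie's actual code: it uses the standard library's html.parser instead of HTMLCleaner and rerender_tag, and every name in it (TableBufferingParser, render_table, allocate_widths, cell_lines) is hypothetical. The width allocation (start from natural widths, shrink the widest column until the row fits) is just one simple policy picked for illustration.

```python
import textwrap
from html.parser import HTMLParser


def cell_lines(cell, width=None):
    """Split a cell into lines, wrapping each one to `width` if given."""
    lines = cell.strip().splitlines() or [""]
    if width is None:
        return lines
    wrapped = []
    for line in lines:
        wrapped.extend(textwrap.wrap(line, width) or [""])
    return wrapped


def allocate_widths(rows, total_width, sep):
    """Start from each column's natural width, then shrink the widest to fit."""
    ncols = max(len(row) for row in rows)
    widths = [1] * ncols
    for row in rows:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], *(len(l) for l in cell_lines(cell)))
    budget = max(ncols, total_width - len(sep) * (ncols - 1))
    while sum(widths) > budget:
        widths[widths.index(max(widths))] -= 1
    return widths


def render_table(rows, total_width=72, sep=" | "):
    """Render buffered rows (lists of cell strings) side by side."""
    widths = allocate_widths(rows, total_width, sep)
    out = []
    for row in rows:
        # Pad short rows so every row has the same number of cells.
        cells = list(row) + [""] * (len(widths) - len(row))
        wrapped = [cell_lines(c, w) for c, w in zip(cells, widths)]
        height = max(len(lines) for lines in wrapped)
        for n in range(height):
            parts = [
                (lines[n] if n < len(lines) else "").ljust(w)
                for lines, w in zip(wrapped, widths)
            ]
            out.append(sep.join(parts).rstrip())
    return "\n".join(out)


class TableBufferingParser(HTMLParser):
    """Hypothetical stand-in for the cleaner: buffer tables, render on </table>."""

    def __init__(self, total_width=72):
        super().__init__()
        self.total_width = total_width
        self.output = []   # finished top-level text fragments
        self.tables = []   # stack of open tables; each is a list of rows

    def _emit(self, text):
        # Text lands in the innermost open cell, or in the top-level output.
        # Text inside a table but outside any cell is dropped.
        if self.tables and self.tables[-1] and self.tables[-1][-1]:
            self.tables[-1][-1][-1] += text
        elif not self.tables:
            self.output.append(text)

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self.tables[-1].append([])
        elif tag in ("td", "th") and self.tables:
            if not self.tables[-1]:
                self.tables[-1].append([])  # tolerate a missing <tr>
            self.tables[-1][-1].append("")

    def handle_endtag(self, tag):
        if tag == "table" and self.tables:
            # Only now is the table's full structure known, so render it.
            rows = [row for row in self.tables.pop() if row]
            if rows:
                # A nested table becomes multi-line text in its parent cell.
                self._emit("\n" + render_table(rows, self.total_width) + "\n")

    def handle_data(self, data):
        self._emit(data)
```

Feeding a document through parser.feed(html) and joining parser.output gives the text with tables laid out in columns. The sketch ignores colspan/rowspan, and a nested table that is wider than the column it ends up in will be re-wrapped and lose its alignment, so a real implementation would probably want to pass a reduced width budget down to inner tables.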