deathau / markdownload

A Firefox and Google Chrome extension to clip websites and download them into a readable markdown file.
Apache License 2.0

Fixed unreadable code blocks. #312

Closed WetHat closed 3 months ago

WetHat commented 7 months ago

This fixes issues #191 and #272, and most likely #278.

The ugly, flattened rendering of code blocks is caused by removing the <code> element from its owning <pre>. Once that happens, code.innerText loses all spacing and line feeds. A second fix was also needed because the turndown function (background.js) stripped all tab characters (\t), which caused code blocks to lose their indentation.
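To illustrate the tab problem, here is a minimal sketch of a clean-up step that removes non-printing control characters but leaves tabs and line breaks alone. The function name stripControlChars and the exact character ranges are assumptions made for this example; they are not taken from background.js.

```javascript
// Hypothetical helper, not the extension's actual code.
// Removes C0 control characters and DEL from converted Markdown, but keeps
// tab (\u0009), line feed (\u000a) and carriage return (\u000d) so that
// indentation and line structure inside code blocks survive.
function stripControlChars(markdown) {
  return markdown.replace(/[\u0000-\u0008\u000b\u000c\u000e-\u001f\u007f]/g, '');
}

// A tab-indented snippet with a stray NUL character in it.
const sample = 'if (x) {\n\treturn\u0000 1;\n}';
console.log(JSON.stringify(stripControlChars(sample)));
```

If the range started at \u0000 and ran through \u0009 instead, the tab would be removed together with the other control characters, which is the kind of indent loss described above.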

This part of the fix addresses the site:

The links to Medium articles are a different story. The code in these articles is not in proper code blocks. It looks like this:

<pre>$ curl https://<span class="hljs-built_in">get</span>.docker.<span class="hljs-keyword">com</span> | <span class="hljs-keyword">sh</span></pre>

The code has been pre-processed and then placed directly under a <pre> tag. There is no <code> element.

This is addressed by the second part of the fix, which returns the content of <pre> elements as text rather than HTML. However, there is no syntax highlighting because the <code> element is missing.
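The idea of treating <pre> content as text can be sketched as follows. In a browser, reading pre.textContent would be the direct route; the stand-alone version below works on the inner HTML string instead so that it runs without a DOM. The helper names preInnerHtmlToText and toFencedBlock are hypothetical, and this is an illustration of the approach rather than the code in the pull request.

```javascript
// Hypothetical helpers, not the extension's actual code.

// Approximates textContent for the inner HTML of a <pre>: drops all tags
// (such as the highlighter's <span> wrappers) and decodes a few common
// entities in a single pass, leaving whitespace untouched.
function preInnerHtmlToText(innerHtml) {
  const entities = { '&lt;': '<', '&gt;': '>', '&quot;': '"', '&#39;': "'", '&amp;': '&' };
  return innerHtml
    .replace(/<[^>]*>/g, '')
    .replace(/&(?:lt|gt|quot|#39|amp);/g, (match) => entities[match]);
}

// Wraps plain text in a Markdown fence. No language tag is emitted because,
// without a <code> element, there is no class to read a language from.
function toFencedBlock(text) {
  const fence = '`'.repeat(3);
  return fence + '\n' + text + '\n' + fence;
}

// The Medium-style markup quoted above.
const innerHtml =
  '$ curl https://<span class="hljs-built_in">get</span>.docker.' +
  '<span class="hljs-keyword">com</span> | <span class="hljs-keyword">sh</span>';

console.log(toFencedBlock(preInnerHtmlToText(innerHtml)));
```

Because the fence carries no language tag, the resulting Markdown block renders without syntax highlighting, which matches the limitation noted above.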

This addresses:

#278 is most likely fixed too, but I was not able to test this.