mixmark-io / turndown

🛏 An HTML to Markdown converter written in JavaScript
https://mixmark-io.github.io/turndown
MIT License
8.52k stars, 864 forks

Support pasting code blocks from VS Code #468

Closed by yuri2peter 1 month ago

yuri2peter commented 1 month ago

Currently, Turndown does not correctly parse code blocks copied from VS Code.

pavelhoral commented 1 month ago

Can you be more specific? Ideally with an example.

What does "code blocks copied from VS Code" mean? Copied to where?

yuri2peter commented 1 month ago

When copying code from VS Code, the resulting HTML content is as follows:

<html>

<body>
  <!--StartFragment-->
  <div
    style="color: #cccccc;background-color: #1f1f1f;font-family: Consolas, 'Courier New', monospace;font-weight: normal;font-size: 13px;line-height: 18px;white-space: pre;">
    <div><span style="color: #c586c0;">export</span><span style="color: #cccccc;"> </span><span
        style="color: #569cd6;">function</span><span style="color: #cccccc;"> </span><span
        style="color: #dcdcaa;">alertSelectionIsEmpty</span><span style="color: #cccccc;">() {</span></div>
    <div><span style="color: #cccccc;">&#160; </span><span style="color: #569cd6;">const</span><span
        style="color: #cccccc;"> </span><span style="color: #4fc1ff;">str</span><span style="color: #cccccc;">
      </span><span style="color: #d4d4d4;">=</span><span style="color: #cccccc;"> </span><span
        style="color: #ce9178;">'Please select some text first.'</span><span style="color: #cccccc;">;</span></div>
    <div><span style="color: #cccccc;">&#160; </span><span style="color: #4fc1ff;">notifications</span><span
        style="color: #cccccc;">.</span><span style="color: #dcdcaa;">show</span><span style="color: #cccccc;">({</span>
    </div>
    <div><span style="color: #cccccc;">&#160; &#160; </span><span style="color: #9cdcfe;">title</span><span
        style="color: #9cdcfe;">:</span><span style="color: #cccccc;"> </span><span
        style="color: #ce9178;">'Error'</span><span style="color: #cccccc;">,</span></div>
    <div><span style="color: #cccccc;">&#160; &#160; </span><span style="color: #9cdcfe;">message</span><span
        style="color: #9cdcfe;">:</span><span style="color: #cccccc;"> </span><span
        style="color: #4fc1ff;">str</span><span style="color: #cccccc;">,</span></div>
    <div><span style="color: #cccccc;">&#160; &#160; </span><span style="color: #9cdcfe;">color</span><span
        style="color: #9cdcfe;">:</span><span style="color: #cccccc;"> </span><span
        style="color: #ce9178;">'red'</span><span style="color: #cccccc;">,</span></div>
    <div><span style="color: #cccccc;">&#160; });</span></div>
    <div><span style="color: #cccccc;">}</span></div>
  </div><!--EndFragment-->
</body>

</html>

Turndown does not convert this into a Markdown code block.

Given the widespread use of editors like VS Code, should Turndown support this so that it integrates more easily into text editor components?

In my own project, I worked around this problem with some hacky code.
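
A workaround of this kind might look roughly like the sketch below. It is not the code from the project mentioned above and not part of Turndown; `vscodeHtmlToFencedCode` and `decodeEntities` are hypothetical helper names. It assumes the raw clipboard HTML (the sample above has been pretty-printed, which adds whitespace), an outer `<div>` whose style contains `white-space: pre`, and one attribute-less `<div>` per source line.

```javascript
// Hypothetical pre-processing helper; none of these names come from Turndown.
const NAMED_ENTITIES = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'", nbsp: '\u00a0' };

// Decode numeric entities and a small set of named ones in a single pass.
function decodeEntities(text) {
  return text.replace(/&(#x[0-9a-f]+|#\d+|[a-z]+);/gi, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x' || body[1] === 'X';
      const codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      return codePoint > 0x10ffff ? match : String.fromCodePoint(codePoint);
    }
    const named = NAMED_ENTITIES[body.toLowerCase()];
    return named === undefined ? match : named;
  });
}

// Returns a fenced Markdown code block, or null if the HTML does not look
// like a VS Code selection (heuristic: wrapper div with `white-space: pre`).
function vscodeHtmlToFencedCode(html, language = '') {
  if (!/<div[^>]*style="[^"]*white-space:\s*pre/i.test(html)) return null;

  const lines = [];
  const lineDiv = /<div>([\s\S]*?)<\/div>/g; // assumption: one bare <div> per line
  let match;
  while ((match = lineDiv.exec(html)) !== null) {
    // Strip the colouring spans first, then decode entities such as &#160;.
    const text = decodeEntities(match[1].replace(/<[^>]*>/g, ''));
    // Non-breaking spaces become plain spaces; trailing whitespace is dropped.
    lines.push(text.replace(/\u00a0/g, ' ').replace(/\s+$/, ''));
  }
  if (lines.length === 0) return null;

  const fence = '`'.repeat(3);
  return fence + language + '\n' + lines.join('\n') + '\n' + fence;
}

// Usage: try the helper first, and fall back to the normal conversion on null.
const clipboardHtml =
  '<div style="font-family: Consolas, monospace;white-space: pre;">' +
  '<div><span style="color: #569cd6;">const</span><span> answer = 42;</span></div>' +
  '</div>';
console.log(vscodeHtmlToFencedCode(clipboardHtml, 'js'));
```

The helper works on the string directly so that it can run before any HTML-to-Markdown conversion; when it returns `null`, the HTML can be handed to Turndown unchanged. A custom rule registered through Turndown's `addRule` would be another place to put this logic, though that variant is not shown here.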

pavelhoral commented 1 month ago

Thanks for the explanation. This is a formatted code block, and it does not make sense to support such a thing in Turndown.

yuri2peter commented 1 month ago

You're right, maybe this should be the job of editor developers.