JohannesKaufmann / html-to-markdown

⚙️ Convert HTML to Markdown. Even works with entire websites and can be extended through rules.
MIT License

🐛 Bug Consecutive <span> missing spaces #78

Closed ilovesusu closed 2 months ago

ilovesusu commented 1 year ago

Describe the bug: Spaces between consecutive <span> elements go missing, for example:

import"fmt"
fortrue

missing spaces!!!!

HTML Input

<div class="example_code">
<span style="color: #b1b100; font-weight: bold;">package</span> main<br>
<br>
<span style="color: #b1b100; font-weight: bold;">import</span> <span style="color: #cc66cc;">"fmt"</span><br>
<br>
<span style="color: #993333;">func</span> main<span style="color: #339933;">()</span> <span style="color: #339933;">{</span><br>
&nbsp; &nbsp; <span style="color: #b1b100; font-weight: bold;">for</span> <span style="color: #000000; font-weight: bold;">true</span> &nbsp;<span style="color: #339933;">{</span><br>
&nbsp; &nbsp; &nbsp; &nbsp; fmt<span style="color: #339933;">.</span>Printf<span style="color: #339933;">(</span><span style="color: #cc66cc;">"xxxxx。<span style="color: #000099; font-weight: bold;">\n</span>"</span><span style="color: #339933;">);</span><br>
&nbsp; &nbsp; <span style="color: #339933;">}</span><br>
<span style="color: #339933;">}</span><br>
</div>

Generated Markdown

package main

import"fmt"

func main(){

fortrue{

        fmt.Printf("xxxxx。\n");

}

}
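
The output above looks like what you would get if the whitespace-only text between two inline elements were trimmed away before the pieces are joined. The Go sketch below illustrates that effect on a simplified model. It is an assumption for illustration only, not the library's actual code: the `node` type and both helper functions are hypothetical, and no real HTML parsing happens here.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical, simplified stand-in for one parsed inline piece:
// either the text inside a <span> or the raw text between two spans.
type node struct {
	text string
}

// joinTrimmed mimics the reported behaviour: every piece is trimmed on its
// own, so a whitespace-only piece between two spans disappears entirely.
func joinTrimmed(nodes []node) string {
	var b strings.Builder
	for _, n := range nodes {
		b.WriteString(strings.TrimSpace(n.text))
	}
	return b.String()
}

// joinCollapsed treats inline whitespace as significant: the pieces are
// concatenated first, then each run of whitespace collapses to one space
// (leading and trailing whitespace of the whole line is dropped).
func joinCollapsed(nodes []node) string {
	var b strings.Builder
	for _, n := range nodes {
		b.WriteString(n.text)
	}
	return strings.Join(strings.Fields(b.String()), " ")
}

func main() {
	// Models: <span>import</span> <span>"fmt"</span>
	line := []node{{"import"}, {" "}, {`"fmt"`}}

	fmt.Println(joinTrimmed(line))   // import"fmt"
	fmt.Println(joinCollapsed(line)) // import "fmt"
}
```

Note that collapsing whitespace this way would still lose the `&nbsp;` indentation from the input, so a real fix has to handle more than this sketch covers.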

Expected Markdown

package main

import "fmt"

func main(){

        for true{

                fmt.Printf("xxxxx。\n");

        }
}

JohannesKaufmann commented 1 year ago

Thanks for reporting this bug! Unfortunately, that's a difficult one to fix...

JohannesKaufmann commented 1 year ago

Thanks again for reporting this bug!

It turns out that V2 of this library (which I am currently working on) also had this problem. But after another major rewrite of V2, it finally works:

package main

import "fmt"

func main() {
    for true  {
        fmt.Printf("xxxxx。\\n");
    }
}

Thanks for letting me know! It was really helpful to take this case into consideration.

V2 is not ready to be published yet, though. That is going to take more time, so I am keeping this issue open until then.

JohannesKaufmann commented 2 months ago

The "v2" branch contains a lot of improvements, including a fix for this bug.

It is still experimental but feel free to give it a try. Happy to hear about your experience 😊

I am going to close this issue. If you find anything with the new version, please open a new issue!