website-scraper / website-scraper-puppeteer

Plugin for website-scraper which returns html for dynamic websites using puppeteer
MIT License
324 stars, 80 forks

fix unicode error #103

Open azlarsin opened 6 months ago

azlarsin commented 6 months ago

Source: https://github.com/website-scraper/node-website-scraper?tab=readme-ov-file#afterresponse.

The afterResponse section of the README lists "a binary string" as one possible return value, with this warning: "This is advised against because of the binary assumption being made can foul up saving of utf8 responses to the filesystem."
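To illustrate why the binary assumption corrupts text like the test page's, here is a minimal Node.js sketch. It is not the code from this change. The helper names (`bytesUnderBinaryAssumption`, `bytesAsUtf8`, `toAfterResponseResult`) are hypothetical, and the `{ body, encoding: 'utf8' }` return shape is an assumption based on the README section linked above. Only `Buffer` is a real Node.js API here.

```javascript
// Sample text taken from the "After" output of the test page.
const original = '传感器';

// Hypothetical helper: what happens when a string is written out as "binary".
// Node's 'binary' (latin1) encoding keeps only the low byte of each UTF-16
// code unit, so any character above U+00FF loses information.
function bytesUnderBinaryAssumption(text) {
  return Buffer.from(text, 'binary');
}

// Hypothetical helper: the same string written out as UTF-8.
function bytesAsUtf8(text) {
  return Buffer.from(text, 'utf8');
}

const brokenRoundTrip = bytesUnderBinaryAssumption(original).toString('utf8');
const goodRoundTrip = bytesAsUtf8(original).toString('utf8');

console.log(goodRoundTrip === original);   // true
console.log(brokenRoundTrip === original); // false

// Hypothetical sketch of an afterResponse-style result that states the
// encoding explicitly instead of relying on the binary default (assumed shape).
function toAfterResponseResult(html) {
  return { body: html, encoding: 'utf8' };
}

console.log(toAfterResponseResult('<h3>' + original + '</h3>').encoding); // utf8
```

The point of the sketch is that the damage happens at the moment a string is converted to bytes: once the high bytes are dropped, decoding as UTF-8 later cannot restore the original characters.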

Test page: http://www.csxykj.com/mobile/index.html

Before: <h3>磁致伸缩位移�&nbsp;感器</h3>, <h5>影响大跨度桥梁施工控制的�&nbsp;�&nbsp;</h5>.

After: <h3>磁致伸缩位移传感器</h3>, <h5>影响大跨度桥梁施工控制的因素</h5>.