YGGverse / YGGo

YGGo! Distributed Web Search Engine
MIT License

UTF-8 normalization #11

Open d47081 opened 1 year ago

d47081 commented 1 year ago

Some pages use an unsupported character set, which causes a DB error in the crawler.

This has been temporarily fixed, but it still needs a proper solution (one that does not change the DB character set).

I found a useful library that could solve this problem: https://github.com/neitanod/forceutf8
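To make the idea concrete, here is a rough sketch of what "normalize to UTF-8 before the DB write" could look like. It is written in Python purely for illustration and is not the forceutf8 API. The function name `to_utf8`, the `declared` parameter (the charset a page claims for itself), and the Windows-1252 last-resort fallback are all assumptions of mine, not something taken from the crawler.

```python
from typing import Optional


def to_utf8(raw: bytes, declared: Optional[str] = None) -> str:
    """Decode raw page bytes into text that can always be stored as UTF-8.

    Hypothetical helper: tries strict UTF-8 first, then the charset the page
    declares (if any), then falls back to Windows-1252 with replacement.
    """
    candidates = ["utf-8"]
    if declared:
        candidates.append(declared)

    for name in candidates:
        try:
            return raw.decode(name)
        except (UnicodeDecodeError, LookupError):
            # UnicodeDecodeError: the bytes are not valid in this charset.
            # LookupError: the declared charset name is unknown to Python.
            continue

    # Assumed last resort: bytes undefined in Windows-1252 become U+FFFD
    # instead of raising, so this call never fails.
    return raw.decode("cp1252", errors="replace")


if __name__ == "__main__":
    page = "привет".encode("koi8-r")
    text = to_utf8(page, declared="koi8-r")
    # The result is an ordinary str, so encoding it as UTF-8 is always safe.
    print(text.encode("utf-8"))
```

The point of the fallback chain is that the crawler would never hand the DB bytes in an unknown encoding: anything that cannot be decoded cleanly is degraded to replacement characters rather than causing an insert error.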