StractOrg / stract

web search done right
https://stract.com
GNU Affero General Public License v3.0

Display issue with non-Unicode encoded results #137

Closed: nclm closed this issue 4 months ago

nclm commented 4 months ago

Noticed that the search preview of page extracts that are not Unicode-encoded (at least one that is in windows-1252) doesn’t display right:

[Image: Screenshot 2024-02-09 at 10 43 50]

There might be a way to detect the page encoding and get the preview right?

mikkeldenker commented 4 months ago

Thanks for the report! It was indeed a mistake in the way we decoded non-UTF-8 encoded strings. It should be fixed now, but it won't be visible in production until we rebuild the index.
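
For readers wondering what "detect the page encoding" could look like, here is a minimal sketch in Python. It is not Stract's actual fix (the thread does not show it), and the function name `decode_page`, the 1024-byte sniffing window, and the fallback order are all assumptions made for illustration. The idea is to try the charset declared in the `Content-Type` header, then one declared in a `<meta>` tag, then UTF-8, and finally fall back to windows-1252.

```python
import re
from typing import Optional

# Hypothetical helper pattern: finds charset=... inside a <meta> tag.
_META_CHARSET = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([\w-]+)""", re.IGNORECASE
)


def decode_page(body: bytes, content_type: Optional[str] = None) -> str:
    """Decode raw page bytes using declared charsets, then fallbacks.

    Illustrative only: the candidate order below is an assumption,
    not the behaviour of any particular crawler.
    """
    candidates = []

    # 1. Charset from the HTTP Content-Type header, if present.
    if content_type:
        m = re.search(r'charset\s*=\s*"?([\w-]+)', content_type, re.IGNORECASE)
        if m:
            candidates.append(m.group(1))

    # 2. Charset from a <meta> tag near the start of the document.
    m = _META_CHARSET.search(body[:1024])
    if m:
        candidates.append(m.group(1).decode("ascii"))

    # 3. UTF-8 as the default guess.
    candidates.append("utf-8")

    for name in candidates:
        try:
            return body.decode(name)
        except (LookupError, UnicodeDecodeError):
            # Unknown codec name or bytes invalid for it: try the next one.
            continue

    # 4. Last resort: windows-1252. A few byte values are undefined in
    #    Python's codec, so replace them rather than raise.
    return body.decode("windows-1252", errors="replace")


raw = "café".encode("windows-1252")          # b"caf\xe9"
broken = raw.decode("utf-8", errors="replace")  # what a naive decode yields
fixed = decode_page(raw)
```

Here `broken` contains the replacement character in place of `é`, which is the kind of garbling a preview shows when windows-1252 bytes are read as UTF-8, while `fixed` recovers the original text.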