strohne / Facepager

Facepager was made for fetching publicly available data from YouTube, Twitter and other websites on the basis of APIs and web scraping.
https://github.com/strohne/Facepager/releases

Wrong text encoding #209

Closed: lukasini closed this issue 9 months ago

lukasini commented 9 months ago

I'm facing an issue with the encoding of characters such as š and č when scraping Slovak pages. I guess this is a limitation of SQLite, which supports only ISO-8859-1. Exporting to UTF-8 changes nothing either. Do you have a workaround or a better way to fix it?

Within the application it looks fine, but inside the database it is encoded wrongly.

strohne commented 9 months ago

Hi lukasini, I guess the encoding is correct but you need to handle it in your subsequent workflows. How do you process the data? When you export a CSV file, does it come out with the right encoding?
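To illustrate the point about the encoding being correct in the database: SQLite stores text as UTF-8 by default, so characters like š and č survive a round trip. The garbled look usually comes from a viewer that reads those UTF-8 bytes as ISO-8859-1. The snippet below is only a minimal sketch with Python's built-in `sqlite3` module. The table `pages` and its column `body` are hypothetical and are not Facepager's actual schema.

```python
import sqlite3

# In-memory database with a hypothetical table; not Facepager's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, body TEXT)")

original = "šťastný človek"
conn.execute("INSERT INTO pages (body) VALUES (?)", (original,))

# sqlite3 returns TEXT columns as Python str, decoded from the stored UTF-8.
(stored,) = conn.execute("SELECT body FROM pages").fetchone()

# CAST to BLOB exposes the raw bytes that SQLite holds for the value.
(raw,) = conn.execute("SELECT CAST(body AS BLOB) FROM pages").fetchone()

# What a tool shows if it misreads those UTF-8 bytes as ISO-8859-1.
misread = raw.decode("latin-1")

conn.close()
```

If `stored` matches `original` while `misread` looks broken, the data in the database is fine and the tool used to inspect it is most likely assuming the wrong encoding.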

For R, we have an experimental package that loads the database, maybe it helps you? See https://github.com/strohne/datavana/tree/main/facepager

strohne commented 9 months ago

I suggest you use the export button of Facepager, which handles the encoding :)
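For anyone processing an exported CSV in Python afterwards, it may help to name the encoding explicitly when opening the file. The sketch below is not Facepager's export code. The semicolon delimiter, the byte order mark and the file name `export.csv` are assumptions made for the example, so check them against your own file.

```python
import csv
import os
import tempfile

# Sample rows standing in for an exported file (hypothetical content).
rows = [["id", "title"], ["1", "Košice"], ["2", "Čadca"]]

path = os.path.join(tempfile.mkdtemp(), "export.csv")

# "utf-8-sig" writes a byte order mark so spreadsheet tools can detect UTF-8.
with open(path, "w", encoding="utf-8-sig", newline="") as f:
    csv.writer(f, delimiter=";").writerows(rows)

# Reading with "utf-8-sig" strips the mark if present and also
# accepts UTF-8 files that do not have one.
with open(path, encoding="utf-8-sig", newline="") as f:
    loaded = list(csv.reader(f, delimiter=";"))
```

Opening the file without an `encoding` argument falls back to the platform default, which on some systems is not UTF-8 and can reproduce the same garbled characters.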

lukasini commented 9 months ago

Yes, correct: direct export to CSV works. Thank you for the support.