Closed by talvera98 3 years ago
Hi @talvera98,
looks like the encoding of your text files is still broken. Is the original source of the text files publicly available?
This issue reported by @talvera98 is real. I also tried several texts and got the error above. @severinsimmler @mromanello @reckart @thvitt
Please make sure your text files are UTF-8 encoded, @tobimichigan.
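One way to check this before uploading is to try decoding the raw bytes of each file as UTF-8. The snippet below is only a minimal sketch in plain Python; the helper name `is_valid_utf8` is made up for illustration and is not part of the Topics Explorer.

```python
from pathlib import Path


def is_valid_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8 (hypothetical helper)."""
    data = Path(path).read_bytes()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # At least one byte sequence is not valid UTF-8,
        # e.g. an umlaut saved in a Windows code page.
        return False
    return True
```

A file that passes this check can still contain wrong characters if it was mis-converted earlier, so it is worth opening the file and looking at the umlauts as well.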
@severinsimmler, I later ran `python topicsexplorer.py --browser` with pipenv and it ran successfully. So in my case the problem wasn't the UTF-8 encoding.
Hello,
I'm working on a project where we want to use digital methods to analyse a corpus of German texts.
Unfortunately, I cannot upload the corpus to Dariah. If I upload any of the prepared corpora available on the TextGrid Repository, there is no problem; the program runs normally and shows the results. So it has to be a problem with my corpus, I guess. But the corpus is a plain text file, and on Dariah it says it's possible to work with any plain text file. A professor at my university suggested changing the encoding of the text file from Windows to Unicode UTF-8. I tried that but it didn't change anything; the Topics Explorer still doesn't accept the file.
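For reference, the conversion described above can also be done with a few lines of Python rather than a text editor. This is only a sketch: it assumes the "Windows" encoding is Windows-1252 (`cp1252`), which is common for German text but is not confirmed here, and the function name `convert_to_utf8` is hypothetical.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="cp1252"):
    """Re-encode a text file to UTF-8 (hypothetical helper).

    source_encoding is an assumption; cp1252 is a guess for "Windows" encoding.
    """
    # Read raw bytes and decode them with the assumed source encoding.
    text = Path(src).read_bytes().decode(source_encoding)
    # Write the same text back out as UTF-8 bytes.
    Path(dst).write_bytes(text.encode("utf-8"))
```

Writing to a separate output path keeps the original file untouched, so the conversion can be repeated with a different `source_encoding` if the umlauts come out wrong.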
I attach a screenshot of the error message and the two text files I tried it with (in Windows encoding).
Does anybody have an idea what the reason for the problem is? I'd be very grateful if you could take a look at my files, maybe even run them through the Topics Explorer, and suggest what change I have to make. Thanks a lot!
Korpus Ev. Texte Kirche in 1Live 2019.txt
Korpus Kath. Texte Kirche in 1Live 2019.txt