Open selmling opened 2 years ago
Hi Steven,
I think I meant the Windows ISO charset at the time (ISO 8859-1, https://nl.wikipedia.org/wiki/ISO_8859-1).
Alternatively, you can check whether `file` is able to determine the codec. If so, you can use `iconv` to convert the file to another codec before processing it. However, if CHAT added the UTF8 header, it is strange that the file is not UTF8. To be honest, I wrote this out of need, and after I was done importing the CHAT files I never looked at it again.
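If `file` and `iconv` are not available, roughly the same check-then-convert step can be sketched in Python. This is only a sketch under assumptions: the names `CANDIDATE_CODECS`, `guess_codec`, and `convert_to_utf8` are made up for illustration and are not part of pympi, and trial decoding is a cruder heuristic than what `file` does. A file that decodes cleanly with a codec is not guaranteed to have been written with it.

```python
from pathlib import Path

# Candidate codecs to try, in order. This list is an assumption; adjust it
# for your corpus. latin-1 must come last: it maps every byte to a character,
# so it never fails and would mask any codec listed after it.
CANDIDATE_CODECS = ("utf-8", "cp1252", "latin-1")


def guess_codec(raw: bytes, candidates=CANDIDATE_CODECS) -> str:
    """Return the first candidate codec that decodes `raw` without error."""
    for codec in candidates:
        try:
            raw.decode(codec)
        except UnicodeDecodeError:
            continue
        return codec
    raise ValueError("none of the candidate codecs could decode the data")


def convert_to_utf8(src, dst) -> str:
    """Re-encode the file at `src` as UTF-8 at `dst`.

    Returns the codec that was guessed for the source file.
    """
    raw = Path(src).read_bytes()
    codec = guess_codec(raw)
    Path(dst).write_bytes(raw.decode(codec).encode("utf-8"))
    return codec
```

If `guess_codec` reports `utf-8` for a file that still fails to import, the encoding is probably not the cause and the problem lies elsewhere.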
I haven't heard of gitter. I'm fine with opening it if there is enthusiasm.
@selmling this might have to do with the Python version you are using. I find this function useful as well, but I needed to port it from Python 2 to Python 3, since it appears to have been written for Python 2, where strings were bytestrings. I would be happy to make a pull request with my solution, @dopefishh.
Thanks, pull requests are always very much welcome.
I'd like to be able to batch convert `.cha` files to `.eaf` format using your wonderful library, pympi. I've used pympi for other purposes with great success, but I'm having trouble getting it to interact with `.cha` files.
When I call the `pympi.Elan.eaf_from_chat` function, it hangs on the line where it checks the utf8 codec and continues. In your documentation, you mention using older codecs for older files -- any help on how to track down the codec if that information isn't readily available? This may help me debug `eaf_from_chat`. The chat files I'm working with do have @UTF8 on line 1.
Also, have you considered opening up a gitter forum for your library? That would be a helpful place for folks to share code, generally easing the learning curve of using pympi, which is a really great tool!
Thank you for your work on this library! -Steven
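For the batch conversion asked about above, a minimal sketch could look like the following. It assumes pympi is installed, that `pympi.Elan.eaf_from_chat` accepts a `codec` keyword, and that the object it returns has a `to_file` method; check these against your installed version. `batch_convert` and `_pympi_convert` are hypothetical helper names, not part of pympi. Note that this only collects files that raise an error; it does not help with a call that hangs.

```python
from pathlib import Path


def _pympi_convert(cha_path, eaf_path, codec):
    # Assumption: eaf_from_chat takes a `codec` keyword and returns an
    # object with a to_file method. Verify against your pympi version.
    import pympi

    eaf = pympi.Elan.eaf_from_chat(str(cha_path), codec=codec)
    eaf.to_file(str(eaf_path))


def batch_convert(src_dir, dst_dir, codec="utf-8", convert=_pympi_convert):
    """Convert every .cha file in src_dir to a .eaf file in dst_dir.

    Returns (converted, failed). `failed` pairs each source path with the
    exception it raised, so one bad file does not stop the whole batch.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    converted, failed = [], []
    for cha in sorted(Path(src_dir).glob("*.cha")):
        target = dst / (cha.stem + ".eaf")
        try:
            convert(cha, target, codec)
        except Exception as exc:
            failed.append((cha, exc))
        else:
            converted.append(target)
    return converted, failed
```

The `convert` parameter is there so the loop can be tried with a stand-in function before pointing it at real data.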