audeering / opensmile-python

Python package for openSMILE
https://audeering.github.io/opensmile-python/

UnicodeEncodeError #63

Open felixbur opened 1 year ago

felixbur commented 1 year ago

Getting a UnicodeEncodeError:

lib/python3.9/site-packages/opensmile/core/SMILEapi.py", line 237, in
map(lambda v: bytes(str(v), "ascii"), sum(options.items(), ())))
UnicodeEncodeError: 'ascii' codec can't encode character '\u0308' in position 58: ordinal not in range(128)

This happened with Python 3.9 on macOS.

To solve the error, specify the correct encoding, e.g. utf-8
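For context, '\u0308' is the combining diaeresis. It typically shows up when a character such as "ö" is stored in decomposed (NFD) form, which file names on macOS commonly are. The following is a minimal sketch that reproduces the failing conversion outside of opensmile. The option name `inputfile` and the path are made up for illustration; only the `bytes(str(v), "ascii")` expression is taken from the traceback.

```python
import unicodedata

# Hypothetical option value: a made-up path in decomposed (NFD) form,
# so "ö" becomes "o" followed by the combining diaeresis '\u0308'.
path = unicodedata.normalize("NFD", "/data/hörbuch.wav")
options = {"inputfile": path}  # option name is an assumption

# Same expression as in the traceback: flatten the dict into a tuple
# of names and values, then encode every item as ASCII.
flat = sum(options.items(), ())
error = None
try:
    encoded = list(map(lambda v: bytes(str(v), "ascii"), flat))
except UnicodeEncodeError as err:
    error = err
    print(err)

# Encoding as UTF-8 does not raise. Whether openSMILE then handles
# the resulting bytes correctly is a separate question.
encoded_utf8 = [bytes(str(v), "utf-8") for v in flat]
```

Note that normalizing the path to NFC would not avoid the exception, because the composed "ö" is still outside the ASCII range.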

hagenw commented 1 year ago

At the moment we only test against older Python versions: [image]

Maybe we should also update this.

chausner-audeering commented 1 year ago

This problem is expected to occur if you pass arguments to openSMILE with non-ASCII characters, e.g. file paths containing special characters. openSMILE does not have official support for UTF-8 in config files, at least we have not tested what happens when you use UTF-8. If you're lucky, it might work on some platforms out-of-the-box but this is something we cannot guarantee. So the recommendation from my side would be to ensure there are no special characters in file paths and openSMILE options. And in config files, there should never be the need for special characters anyway.
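On the user side, this recommendation could be checked before anything is handed to openSMILE. A minimal sketch, assuming the options are available as a plain dictionary; `find_non_ascii` is a hypothetical helper and not part of opensmile, and the option names and paths are made up.

```python
def find_non_ascii(options):
    """Return {name: value} for every option whose name or value
    contains at least one non-ASCII character."""
    return {
        str(name): str(value)
        for name, value in options.items()
        if not (str(name).isascii() and str(value).isascii())
    }


# Hypothetical options; the second path contains a combining diaeresis.
options = {
    "inputfile": "/data/speech.wav",
    "csvoutput": "/data/Ergebnis\u0308.csv",
}
offending = find_non_ascii(options)
for name, value in offending.items():
    print(f"Option {name!r} has non-ASCII characters: {value!r}")
```

`str.isascii()` requires Python 3.7 or later.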

frankenjoe commented 1 year ago

> openSMILE does not have official support for UTF-8 in config files, at least we have not tested what happens when you use UTF-8

Should we still switch to UTF-8 encoding in SMILEapi.py, as it seems to solve the issue at least in some cases?

chausner-audeering commented 1 year ago

I wouldn't, because issues caused by it will be harder to debug when the error surfaces at another point with a possibly unrelated error message. In the best case, you would get an error that a file couldn't be found, and you might figure out that it's due to special characters in the path.

hagenw commented 1 year ago

> If you're lucky, it might work on some platforms out-of-the-box but this is something we cannot guarantee

This does not sound like nice behavior to me, so maybe we should just raise an error if an argument (whatever argument means here ;) ) contains non-ASCII?

frankenjoe commented 1 year ago

Or maybe fix openSMILE to support non-ASCII. We are in the year 2022 :)

chausner-audeering commented 1 year ago

> This does not sound like nice behavior to me, so maybe we should just raise an error if an argument (whatever argument means here ;) ) contains non-ASCII?

Yes, this is what happens at the moment. Any input with non-ASCII characters will throw the exception reported by @felixbur. I was referring to what happens if we just change "ascii" to "utf-8": then you might be lucky and it will work, but we don't know for sure without checking the code.

hagenw commented 1 year ago

> > This does not sound like nice behavior to me, so maybe we should just raise an error if an argument (whatever argument means here ;) ) contains non-ASCII?
>
> Yes, this is what happens at the moment. Any input with non-ASCII characters will throw the exception reported by @felixbur. I was referring to what happens if we just change "ascii" to "utf-8": then you might be lucky and it will work, but we don't know for sure without checking the code.

Ah, ok, but then we should update the error message. At least I'm not able to understand what is going wrong when seeing:

map(lambda v: bytes(str(v), "ascii"), sum(options.items(), ())))
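One way a more readable message could look is sketched below. This is not the actual code in SMILEapi.py; `encode_options` is a hypothetical helper that keeps the same ASCII encoding and the same name/value ordering as the expression above, but names the offending option and character when the conversion fails.

```python
def encode_options(options):
    """Encode option names and values as ASCII bytes.

    Raises a ValueError that names the offending option instead of
    surfacing the bare UnicodeEncodeError.
    """
    encoded = []
    for name, value in options.items():
        for item in (name, value):
            try:
                encoded.append(bytes(str(item), "ascii"))
            except UnicodeEncodeError as err:
                # err.object is the string being encoded,
                # err.start the index of the first bad character.
                bad_char = err.object[err.start]
                raise ValueError(
                    f"Option '{name}' contains the non-ASCII character "
                    f"{bad_char!r} in {str(item)!r}. Non-ASCII characters "
                    "in options and file paths are not officially supported."
                ) from err
    return encoded
```

Raising with `from err` keeps the original UnicodeEncodeError in the traceback, so no debugging information is lost.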