Open felixbur opened 1 year ago
At the moment we test only for older Python versions:
Maybe we should also update this.
This problem is expected to occur if you pass arguments to openSMILE with non-ASCII characters, e.g. file paths containing special characters. openSMILE does not have official support for UTF-8 in config files, at least we have not tested what happens when you use UTF-8. If you're lucky, it might work on some platforms out-of-the-box but this is something we cannot guarantee. So the recommendation from my side would be to ensure there are no special characters in file paths and openSMILE options. And in config files, there should never be the need for special characters anyway.
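One way to follow this recommendation is to check the options before handing them to openSMILE. The following is a minimal sketch, assuming the options arrive as a plain dict; the helper name `find_non_ascii_options` and the example option names are hypothetical and not part of the opensmile package:

```python
def find_non_ascii_options(options):
    """Return the (key, value) pairs whose string form contains non-ASCII characters."""
    offending = []
    for key, value in options.items():
        # str.isascii() is available from Python 3.7 onwards.
        if not (str(key).isascii() and str(value).isascii()):
            offending.append((key, value))
    return offending


# Hypothetical option names and paths, for illustration only.
options = {"inputfile": "/data/m\u00fcller.wav", "outputfile": "/tmp/out.csv"}
offending = find_non_ascii_options(options)
```

Here `offending` would contain only the `inputfile` entry, which could then be renamed or reported before openSMILE is called.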
openSMILE does not have official support for UTF-8 in config files, at least we have not tested what happens when you use UTF-8
Should we still switch to UTF-8 encoding in SMILEapi.py, as it seems to solve the issue at least in some cases?
I wouldn't, because issues caused by it will be harder to debug when the error occurs at a different point with a possibly unrelated error message. In the best case, you would get an error that a file couldn't be found, and you might figure out that it's due to special characters in the path.
If you're lucky, it might work on some platforms out-of-the-box but this is something we cannot guarantee
This sounds not like a nice behavior to me, so maybe we should just raise an error if an argument (whatever argument means here ;) ) contains non-ASCII?
Or maybe fix openSMILE to support non-ASCII. We are in the year 2022 :)
This sounds not like a nice behavior to me, so maybe we should just raise an error if an argument (whatever argument means here ;) ) contains non-ASCII?
Yes, this is what happens at the moment. Any input with non-ASCII characters will throw the exception reported by @felixbur. What I meant was that if we just change "ascii" to "utf-8", you might be lucky and it will work, but we don't know for sure without checking the code.
Ah, ok, but then we should update the error message. At least I'm not able to understand what is going wrong when seeing:
map(lambda v: bytes(str(v), "ascii"), sum(options.items(), ())))
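A clearer message could be produced by wrapping the encoding step. The following is only a sketch: `encode_options` is a hypothetical helper, not the actual code in SMILEapi.py, and it reuses the flattening expression from the line quoted above.

```python
def encode_options(options):
    """Encode option keys and values as ASCII bytes, failing with a readable message."""
    encoded = []
    # Same flattening as in the quoted line: (key1, value1, key2, value2, ...)
    for item in sum(options.items(), ()):
        text = str(item)
        try:
            encoded.append(bytes(text, "ascii"))
        except UnicodeEncodeError as err:
            bad_char = text[err.start]
            raise ValueError(
                f"openSMILE argument {text!r} contains the non-ASCII character "
                f"{bad_char!r} at position {err.start}; only ASCII is supported."
            ) from err
    return encoded
```

Raising `ValueError` with the offending argument in the message is a design choice for this sketch; the original `UnicodeEncodeError` stays attached as the cause, so no debugging information is lost.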
Getting an lib/python3.9/site-packages/opensmile/core/SMILEapi.py", line 237, in
map(lambda v: bytes(str(v), "ascii"), sum(options.items(), ())))
UnicodeEncodeError: 'ascii' codec can't encode character '\u0308' in position 58: ordinal not in range(128)
This happened with Python 3.9 under macOS.
To solve the error, specify the correct encoding, e.g. utf-8
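For context, '\u0308' is the combining diaeresis, which appears when an umlaut is stored in decomposed (NFD) form, as often happens with file names on macOS. The sketch below reproduces the error with a made-up path and shows that switching the codec to "utf-8" only makes the Python-side encoding succeed. Whether openSMILE then handles the resulting bytes correctly is not verified here.

```python
import unicodedata

# Made-up path for illustration; NFD splits "ü" into "u" + U+0308.
path = unicodedata.normalize("NFD", "/data/m\u00fcller.wav")

try:
    bytes(path, "ascii")
    ascii_error = None
except UnicodeEncodeError as err:
    # Same exception type as in the report above.
    ascii_error = err

# UTF-8 can represent any Unicode string, so this step succeeds in Python.
utf8_bytes = bytes(path, "utf-8")
```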