Closed joelpurra closed 1 month ago
Hello, I can't reproduce this on my machine, but I think you have ASCII as filesystem encoding, right? Can you check with sys.getfilesystemencoding()? Also there should be "trying to set locale to" and "global locale set to" logs on startup, what are these?
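A minimal sketch of the check being asked for, using only standard-library calls. The helper name encoding_report is hypothetical, and the script only tells you something useful if it runs in the same interpreter and environment as the addon (Kodi's embedded Python), which is an assumption this snippet cannot guarantee.

```python
import locale
import sys


def encoding_report():
    """Collect the encoding settings Python uses for filenames and text."""
    return {
        # Encoding used to convert str filenames to bytes for the OS.
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Error handler paired with the filesystem encoding.
        "filesystem_errors": sys.getfilesystemencodeerrors(),
        # Encoding open() uses for file contents when none is given.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Calling setlocale() without a locale argument only queries
        # the current LC_CTYPE setting; it does not change it.
        "lc_ctype": locale.setlocale(locale.LC_CTYPE),
    }


if __name__ == "__main__":
    for key, value in encoding_report().items():
        print(f"{key}: {value!r}")
```

If filesystem_encoding comes back as ascii, a filename containing å or ä would be expected to fail when passed to open() as str.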
@berkanteber: requested output:
apt, 3:20.5-4~bookworm (installed from https://archive.raspberrypi.org/debian/)
In a quick test against a single video file hosted by putio, all I can confirm is that the set of subtitles returned from the API has changed.
U+00E4 0xE4 ä ä %C3 %A4.
iconv example.
At this time I cannot tell whether I have a good or "bad" test video file. If you provide a set of test files with known-good Unicode subtitle filenames I can check on my side, but I suspect this is unnecessary.
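For reference, the ä values quoted in this comment (U+00E4, 0xE4, %C3 %A4) can be reproduced with a short sketch. Reading 0xE4 as the single-byte Latin-1 form and %C3 %A4 as the percent-encoded UTF-8 form is an assumption about what the original table columns meant; the helper name describe is hypothetical.

```python
from urllib.parse import quote


def describe(char):
    """Show one character in the representations quoted in the thread."""
    return {
        # Unicode code point, e.g. U+00E4.
        "code_point": f"U+{ord(char):04X}",
        # Single byte in Latin-1 (ISO 8859-1), as hex.
        "latin1_hex": char.encode("latin-1").hex(),
        # Two bytes in UTF-8, as hex.
        "utf8_hex": char.encode("utf-8").hex(),
        # UTF-8 bytes percent-encoded, as used in URLs.
        "percent_encoded": quote(char),
    }


if __name__ == "__main__":
    print(describe("ä"))
```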
If the issue (the putio addon crashing) isn't reproducible, perhaps it isn't an issue any longer? Might have been that a recent kodi version or a server-side API change (either by putio or upstream) fixed it.
I was hitting this the other day, also with UTF-8. But I'm on Kodi 19.5, so perhaps it's been fixed upstream (Kodi or LibreELEC).
I tried renaming a video and a subtitle (.srt) file in the same folder to some Unicode name. \xe5 from the original report is å, so å.mp4 and å.srt. Also tried with some Turkish characters, which were fine as well. Checked with ä and it is also fine for me. These all also return fine from the API.
Adding some print lines, special_path_translated is str. Not ASCII-encoded, and keeps Unicode characters. As far as I can tell, if str is provided to open(), the filesystem encoding is used to encode the filename. And from the exception, it looks like it tries to encode as ASCII, so I was expecting your answers to be ASCII as well.
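The reasoning above can be illustrated with a small sketch: encoding the filename with UTF-8 succeeds, while encoding it with ASCII raises the kind of UnicodeEncodeError seen in the report. This is a simplification, since open() applies the filesystem encoding internally (with its own error handler) rather than calling str.encode() directly; the helper name try_encode is hypothetical.

```python
def try_encode(name, encoding):
    """Return the encoded filename, or None if the codec cannot represent it."""
    try:
        return name.encode(encoding)
    except UnicodeEncodeError as exc:
        # Report which character the codec rejected.
        bad = name[exc.start:exc.end]
        print(f"{exc.encoding} cannot encode {bad!r} (U+{ord(bad[0]):04X})")
        return None


if __name__ == "__main__":
    print(try_encode("å.srt", "utf-8"))
    print(try_encode("å.srt", "ascii"))
```

With an ASCII filesystem encoding, the second case is what a str path containing å would run into.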
Here in docs, it says:
Today Python is converging on using UTF-8: Python on MacOS has used UTF-8 for several versions, and Python 3.6 switched to using UTF-8 on Windows as well. On Unix systems, there will only be a filesystem encoding if you’ve set the LANG or LC_CTYPE environment variables; if you haven’t, the default encoding is again UTF-8.
If the issue (the putio addon crashing) isn't reproducible, perhaps it isn't an issue any longer? Might have been that a recent kodi version or a server-side API change (either by putio or upstream) fixed it.
I can't say that because I've never reproduced it. If you still experience it (preferably in the latest Kodi) and the PR fixes it, we can merge it.
open() and they could get the same error. But changing all arguments to bytes doesn't sound nice (the docs also say to prefer strings).
There is this issue I found where it looks like they played with locale. But I don't know which versions it affects.
Can you both try with the latest Kodi?
If you don't have a file to reproduce:
- Take any video file and rename it to åä.mkv or åä.mp4.
- Take any subtitle and rename it to åä.srt.
- Put them in the same folder.
And you can test this PR with:
plugin.video.putio.
Upgrading to Kodi 21.1 made my filename work. That also triggered an upgrade from addon 3.0.0 to 3.0.2.
Thanks for looking into it, and sorry I didn't think of trying an upgrade first.
Quoting @berkanteber:
Can you both try with the latest Kodi?
Am using Kodi v20.5.0, which seems to be the most recent for Debian 12 on Raspberry Pi.
If you don't have a file to reproduce:
- Take any video file and rename it to åä.mkv or åä.mp4.
- Take any subtitle and rename it to åä.srt.
- Put them in the same folder.
Unicode characters in the "local" putio subtitle filename work correctly both with and without .encode(). No issues, and in particular no crashes.
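A rough sketch of why both variants can behave the same when the filesystem encoding is UTF-8: open() accepts a str path and an already-encoded bytes path alike. The helper name open_both_ways and the sample subtitle text are hypothetical, and os.fsencode() stands in for whatever .encode() call the addon actually makes, which is an assumption.

```python
import os
import tempfile

SAMPLE = "1\n00:00:01,000 --> 00:00:02,000\nhej\n"


def open_both_ways(filename="åä.srt", text=SAMPLE):
    """Write a subtitle file, then read it via a str path and a bytes path."""
    with tempfile.TemporaryDirectory() as folder:
        str_path = os.path.join(folder, filename)
        with open(str_path, "w", encoding="utf-8") as handle:
            handle.write(text)

        # Encode the path with the filesystem encoding, as open() would.
        bytes_path = os.fsencode(str_path)

        with open(str_path, encoding="utf-8") as handle:
            via_str = handle.read()
        with open(bytes_path, encoding="utf-8") as handle:
            via_bytes = handle.read()
        return via_str, via_bytes


if __name__ == "__main__":
    via_str, via_bytes = open_both_ways()
    print(via_str == via_bytes)
```

On a system whose filesystem encoding is ASCII, creating the file through the str path would be expected to fail, which matches the original crash report.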
Edit: there might be some naming issue also in this case. The subtitle file has the correct name on disk, but shows up as "Unknown (External)" in kodi's video player/subtitle selector.
The "åä.*" test above covers the case where the subtitle filename comes from a "local" file in putio.
åä.srt may show up as "unknown (external)" subtitle in kodi's video player.
.srt file (subtitles only provided via the putio api).
Note that I cannot currently test the upstream case, since the putio api doesn't seem to return unicode subtitle filenames for this test file anymore. (The "crashing" unicode subtitle filenames are all gone, which seems like a hack on some api level, so cannot reproduce.) Would still need known-good test files, with known-good "non-local"/"upstream" unicode subtitle filenames for this, which I do not have.
The addon crashing issue has gone away, but so have perhaps some previously available "international" subtitles ;)
There are 3 subtitle sources: OpenSubtitles, folder (test case above), file (subtitle in video file). If local works fine, that means our API handles names well. There may be some changes on OpenSubtitles side as you say, but there aren't any recent changes from our side. (We've migrated to their new API but it has been over a year.)
I'll try to find some of the other test cases. "Unknown" is probably the language; it's unknown for all folder subtitles. I'll check this too.
I think we can close this issue since upgrading solved the original crash.
Sorry for the late reply btw, I hadn't noticed it.
@berkanteber: yes, unicode-related crashes are no longer reproducible in upgraded Kodi. Closing this pull request.
It would be nice to separately confirm and fix potential subtitle issues mentioned here; happy to see that you are already looking into it. Thanks!
It would be nice to separately confirm and fix potential subtitle issues mentioned here; happy to see that you are already looking into it. Thanks!
3.1.0 has just been approved (for Nexus and above). This should improve subtitle language identification.
@berkanteber: found a subtitle (seemingly from opensubtitles) with correct unicode handling. While it's only a single test case, it seems to indicate that unicode support is working correctly also in kodi v20.5.
Fixes a crash for subtitle names with Unicode characters.