seb-m / pyinotify

Monitoring filesystems events with inotify on Linux.
http://github.com/seb-m/pyinotify/wiki
MIT License

Handle byte sequences as paths too #183

Open StyXman opened 4 years ago

StyXman commented 4 years ago

I know this is going to be closed as EWONTFIX, but I have to try.

I work with legacy software and migrations. I know that Unicode/UTF-8 has been the default for ages, but Latin-1 still lurks in the dark guts of old, don't-break institutions (banks, government, etc). There are gazillions of misencoded or never-converted filenames, so trying to do pathname = pathname.encode(sys.getfilesystemencoding()) here leads to things like:

UnicodeEncodeError: 'utf-8' codec can't encode character '\udce1' in position 32: surrogates not allowed

Again, this is in no way your fault; pyinotify is just not handling someone else's error (though, I have to point out, such a name is still a valid path).
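For readers who have not hit this before, here is a minimal sketch of how such an error can arise. It assumes a Latin-1 encoded filename read on a system whose filesystem encoding is UTF-8; the path is made up for illustration, and the codec is spelled out explicitly so the result does not depend on the local locale.

```python
# Hypothetical Latin-1 filename: byte 0xE1 is 'á' in Latin-1,
# but it is not a valid UTF-8 sequence on its own.
raw = b"/srv/legacy/p\xe1gina.txt"

# Python decodes undecodable bytes in paths with the "surrogateescape"
# error handler, which maps 0xE1 to the lone surrogate U+DCE1.
as_str = raw.decode("utf-8", "surrogateescape")

# A strict encode (the default error handler) rejects lone surrogates.
error = None
try:
    as_str.encode("utf-8")
except UnicodeEncodeError as exc:
    error = exc

# Encoding with the same handler restores the original bytes.
round_trip = as_str.encode("utf-8", "surrogateescape")
```

The str form carries the original byte as a surrogate, so only an encode that uses the same error handler can turn it back into the bytes the kernel knows the file by.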

So I'm here to ask you for a non-default mode where pyinotify accepts bytes instead of str and passes them to the underlying libs untouched. In fact, for the moment I will have to fork and 'fix' this for myself, so I should be in a position to provide a patch.
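One possible shape for that behaviour, as a rough sketch only: the helper name to_fs_bytes is hypothetical and is not part of pyinotify, and the example path is made up. It leaves bytes alone and falls back to os.fsencode for str, which on POSIX uses the filesystem encoding with the surrogateescape handler.

```python
import os


def to_fs_bytes(path):
    """Hypothetical helper (not pyinotify API): return path as bytes.

    bytes are passed through untouched; str is encoded with
    os.fsencode, which round-trips surrogate-escaped names on POSIX.
    """
    if isinstance(path, bytes):
        return path
    return os.fsencode(path)


# Made-up Latin-1 filename used only for illustration.
raw = b"/srv/legacy/p\xe1gina.txt"

from_bytes = to_fs_bytes(raw)              # same object, unchanged
from_str = to_fs_bytes(os.fsdecode(raw))   # str with a surrogate, encoded back
plain = to_fs_bytes("plain.txt")           # ordinary ASCII str
```

Whether a real patch would take this route or add an explicit opt-in flag is an open design question; the sketch only shows that both input types can end up as the same bytes.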