dharple / detox

Tames problematic filenames
BSD 3-Clause "New" or "Revised" License

unsupported unicode length #108

Open josh-aliencode opened 11 months ago

josh-aliencode commented 11 months ago

When I run detox -rv * on my files, many of them don't get renamed, and detox prints the error: unsupported unicode length. I assume this is because the character length is too long. Is there any way to force the rename or bypass the error? Or at least display the conflicting file?

dharple commented 5 months ago

What version of detox are you using? You can get it quickly by running:

detox -V

dharple commented 5 months ago

Actually, it's possible that the filenames aren't in UTF-8 at all. You can try detox -n -s iso8859_1 FILE or, if you're using detox 2, detox -n -s iso8859_1-legacy FILE, to see if either yields better results.

josh-aliencode commented 5 months ago

I figured out the cause: some of the names contain an apostrophe. Will that detox command fix those without my having to rename them manually?

dharple commented 5 months ago

It really depends on what the underlying bytes of the apostrophe are. If it's just a normal apostrophe, it shouldn't cause the error you're seeing. If it's a CP-1252 or upper ISO-8859-n apostrophe, then you might get the error you were describing.
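To illustrate why the underlying bytes matter, here is a minimal Python sketch (not part of detox) comparing three byte sequences that can all display as an apostrophe. The labels and the `is_valid_utf8` helper are made up for this example, and the sketch only checks UTF-8 validity; it does not reproduce detox's own decoding logic.

```python
# Three byte sequences that can all render as an apostrophe-like glyph.
candidates = {
    "ASCII apostrophe (U+0027)": b"\x27",
    "UTF-8 right single quote (U+2019)": "\u2019".encode("utf-8"),
    "CP-1252 right single quote": "\u2019".encode("cp1252"),
}


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


for label, raw in candidates.items():
    print(f"{label}: bytes={raw.hex()} valid_utf8={is_valid_utf8(raw)}")
```

The CP-1252 form is the single byte 0x92, which is not valid UTF-8 on its own. A filename containing it would need a different input sequence, such as the iso8859_1 ones suggested earlier in the thread.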

The detox -n ... commands above will do a dry run, so you can see what would happen without it actually changing anything.

One other option to help solve this is to use hexdump -C to see what the actual bytes are.
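For the original question about displaying the conflicting files, one possible approach outside of detox is a short Python script that walks a directory tree and reports every name whose bytes are not valid UTF-8. This is only a sketch under the assumption that non-UTF-8 names are the cause; `find_non_utf8_names` is a hypothetical helper, not something detox provides.

```python
import os


def find_non_utf8_names(root: bytes = b"."):
    """Yield (directory, name) byte pairs for entries whose names are not valid UTF-8."""
    # Passing a bytes path makes os.walk return raw bytes names,
    # so the check sees exactly what is stored on disk.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                name.decode("utf-8")
            except UnicodeDecodeError:
                yield dirpath, name


def report(root: bytes = b".") -> None:
    """Print each offending name with its hex bytes, similar in spirit to hexdump."""
    for dirpath, name in find_non_utf8_names(root):
        print(repr(os.path.join(dirpath, name)), name.hex())
```

Calling `report(b".")` from the directory where detox was run lists the candidates; the hex output can then be compared against what hexdump -C shows for the same name.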