dharple / detox

Tames problematic filenames
BSD 3-Clause "New" or "Revised" License

How can I delete certain characters instead of replacing them? #87

Closed (dave-kennedy closed 2 years ago)

dave-kennedy commented 3 years ago

If I want to just remove certain characters, how would I go about that? The included tables only map characters to other characters.

delphym commented 3 years ago

I believe the answer for this is well described and explained in HACKING-v1.md.

dave-kennedy commented 3 years ago

Not exactly, but I eventually realized the replacement can be an empty string. Now I have this:

#
# Chars to translate to ' - '
#

0x2f        ' - '   # /
0x3a        ' - '   # :
0x3b        ' - '   # ;
0x28        ' - '   # (
0x3c        ' - '   # <
0x5b        ' - '   # [
0x7b        ' - '   # {

#
# Chars to delete
#

0x29        ''  # )
0x3e        ''  # >
0x5d        ''  # ]
0x7d        ''  # }

This changes "Aladdin Sane (1913–1938–197?)" to "Aladdin Sane - 1913-1938-197_". It's a bit odd that the trailing underscore is left, since I have remove_trailing in my config, but I suppose that's a separate issue.

Edit: the same trick doesn't seem to work with default. For example, if I set it to an empty string, I end up with Aladdin Sane - 1913-1938-197defaultdefault. Is that expected?
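For readers who want to see the empty-replacement idea in isolation, here is a minimal Python sketch of a table that maps some characters to ' - ' and deletes others by mapping them to an empty string. It is only an illustration of the concept, not detox's implementation: the names TABLE and apply_table are hypothetical, and the sketch applies just the seven-plus-four entries from the table above, with none of detox's other filters or its default handling.

```python
# Hypothetical model of the custom table above: code point -> replacement.
# An empty string as the replacement deletes the character.
TABLE = {
    0x2F: " - ",  # /
    0x3A: " - ",  # :
    0x3B: " - ",  # ;
    0x28: " - ",  # (
    0x3C: " - ",  # <
    0x5B: " - ",  # [
    0x7B: " - ",  # {
    0x29: "",     # )
    0x3E: "",     # >
    0x5D: "",     # ]
    0x7D: "",     # }
}


def apply_table(name: str, table: dict[int, str]) -> str:
    """Replace each character found in the table; leave all others as-is."""
    # str.translate accepts a mapping from code points to strings,
    # and a zero-length string simply drops the character.
    return name.translate(table)


print(apply_table("a(b)c", TABLE))  # → a - bc
```

Characters that are not in the table pass through unchanged here, which is a simplification; in detox, anything left over is handled by whatever other filters and default settings are configured.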

dharple commented 2 years ago

Yeah, remove_trailing is a bit confusing. It only removes a trailing _ when the filename has an extension. For instance, blah_.jpg would become blah.jpg, but a trailing underscore on a filename without an extension is left in place.

I'll add a separate ticket to change that behavior in detox 2. I agree that the trailing underscore should be removed in your example.
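The remove_trailing behavior described above can be modeled with a short Python sketch. This is only a model of the behavior as stated in this comment, not code taken from detox: the function name is hypothetical, the extension is found with os.path.splitext (an assumption about what counts as an extension), and only underscores are stripped.

```python
import os


def remove_trailing_as_described(name: str) -> str:
    """Strip trailing underscores before the extension, if there is one."""
    stem, ext = os.path.splitext(name)
    if not ext:
        # No extension: the trailing underscore is left alone,
        # which is the surprising case reported in this issue.
        return name
    return stem.rstrip("_") + ext


print(remove_trailing_as_described("blah_.jpg"))  # → blah.jpg
print(remove_trailing_as_described("Aladdin Sane - 1913-1938-197_"))  # → Aladdin Sane - 1913-1938-197_
```

Under this model, the second filename has no extension, so the underscore survives, matching the result reported earlier in the thread.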

Also, the defaultdefault thing is a bug. I'll add a second issue for that.

dharple commented 2 years ago

Also, thank you for the help @delphym