dharple / detox

Tames problematic filenames
BSD 3-Clause "New" or "Revised" License

Is it possible to replace characters and remove punctuation (i.e. parentheses, braces etc.) in one detox command? #98

Open Ali-l opened 1 year ago

Ali-l commented 1 year ago

Hi, I'm SO glad someone mentioned your tool on StackOverflow. I just wish I'd found it earlier, since I spent days writing a script using basename, dirname, sed, etc., and while I've completed some of what I need, there is more to go. Having just found detox and read through the man page/PDF, I wasn't sure whether the following could be done in a single detox command (or maybe a couple), but do let me know either way.

What I'm trying to do:

Scan the local directory for mkv and/or mp4 files, grab the file names, and replace characters with Greek/Cyrillic characters that LOOK like English letters. So, for example, Fringe s03e01.mkv will become ƒгiпgє∙ѕΘЗєΘ1.мкν

I use a sed command that does this: sed 'y/AaBbCcDdEefGHhIJjKklMmNnOoPpQqRrSsTtUuVvWwXxY0436 /дαввссĐđєєƒgнніjjккιммппοορρφφггѕѕттυυννωω××yΘЧЗб∙/' How would I do the same AND remove punctuation with detox?
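For reference, the goal described above (the sed `y` mapping plus punctuation removal) can be sketched outside detox. This is a minimal Python illustration, not part of detox: the function names `disguise` and `rename_videos` are hypothetical, and "punctuation" is assumed to mean ASCII punctuation in the part of the name before the extension.

```python
import os
import string

# Source and replacement alphabets copied from the sed y/// command above.
SRC = "AaBbCcDdEefGHhIJjKklMmNnOoPpQqRrSsTtUuVvWwXxY0436 "
DST = "дαввссĐđєєƒgнніjjккιммппοορρφφггѕѕттυυννωω××yΘЧЗб∙"

# One-to-one character table; str.maketrans raises ValueError if lengths differ.
LOOKALIKE = str.maketrans(SRC, DST)

# Deletion table for ASCII punctuation (an assumed definition of "punctuation").
STRIP_PUNCT = str.maketrans("", "", string.punctuation)


def disguise(filename):
    """Strip punctuation from the stem, then apply the look-alike mapping."""
    stem, ext = os.path.splitext(filename)
    stem = stem.translate(STRIP_PUNCT).translate(LOOKALIKE)
    ext = ext.translate(LOOKALIKE)  # "." is not in SRC, so the dot survives
    return stem + ext


def rename_videos(directory="."):
    """Rename every .mkv/.mp4 file in `directory` (hypothetical helper)."""
    for name in os.listdir(directory):
        if name.lower().endswith((".mkv", ".mp4")):
            new_name = disguise(name)
            if new_name != name:
                os.rename(os.path.join(directory, name),
                          os.path.join(directory, new_name))
```

Characters that do not appear in the source alphabet (uppercase F or lowercase i, for instance) pass through unchanged, exactly as they would with the sed command.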

Thanks again!

dharple commented 1 year ago

You would want to make a copy of unicode.tbl, and replace any Latin characters with what you want to replace them with.

For instance:

0x0041        "д"

You'll need to update your detoxrc to point to the new translation table (instead of using the built-in one).

More details: https://github.com/dharple/detox/blob/main/HACKING-v1.md
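Writing one table entry per character by hand is tedious, so the entries could be generated from the two alphabets in the sed command. The sketch below is only an illustration: it assumes each entry has the form shown in the example above (a `0x` code point, whitespace, then the quoted replacement), the helper name `table_lines` is hypothetical, and any other lines a translation table requires still need to come from the copied unicode.tbl. Whether detox honors entries for ASCII code points is untested.

```python
# Source and replacement alphabets copied from the sed y/// command in the question.
SRC = "AaBbCcDdEefGHhIJjKklMmNnOoPpQqRrSsTtUuVvWwXxY0436 "
DST = "дαввссĐđєєƒgнніjjккιммппοορρφφггѕѕттυυννωω××yΘЧЗб∙"


def table_lines(src, dst):
    """Return one '0xXXXX        "replacement"' line per character pair."""
    if len(src) != len(dst):
        raise ValueError("source and replacement alphabets differ in length")
    # Assumed entry format, modeled on the single example line above.
    return [f'0x{ord(s):04X}        "{d}"' for s, d in zip(src, dst)]


if __name__ == "__main__":
    # Paste the output into the copied table in place of the Latin entries.
    print("\n".join(table_lines(SRC, DST)))
```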

I've never tried this (replacing an ASCII character with a non-ASCII one), and I have no idea whether it'll work. I'm not in a place where I can test this right now, but I'm curious to know if it works.

You can run tests using echo "filename" | inline-detox or echo "filename" | detox --inline