rmcrackan / Libation

Libation: Liberate your Library
GNU General Public License v3.0

Feature request: pseudo-characters which preserve appearance as an option #285

Closed anthonynogales closed 2 years ago

anthonynogales commented 2 years ago

I noticed last night after updating to version 8.1.0.2 that downloading an audiobook with colons in the name results in the folder and file name showing colons as well. I reported this on a Reddit thread by darchangel.

..\Drew Karpyshyn - Star Wars꞉ The Old Republic꞉ Revan\Star Wars꞉ The Old Republic꞉ Revan [B005ZT0X1C].m4b

Mbucari commented 2 years ago

It's a feature, not a bug.

Illegal path characters are replaced with Unicode analogs. Replacements are as follows:

anthonynogales commented 2 years ago

I see! Nice! Now is this on by default forever or is this something we can choose to toggle on or off?

Mbucari commented 2 years ago

It's on forever. I thought everyone would appreciate it so I didn't make it optional. @rmcrackan @anthonynogales Should this be optional?

anthonynogales commented 2 years ago

I'm at the very beginning of organizing my audiobook library so it makes no difference to me. The only reason I'd rename these would be if these characters didn't play well with Plex or Audiobookshelf BUT I don't expect there will be any issues. In fact, I think it's pretty awesome! Once I finish my testing I'll probably utilize this across the board since (novice that I am) I didn't realize I could replace the illegal characters with Unicode analogs. My default has been to replace ":" with " - ".

Playing devil's advocate, the only people I see being peeved are the OCD folks who have already organized thousands of audiobooks and would have to rename the folders\files after they're downloaded or else update their existing folders to use the analogs. They're probably a minority though and if they're that OCD they could probably script a solution.

Mbucari commented 2 years ago

if these characters didn't play well with Plex or Audiobookshelf

There's bound to be some software out there that still only uses ANSI strings, but it's probably either very old or a piece of junk in general. 15 years ago this might have been a problem, but these days nearly all modern software uses Unicode internally.

But you make a good point about those annoying OCD people lol.

anthonynogales commented 2 years ago

But you make a good point about those annoying OCD people lol.

For the record, to the OCDs out there: I'm one of them. However, like I previously stated, I'm just getting started, so this wouldn't be the chore it would be if I had to go back and update everything to match the new naming. Even then, I'd accept the future and script the change or something.

It's good to know that the characters should work on nearly all modern software. My concern was whether this would act up on both Android and iOS file systems and then on drives formatted in exfat, zfs, btrfs, ext4, and whatever macOS is using these days, and whether it would be just fine on a samba share. I just didn't want to commit to heading down this path only to find that it doesn't work on something I might use it on. See, folks, I can be OCD too.

wtanksleyjr commented 2 years ago

My initial reaction was REALLY negative, but then I chilled -- this will only matter to people who explicitly enable the renaming, so they can actually expect this kind of change.

If I had to deal with this in my scripts, though, it would simply be the end of my scripts. It doesn't matter how OCD you are if you cannot even tell why your regexps are failing to match.
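One way to see why a regexp silently fails on such a name is to list every non-ASCII character with its code point. The sketch below is only an illustration: `describe_non_ascii` is a hypothetical helper, and it assumes the colon lookalike in the paths above is U+A789 (MODIFIER LETTER COLON).

```python
import re
import unicodedata


def describe_non_ascii(text):
    """List (index, char, code point, Unicode name) for each non-ASCII character."""
    return [
        (i, ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


# Assumption: the lookalike colon is U+A789, written as an escape so it is visible.
name = "Star Wars\uA789 The Old Republic"

# A pattern written with a plain ASCII colon finds nothing in this name.
print(re.search(r"Star Wars:", name))

# Listing the non-ASCII characters shows what is really in the string.
for row in describe_non_ascii(name):
    print(row)
```

Running a check like this over a directory listing before debugging a pattern makes the substituted characters visible instead of leaving them to look like ordinary punctuation.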

Mbucari commented 2 years ago

For the record to the OCDs out there, I'm one of them.

I really was joking. OCD people are the ones who make bug reports and feature requests because they like things just so. Non-OCD people see something they don't like and usually just shrug and deal with it. I fall on the OCD side.

act up on both Android and iOS file systems and then on drives formatted in exfat, zfs, btrfs, ext4, and whatever macOS is using these days

I checked, and all major file systems support Unicode paths. Additionally, >90% of Android software is written in Java, and all Java strings are Unicode, so the devs don't even need to think about it for it to work. I know nothing about iOS, but Apple is not stupid and is quite modern (bleeding edge, even), so I expect the same is true in their ecosystem.

@wtanksleyjr

Sorry about that. If you do use this feature in the future, would you be able to adjust your scripts to cope or would you just want an option to do standard illegal character replacement (e.g. _)?
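The two behaviours being weighed here, lookalike substitution versus plain underscore replacement, can be sketched roughly as follows. This is not Libation's implementation: `sanitize`, `LOOKALIKES`, and the `use_lookalikes` flag are hypothetical names, and only the colon mapping (assumed to be U+A789) is taken from the paths shown in this thread.

```python
# Characters Windows does not allow in file names (control characters are
# handled separately below).
WINDOWS_ILLEGAL = '<>:"/\\|?*'

# Assumption: only the colon mapping is visible in this thread; every other
# illegal character falls back to the underscore in this sketch.
LOOKALIKES = {":": "\uA789"}


def sanitize(name: str, use_lookalikes: bool = True, fallback: str = "_") -> str:
    """Replace illegal file-name characters with a lookalike or a fallback."""
    out = []
    for ch in name:
        if ch in WINDOWS_ILLEGAL or ord(ch) < 32:
            if use_lookalikes and ch in LOOKALIKES:
                out.append(LOOKALIKES[ch])
            else:
                out.append(fallback)
        else:
            out.append(ch)
    return "".join(out)


title = "Star Wars: The Old Republic: Revan"
print(sanitize(title))                        # colons become U+A789
print(sanitize(title, use_lookalikes=False))  # colons become underscores
```

A single boolean keeps both camps served: the default preserves appearance, and switching it off gives script authors pure ASCII output.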

wtanksleyjr commented 2 years ago

Like I said, someone who's doing renaming using their own script anyhow would just have to deal with the file as it appears, and would probably use the metadata in the JSON export to decide what the new name should be.

It would be nice, though, to have a lo-fi option in addition to the 16-bit hifi characters :) . So I'd vote in favor of being able to "filter out" illegal characters with underscores, as you suggest.

-Wm

anthonynogales commented 2 years ago

So I'd vote in favor of being able to "filter out" illegal characters with underscores

Agreed. Even though I plan on going this new route and am really digging it saving a few characters in the path length, having the option to toggle it on and off somehow would be cool.

rmcrackan commented 2 years ago

I'm ambivalent. This is great for people who want it to look the same, bad for people who want it to remain pure, and hell for anyone wanting to program against the results.

Personally, I'd prefer to know what I'm really looking at. Seeing a colon that's not really a colon or a question mark which isn't really a question mark raises red flags for me because of my history in security and programming. These are the kinds of tricks scammers use to avoid detection and the things pranksters use to punk their coworkers (google: greek semicolon prank). Outside of malice, it can simply make debugging a nightmare as you pore through code looking for something which doesn't exist. Similarly, a simple search fails because you search for what you think the character is. The same goes for anyone who later tries to script against our output.

On the other hand, people like me are the vocal minority. I'm sure plenty of folks want their stuff to 'just work'.

I'd like this option to be a future feature.

anthonynogales commented 2 years ago

I tested this colon replacement with my Mp3tag format string and the results look good in Windows Explorer.

..\Drew Karpyshyn\Star Wars꞉ The Old Republic (1)꞉ Revan {Marc Thompson}\Star Wars꞉ The Old Republic (1)꞉ Revan.m4b

However, @rmcrackan mentioned security and programming red flags. This is getting a little off topic (perhaps), but would it be best practice to avoid this altogether? Since I'm just starting on organizing my books, I'm trying to do it in the most future-proof way that follows whatever the best practices would be. I realize it's hard to predict the future, but I'd like to set up the best long-term solution I possibly can.

rmcrackan commented 2 years ago

@Mbucari Remember, we're going to see this stuff in debug logs at some point :)

@anthonynogales The only truly future-proof way will be to limit yourself to printable characters allowed by Windows as found on a 0-127 ASCII table (the kind of table that ends with [DEL]). These standards have been around since forever and are too embedded in things to leave in our lifetime. (I say Windows because it's more restrictive than other OSs, so this subset works for them too.)
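As a rough illustration of that rule, a name can be checked against printable ASCII minus the characters Windows reserves. `is_conservative_name` is a hypothetical helper, and the check is deliberately simplified: it ignores Windows' reserved device names (such as CON or NUL) and the restriction on trailing dots and spaces.

```python
# Characters Windows does not allow in file names.
WINDOWS_ILLEGAL = set('<>:"/\\|?*')


def is_conservative_name(name: str) -> bool:
    """True if every character is printable ASCII (32-126) and legal on Windows."""
    return bool(name) and all(
        32 <= ord(ch) <= 126 and ch not in WINDOWS_ILLEGAL for ch in name
    )


print(is_conservative_name("Star Wars - The Old Republic - Revan [B005ZT0X1C].m4b"))
print(is_conservative_name("Star Wars\uA789 The Old Republic\uA789 Revan.m4b"))
```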

That said, I don't think those new pseudo-colons will give you any trouble from a file management perspective.

If you ever try to script against them though, you have to somehow remember to compensate for that, else you'll go mad wondering why your script doesn't match the colon which is "obviously" right in front of you.
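One way to compensate in a script is to fold the lookalikes back to their ASCII originals before matching. The sketch below assumes the pseudo-colon is U+A789 and handles only that one character; `fold_lookalikes` and the pattern are made up for illustration.

```python
import re

# Assumption: U+A789 is the only lookalike handled here; extend the table
# with whatever other substitutions appear in your own file names.
FOLD = str.maketrans({"\uA789": ":"})


def fold_lookalikes(text: str) -> str:
    """Map known lookalike characters back to their ASCII originals."""
    return text.translate(FOLD)


name = "Star Wars\uA789 The Old Republic\uA789 Revan [B005ZT0X1C].m4b"
pattern = re.compile(r"(?P<series>[^:]+): (?P<rest>.+)")

# Against the raw name the ASCII colon in the pattern never matches.
print(pattern.match(name))

# After folding, the same pattern works as written.
match = pattern.match(fold_lookalikes(name))
print(match.group("series"))
```

Folding only the in-memory string leaves the files on disk untouched, so the script can keep reading the names exactly as they were written.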

anthonynogales commented 2 years ago

Understood! I think I'll dream up some scenarios to script with Bash and PowerShell tonight to make sure I understand how to do that properly. You mentioned creating a future feature? Want me to create one to toggle this feature on/off if there isn't one created already? (Actually, I'm new to GitHub so I don't even know how to create a feature request or if that's even a thing. Shutting up now.)

I think this question has been answered. Should I close the issue or do we do something else in this scenario?

rmcrackan commented 2 years ago

I think this question has been answered. Should I close the issue or do we do something else in this scenario?

I'll turn it into a feature request. If we move forward with that, we have the whole discussion here as reference. If not, it's easy enough for me to close later.

Mbucari commented 2 years ago

I've put together a new illegal character replacement in my repo. Can you download it and take it for a spin? Settings are accessed from the Download and Decrypt tab. I'm out of the office all day tomorrow, so I'm not available for any help/debugging.

rmcrackan commented 2 years ago

Can you download it and take it for a spin?

I'll try but no promises. My job has been stupid busy lately. Tight deadline, one out on parental leave, one not yet fully trained, another on vacation, and something going around which I assume is covid taking out a not insignificant portion of our staff. My Libation time has been low lately.

rmcrackan commented 2 years ago

Will be included in next release. Thanks @Mbucari !