glujan / drpg

Download and keep up to date your purchases from DriveThruRPG
MIT License
18 stars 5 forks source link

mapping company and product names to directory names is different than DTRPG client #43

Closed ToddBradley closed 10 months ago

ToddBradley commented 1 year ago

Hi, I just found this project and was very happy to see someone else put this together. Yesterday, fed up with DTRPG's lack of Apple Silicon support, I started reverse engineering the DTRPG API that their client app uses, with the goal of building my own simple Python client app. But I see you've already done that, which is awesome! And the code is beautiful.

There are some differences I've spotted between drpg and the official client app, though. And so I wondered if your intent was to make them interoperable. Two of the differences I've seen are that the algorithms the two pieces of software use to map company and product names into directory names aren't the same. Here are a couple of examples:

The DTRPG client app saved this file: /Game Designers__039_ Workshop _GDW_/Ebook_ The Science Fiction In Traveller/The-Science-Fiction-In-Traveller-Kindle.jpg while your drpg code saved this file: /Game Designers' Workshop (GDW)/Ebook - The Science Fiction In Traveller/The-Science-Fiction-In-Traveller-Kindle.jpg

Another example:

Official client app: /Michael Brown/The Eyes of Winter _Holiday Adventure_/The_Eyes_of_Winter.pdf

drpg: /Michael Brown/The Eyes of Winter (Holiday Adventure)/The_Eyes_of_Winter.pdf

You could argue which is better, but the main point is that they're different. Should they be?

cas206 commented 1 year ago

Just a users opinion.... I would prefer they came out the same. Some if us have a mix of OS's and so get duplicates if using both apps to sync a single directory.

glujan commented 1 year ago

This is a great idea! 🚀

There is no good reason why directory layout is different from the official client. If having an uniform layout is better for dual-boot or NAS users, why not doing that?

I have one small concern, though - changing how directories are named is backwards incompatible and some migration should be implemented so users won't have duplicates on their filesystem. This could be done by having a separate subcommand to update existing directories or maybe opt-in (or opt-out?) flag for the existing command, ie:

drpg update-paths # subcommand that will update paths to match the official client
drpg --update-paths --token XYZ # opt-in to update paths
drpg --legacy-paths --token XYZ # opt-out to use existing paths

Of course having just one of above is enough and please take proposed names with a grain of salt 😃

ToddBradley commented 1 year ago

What if the client app was smart enough to notice that a file name is in the old format, and then attempt to convert it to the new format? The assumption would be that people want the new format and are only using the Python script to maintain their library from now on. This assumption could be overridden by a flag like --compatibility-mode which would make drpg create files named the same as the official client.

glujan commented 1 year ago

@ToddBradley that sounds good and is quite similar to what I meant by:

drpg --update-paths --token XYZ # opt-in to update paths

Do you feel like preparing a PR for this or should I do it?

ToddBradley commented 1 year ago

Do you feel like preparing a PR for this or should I do it?

I'm getting married this weekend, so I'm pretty busy the next couple of weeks. If you're eager to implement this in the near term, go for it! If you aren't, then maybe I'll work on it in August sometime.

glujan commented 1 year ago

I'm getting married this weekend

Congratulations!

I hope to find some time to prepare a branch for this feature. If so, it will be linked to this issue.

glujan commented 1 year ago

I plan to tackle this in September. If you're eager for this feature and want to contribute a pull request earlier, please let me know.

ToddBradley commented 1 year ago

@glujan I had hoped to find the time to work on this, but I realize that tomorrow is September and I have been too busy. I'm also going to be on vacation in the wilderness far from any computers for two weeks in September. So, if you have time to tackle the issue, go for it. If it's still open when I do have more time, I'll eventually get to it.

glujan commented 12 months ago

@ToddBradley Sorry but I haven't started yet. When I do, I'll definitely drop a message here.

glujan commented 10 months ago

I want to tackle this in the second half of November.

I probably won't be too strict regarding backwards compatibility since this is something I want to have in drpg since the issue was opened. It's better to implement this more loosely than not implement this at all, I guess.

ToddBradley commented 10 months ago

I was playing around with this a little and ended up fixing a "bug" that exists in both the official DTRGP app and in drpg. They both unescape HTML characters in a name incorrectly. Going back to my original examples...

The DTRPG client app saves this file: /Game Designers__039_ Workshop _GDW_/Ebook_ The Science Fiction In Traveller/The-Science-Fiction-In-Traveller-Kindle.jpg

drpg saves this file: /Game Designers' Workshop (GDW)/Ebook - The Science Fiction In Traveller/The-Science-Fiction-In-Traveller-Kindle.jpg

But what should really be saved, in my opinion, is this:

/Game Designers' Workshop (GDW)/Ebook - The Science Fiction In Traveller/The-Science-Fiction-In-Traveller-Kindle.jpg

In other words, the "&#039" that the REST API is sending back in the name is an HTML escaped apostrophe character. When saving the file to disk, it should be turned back to an apostrophe. The official app converts it into __039__ for some reason, and drpg just assumes that's actually the name of the publisher.

So what do you think we should do about that? (I already have a unit test and the code that implements the fix, but don't know what to do with it)

Here are the commits for the above: https://github.com/ToddBradley/drpg/commit/db1956fadcac8f94f5f729ae5f1b2f5ce96ff9bd

ToddBradley commented 10 months ago

After thinking about it some, what I wrote above (https://github.com/glujan/drpg/issues/43#issuecomment-1785877251) should really be its own separate issue. But I'm implementing a fix for this original issue that will include the HTML escaping thing in it. So I'm not gonna write it up separately.

ToddBradley commented 10 months ago

OK, here's what I've got at the moment. I added a command line option called --compatibility-mode. I went through my whole DriveThruRPG library looking for oddities in directory names created by their proprietary client app, and I wrote up that behavior into drpg's compatibility mode. There may be other oddities and bugs that I didn't see, since I can't know every weird piece of data in their publisher database.

Now, when drpg runs without that flag, it uses its own naming algorithm. That is mostly what was already there, but with a couple of changes. One change is to unescape escaped characters like in my example above of "Game Designers' Workshop (GDW)". Also, at the moment, I removed the code that converted colons in directory names to the placeholder " - ". I personally use drpg on macOS, which allows colons in filenames. Linux does, too. Windows is the only one that doesn't.

So that leaves us with a dilemma. Do we want the naming to be consistent among all operating systems? If so, we should go back to converting colon to " - ". Or do we want colon to be allowed on systems that allow it?

I have to say it makes the directory names somewhat more readable, and is more consistent with what the publishers actually named the books. My own library has a lot of e-books with names like "0one's Blueprints: The Ruined Town" and "Godbound: A Game of Divine Heroes".

I did not make any attempt to convert file/directory names from one format to another. I made that choice for two reasons. First, people may be using both clients in ways I can't imagine. Second, I don't like the idea of deleting anybody's e-books. Third, it's a hard problem and I wanted to get the rest of this working first, since it's much more important.

Regardless, here is a draft PR to look at while we discuss the consistency matters. https://github.com/glujan/drpg/pull/49

glujan commented 10 months ago

@ToddBradley Thanks a lot for preparing the PR and putting that much work into explaining the topic, comparing with your own library et al. I really appreciate it!

I added a command line option called --compatibility-mode (…) I did not make any attempt to convert file/directory names from one format to another (…)

This is a good idea. It keeps deciding on a naming scheme to the user and does not add too much complexity regarding file name migrations.

Do we want the naming to be consistent among all operating systems?

I'd rather use the common denominator across different operating systems, with Windows being the most restrictive one.

My very own use case of drpg is that I'm mostly using drpg on my Linux box and keep a copy of my ebooks in the cloud. If I sync those files on my Windows box via cloud and run drpg on Windows I don't really want to have 2 copied of some of my ebooks. I could use the compatibility mode but the non-compatible mode still provides some more readability (and also it's easier to find a correct file or directory without those weirdly escaped HTML entities)

ToddBradley commented 10 months ago

OK, given that feedback, I updated the code to not allow colons in file/directory names. The PR is open for review.

glujan commented 10 months ago

I've just released changes from #49 as version 2023.11.10. Thanks!