winterbird-code / adbb

Object Oriented UDP Client Library for AniDB
GNU General Public License v3.0

Support for additional file attributes #11

Closed freitagdavid closed 2 years ago

freitagdavid commented 2 years ago

So I'm looking to write a little file renamer, and before I get too deep I'm wondering how much of the data you grab for each file. I have somewhat complex needs, and while aniadd can do the job, it doesn't like working across disks on Linux for some reason, which is how I found this. Being able to just straight out code my renaming scheme sounds soooo nice. Though I'm wondering about the depth to which you extract data from anidb; most of what I'm seeing seems a touch surface level. Along with that, whereabouts would I look in the code to find the places to add new fields, for instance CRC, if that isn't in there (I didn't see it at least)? I guess the best comparison is: how on par would the data gathering be with something like aniadd? Also, I'm assuming this rate limits even when used as a library, or will I have to sleep my code to keep from getting banned?

winterbird-code commented 2 years ago

Thanks for the interest!

It's basically an object oriented wrapper around the ANIME, EPISODE, GROUP, FILE and MYLIST commands in the UDP API, so those attributes are available (and I think they should be documented in the README). I only calculate the ed2k-hash, as it's the unique identifier for files in anidb, but if the ed2k-hash matches, so should any CRC or other hash calculated from the original file.
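The point about hashes can be illustrated locally: if you already have the file, you can compute a CRC (or any other checksum) yourself rather than fetching it from the API. A minimal sketch using only the Python standard library (not adbb's code; the 64 KiB buffer size is an arbitrary choice to keep memory use low):

```python
import zlib

def crc32_of_file(path):
    """Compute the CRC32 of a file, reading in chunks to limit memory use."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            crc = zlib.crc32(chunk, crc)
    # Mask to 32 bits and format as the usual 8-digit hex string
    return format(crc & 0xFFFFFFFF, "08x")
```

The returned hex string can then be compared against a CRC published elsewhere for the same release.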

If you write your own renaming tool using this library you can obviously enrich it by using other libraries to extract metadata from the files; but right now this library only gives you what the API provides.

If you want some inspiration you can check the code for the arrange_anime tool (especially the arrange_files function) which basically does what you want, but with a hardcoded scheme (for now at least).

The anidb API is very restrictive. The library does what it can to do things right (rate limiting, caching to a database etc.), but if you have a fairly large number of files you will get leech banned if you try to run it on everything at once. The polite thing if that happens is to wait at least 24 hours before resuming (and try to make fewer requests per day in the future). AFAIK this is the same for all clients, although some might handle it more gracefully...
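For anyone curious what the pacing side of this looks like, a minimal request limiter is easy to sketch. This is a generic illustration, not adbb's implementation; the 2-second default is an assumption based on the commonly cited AniDB guideline of at most one packet every two seconds:

```python
import time

class RateLimiter:
    """Enforce a minimum interval (in seconds) between successive calls to wait()."""

    def __init__(self, interval=2.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock    # injectable for testing
        self._sleep = sleep
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()
```

Calling `limiter.wait()` before every request spaces the traffic out; note this only covers the short-term rate, not any longer-term daily quota the server may enforce.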

winterbird-code commented 2 years ago

Actually, when I look back at the API it seems it supports more file information than I've implemented. My best guess is that the rationale was just as I wrote: if you have access to the file you can figure out that metadata anyway, so there is no need to fetch it from the API. But it was a long time ago, so maybe I had some other reason (like being lazy) :slightly_smiling_face:

The supported attributes are documented in the README.

freitagdavid commented 2 years ago

Okay, I might look at trying to put a PR together to get more comprehensive data on the objects. I was actually tinkering with making my own client at one point, but the API really threw me for a loop; you've handled all the hard stuff, so adding a few more fields shouldn't be too difficult. At least when I was looking at the API (mind you, it's been a while, so I may be misremembering), I remember that when you queried for a file you could retrieve all related information for that file, which would cover all that extra metadata. Why recalculate when it's already there? Though as I said, I may be misremembering. I'm mostly thinking along the lines of the resolution and whatnot. If at all possible it's nice to not have to include yet another library if the information is already available, assuming that doesn't require more calls.
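The "code my renaming scheme" part is mostly string formatting once the attributes are in hand. A hypothetical sketch (the attribute names here are made up for illustration, not adbb's actual field names):

```python
import re

def build_filename(attrs, scheme="{anime} - {epno} - {title} [{group}].{ext}"):
    """Render a filename from a metadata dict, replacing characters
    that are unsafe in file names on common filesystems."""
    name = scheme.format(**attrs)
    return re.sub(r'[\\/:*?"<>|]', "_", name)
```

For example, `build_filename({"anime": "Cowboy Bebop", "epno": "05", "title": "Ballad of Fallen Angels", "group": "SomeGroup", "ext": "mkv"})` yields `Cowboy Bebop - 05 - Ballad of Fallen Angels [SomeGroup].mkv`.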

I really hate that they still restrict the TCP API; it's a really stupid protectionist restriction. But alas.

winterbird-code commented 2 years ago

I would prefer to extract metadata from the files when possible, and I wouldn't mind adding a dependency on ffmpeg-python for this. A primary goal of this library is that it should be usable even if the files are not registered in anidb. I have lots of privately ripped files that will never be publicly available and shouldn't be added to anidb, but I still want them added to my mylist; this is why the library supports generic files. Extracting metadata from the local files, when available, has the advantage that it works the same whether the file is generic or not.
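As an illustration of the local-extraction approach: ffprobe (and ffmpeg-python's `ffmpeg.probe()`, which wraps it) returns a JSON structure with per-stream details, and pulling the resolution out of it is trivial. The sample below hardcodes a trimmed ffprobe-style result so the sketch runs without ffmpeg installed:

```python
def video_resolution(probe):
    """Return (width, height) of the first video stream in an
    ffprobe-style result dict, or None if there is no video stream."""
    for stream in probe.get("streams", []):
        if stream.get("codec_type") == "video":
            return stream["width"], stream["height"]
    return None

# Trimmed example of what `ffprobe -print_format json -show_streams` emits:
sample = {
    "streams": [
        {"codec_type": "audio", "codec_name": "flac"},
        {"codec_type": "video", "codec_name": "h264",
         "width": 1920, "height": 1080},
    ]
}
```

With a real file you would replace `sample` with the output of `ffmpeg.probe(path)`; the parsing is identical either way, which is what makes it work the same for generic and non-generic files.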

That said, I would probably accept a PR for handling more attributes from the API, as I currently have no other plans to implement this myself.

I agree that a more modern API with fewer restrictions would be nice; a lot has changed since they released the original API, and I think many of the restrictions are still in place because that's how it has always been and no one is engaged enough to push for a change...