Anime-Lists / anime-lists


[Maintenance] Cleaning anime-list-master.xml #283

Closed Middlepepper closed 1 year ago

Middlepepper commented 1 year ago

Hi Team, I think we might need to do a scripted clean of anime-list-master.xml so that it aligns with the rules in the readme. Specifically, I'm referring to:

  1. Removing any entries with a non-blank tmdbid or imdbid when a tvdbid="72070" is present (e.g. Initial D Battle Stage, anidbid="11").
  2. Update anime-list-todo.xml type to align Type mapping with anidb (web = web, OVA = OVA, TV Special = tv special, TV = unknown)

I think we can do these easily with a script. The only issue I see with updating Type is web releases like Jo no Kimyou na Bouken: Stone Ocean, so we might need to exclude ONA/Web in any scripts.

I think if we do this we could even have the current data imported into anidb, seeing as they mention this in an archived discussion, "Add external resources for general movie/TV sites", and have now added IMDB and TVDB fields to anidb.

Middlepepper commented 1 year ago

As an example, I made a commit (ExampleofScriptingChanges) to show how large the clean-up for Step 2 could be. This only checks the todo.xml.

Middlepepper commented 1 year ago

For Step 1, I've made a script to check how many entries have a tmdbid or imdbid value when the tvdbid is a number, just to see how much mess there is in anime-list-master.xml.

So there is currently some mess in the master file.

Link to the branch I made with example changes: "Removed imdbid and tmdbid"
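A check like the one described for Step 1 could be sketched as follows. This is not the script from the branch, only a minimal illustration. The attribute names come from the `<anime>` example later in this thread. The `anime-list` root element, the non-numeric `tvdbid="movie"` value, and the two "Hypothetical" entries are made-up sample data.

```python
import xml.etree.ElementTree as ET

# Inline sample standing in for anime-list-master.xml (only anidbid="11" is
# taken from the thread; the other two entries are hypothetical).
SAMPLE = """<anime-list>
  <anime anidbid="11" tvdbid="70900" defaulttvdbseason="0" episodeoffset="2" tmdbid="327796" imdbid="tt7941838">
    <name>Initial D Battle Stage</name>
  </anime>
  <anime anidbid="900001" tvdbid="100001" defaulttvdbseason="1">
    <name>Hypothetical TV Show</name>
  </anime>
  <anime anidbid="900002" tvdbid="movie" defaulttvdbseason="1" imdbid="tt0000000">
    <name>Hypothetical Movie</name>
  </anime>
</anime-list>"""


def entries_with_movie_ids(root):
    """Return anidbids whose tvdbid is a number and whose tmdbid or imdbid is non-blank."""
    hits = []
    for anime in root.iter("anime"):
        tvdbid = anime.get("tvdbid", "")
        has_movie_id = any(anime.get(key, "").strip() for key in ("tmdbid", "imdbid"))
        if tvdbid.isdigit() and has_movie_id:
            hits.append(anime.get("anidbid"))
    return hits


hits = entries_with_movie_ids(ET.fromstring(SAMPLE))
print(len(hits), hits)  # 1 ['11']
```

To run it against the real file, `ET.parse("anime-list-master.xml").getroot()` would replace the inline sample.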

BrutuZ commented 1 year ago

The scripts are grandfathered in from ScudLee. Maybe they've only been touched in this fork to run automatically via GitHub actions?

  1. Removing any entries with a non-blank tmdbid or imdbid when a tvdbid="72070" is present (e.g. Initial D Battle Stage, anidbid="11").

Not always applicable. There are movies and OVAs that exist both as TVDB specials, or even as their own seasons, and as TMDB/IMDB movie entries. There's no reason to get rid of those: they hold valid information for either scenario (tagging episodes or movies).

What the readme refers to is not using TMDB/IMDB show entries, as in themoviedb.org/tv/000000 or https://www.imdb.com/title/tt14626352/episodes/. For TMDB, that's basically because it only has a numeric code that can be shared between different media. For IMDB, I don't really see a reason other than it being a nightmare to keep a parser updated. I'm guessing it's just for consistency? The former could theoretically be fixed with a new property for shows, like tmdbtvid.

  2. Update anime-list-todo.xml type to align Type mapping with anidb (web = web, OVA = OVA, TV Special = tv special, TV = unknown)

That information is not available in the AniDB dump the update script fetches. We would need to switch to a more complete type and rewrite the XSL for that.

Middlepepper commented 1 year ago

Ahh, you're right, I made a misstep in my logic. I didn't account for movies, TV specials and OVAs being in the TVDB specials. Let me correct for it.

So if an entry has a correct, current tvdbid that is a number, the season is not 0, and it also has an imdbid and tmdbid, this is a conflict: those IDs will map to movies in IMDB and TMDB, while the tvdbid would be for a normal show.

I can look into making an XSL with the above (updated) logic so it just cleans up any such entries as a maintenance task in the Git workflow. But this might not be an issue, so I'll double check.
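The updated rule could be expressed as a small predicate, sketched here in Python rather than XSL. It relies on a few assumptions not settled in the thread: an entry is flagged when either `tmdbid` or `imdbid` is non-blank, a missing `defaulttvdbseason` counts as "not a special", and the second sample entry is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_conflict(anime):
    """True when a numeric tvdbid outside season 0 is paired with a movie ID."""
    tvdbid = anime.get("tvdbid", "")
    season = anime.get("defaulttvdbseason", "")
    # Assumption: either ID being present is enough to flag the entry.
    has_movie_id = any(anime.get(key, "").strip() for key in ("tmdbid", "imdbid"))
    return tvdbid.isdigit() and season != "0" and has_movie_id


# A TVDB special (season 0) with movie IDs: valid, not a conflict.
special = ET.fromstring(
    '<anime anidbid="11" tvdbid="70900" defaulttvdbseason="0" '
    'tmdbid="327796" imdbid="tt7941838"/>'
)
# Hypothetical regular-season entry that also carries a movie ID: flagged.
regular = ET.fromstring(
    '<anime anidbid="900001" tvdbid="100001" defaulttvdbseason="1" imdbid="tt0000000"/>'
)

print(is_conflict(special), is_conflict(regular))  # False True
```

This only reports candidates. Whether a flagged entry should actually be changed would still need a manual look, since per-episode mappings are not considered here.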

For the other point (step 2)

Yeah, if we don't have it in the dump, I don't think we can automate it with the XSL, and looking at it, it would be easier to do locally and manually with a script.

I know anime-list-todo.xml gets wiped on GitHub on Sundays, but using it as an example, I can make a local script to fill in the Type from the anidb type, so that our Unknown list is filled with TV shows by default, which should help with filling them in.

The examples I did showed some promise, so it might be worthwhile compared to step 1. I'll do a cleaner pull and we can discuss it in that?

So, to make sure I have everything straight, the logic is this:

TVDB IDs that are set as specials can have IMDB and TMDB values, as these will map correctly. For example, with anidbid="11":

  <anime anidbid="11" tvdbid="70900" defaulttvdbseason="0" episodeoffset="2" tmdbid="327796" imdbid="tt7941838">
    <name>Initial D Battle Stage</name>
  </anime>

These are all valid links, using our requirement of the default URL for each value:

tvdbid=https://thetvdb.com/index.php?tab=series&id=70900 (S00E03)
tmdbid=https://www.themoviedb.org/movie/327796
imdbid=https://www.imdb.com/title/tt7941838/

I think we might need to look at tmdbtvid or some similar measure in future, now that TVDB has also expanded into movies (https://thetvdb.com/movies/initial-d-battle-stage), so we might need a separate type field to help with sorting.
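For spot-checking entries, the three default URL shapes listed above can be generated from an entry's attributes. This is a minimal sketch: the function name `default_urls` is made up for illustration, and the templates are copied from the links in this comment rather than from any official specification.

```python
# URL templates copied from the example links for anidbid="11".
URL_TEMPLATES = {
    "tvdbid": "https://thetvdb.com/index.php?tab=series&id={}",
    "tmdbid": "https://www.themoviedb.org/movie/{}",
    "imdbid": "https://www.imdb.com/title/{}/",
}


def default_urls(attrs):
    """Build the default URL for each ID attribute that is present and non-blank."""
    urls = {}
    for key, template in URL_TEMPLATES.items():
        value = attrs.get(key, "").strip()
        # Skip blank values and non-numeric tvdbid placeholders.
        if not value or (key == "tvdbid" and not value.isdigit()):
            continue
        urls[key] = template.format(value)
    return urls


entry = {"anidbid": "11", "tvdbid": "70900", "tmdbid": "327796", "imdbid": "tt7941838"}
for key, url in default_urls(entry).items():
    print(f"{key}={url}")
```

The episode position (S00E03 in the example) is not encoded in the TVDB URL, so it still has to be read from `defaulttvdbseason` and `episodeoffset`.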

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 150 days with no activity. Remove stale label or comment or this will be closed in 30 days.

github-actions[bot] commented 1 year ago

This issue has been stale for 30 days and is being closed.