RobLoach / libretro-thumbnails-check

Checks consistency of libretro-thumbnails
https://github.com/libretro-thumbnails/libretro-thumbnails

[idea] create thumbnail manifests for libretro-thumbnails scraper support #6

Open markwkidd opened 7 years ago

markwkidd commented 7 years ago

I have been dabbling with a playlist generator and thumbnail matcher for the complete sets at http://thumbnailpacks.libretro.com/

Over time I have wound up edging closer to a 'scraper' mode that can also download individual thumbnails from http://thumbnails.libretro.com/

Assuming that scraping support is a function that the RA team is interested in for the thumbnails.libretro.com server (I think it is), some kind of manifest is at least one more piece of infrastructure that is needed.

Right now a scraper can either parse the HTML directory listing that is generated, or just query the server directly to see if filenames exist. It would be much cleaner if there were a simple manifest file in each folder listing the name of each file, perhaps along with the file size or CRC to help with validation.
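To make the manifest idea concrete, here is a minimal sketch of what generating one per folder could look like. The `manifest.json` filename, the JSON layout, and the function names are all assumptions for illustration; nothing like this is defined by libretro-thumbnails in this thread.

```python
import json
import zlib
from pathlib import Path


def build_manifest(folder, manifest_name="manifest.json"):
    """Return one {name, size, crc32} entry per file in `folder`.

    The manifest file itself is skipped so that regenerating it does not
    change its own contents. The entry layout is hypothetical.
    """
    entries = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.name == manifest_name:
            continue
        data = path.read_bytes()
        entries.append({
            "name": path.name,
            "size": len(data),
            # CRC32 as 8 lowercase hex digits
            "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        })
    return entries


def write_manifest(folder, manifest_name="manifest.json"):
    """Write the manifest next to the thumbnails and return its entries."""
    entries = build_manifest(folder, manifest_name)
    target = Path(folder) / manifest_name
    target.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return entries
```

A scraper could then fetch that one file per folder, look up the name it wants, and compare size or CRC after downloading instead of probing the server file by file.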

@RobLoach your work on these scripts seems to be the closest to having this functionality already. Does this make sense? Do you think scraper support is in fact desired?

markwkidd commented 7 years ago

If you're interested in the functionality of the app (and have access to Windows, sorry about it being single-platform), here's the little prototype app which can download individual files: https://libretro.com/forums/showthread.php?t=7802

RobLoach commented 7 years ago

Great idea, would be nice to have it "automagically" fill in some of the gaps. We'd want to make it only scrape content that's not already there.

It is something that libretro-thumbnails-check could do. Currently the script:

  1. Indexes all games in the libretro-database
  2. Checks which art is available for each game in libretro-thumbnails (through the GitHub Trees API)
  3. Outputs the report in the out directory

The missing thumbnail data is available to us after step 2, so we could add some functionality to download missing art for us. Do you know any good game thumbnail services out there?
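As a rough illustration of how the missing list falls out of step 2, the sketch below compares a list of game names against a GitHub Trees API response that has already been fetched and decoded from JSON. It is not the actual libretro-thumbnails-check code. The folder names `Named_Boxarts`, `Named_Snaps` and `Named_Titles`, the `.png` extension, and both function names are assumptions for the example.

```python
from pathlib import PurePosixPath

# Assumed per-system folder layout; adjust if the repos differ.
THUMBNAIL_TYPES = ("Named_Boxarts", "Named_Snaps", "Named_Titles")


def available_art(tree_response):
    """Map each thumbnail type to the set of game names that have art.

    `tree_response` is the decoded JSON of a Trees API call, i.e. a dict
    with a "tree" list whose entries carry "path" and "type".
    """
    found = {kind: set() for kind in THUMBNAIL_TYPES}
    for entry in tree_response.get("tree", []):
        if entry.get("type") != "blob":
            continue  # skip directories and submodules
        path = PurePosixPath(entry["path"])
        if (len(path.parts) == 2 and path.parts[0] in found
                and path.suffix.lower() == ".png"):
            found[path.parts[0]].add(path.stem)
    return found


def missing_art(game_names, tree_response):
    """Return, per thumbnail type, the sorted game names without art."""
    wanted = set(game_names)
    found = available_art(tree_response)
    return {kind: sorted(wanted - names) for kind, names in found.items()}
```

The result of `missing_art` is exactly the work list a downloader would need: one list of names per thumbnail type.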

markwkidd commented 7 years ago

sselph's scraper and the Universal XML Scraper both plug into the screenscraper.fr API. That API expects the hash value of the game ROM in order to serve back the scraped image. It might be easier to rig up a bridge to one of those scripts, or to connect directly to screenscraper? https://translate.google.com/translate?hl=en&sl=fr&u=http://www.screenscraper.fr/webapi.php%3Falpha%3D0%26numpage%3D0&prev=search
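Since that API is keyed on the ROM's hash, a bridge would first need to hash each file. The sketch below computes the three digests commonly used for ROMs. Which one screenscraper.fr actually expects, and how it is passed in the request, is not stated in this thread, so the function stops at producing the values.

```python
import hashlib
import zlib


def rom_hashes(path, chunk_size=1 << 20):
    """Return CRC32, MD5 and SHA-1 of a file as hex strings.

    The file is read in chunks so large ROMs do not need to fit in memory.
    """
    crc = 0
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)  # running CRC over all chunks
            md5.update(chunk)
            sha1.update(chunk)
    return {
        "crc32": format(crc & 0xFFFFFFFF, "08X"),
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
    }
```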

markwkidd commented 7 years ago

The sselph scraper can also pull from TheGamesDB.net, which has its own published API http://wiki.thegamesdb.net/index.php/API_Introduction

edit: This API has straightforward name-based matching that might be easy to plug into. Example: http://thegamesdb.net/api/GetGamesList.php?name=halo
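A small sketch of how a script might build that kind of query from a game name. Only the endpoint and the `name` parameter come from the example URL above. Stripping No-Intro style tags such as `(USA)` before searching is an assumption about what would match better, and the helper names are made up for illustration.

```python
import re
from urllib.parse import urlencode

# Endpoint taken from the example URL in this comment.
API_BASE = "http://thegamesdb.net/api/GetGamesList.php"

# Matches parenthesised or bracketed tags like " (USA)" or " [!]".
TAG_PATTERN = re.compile(r"\s*[\(\[][^)\]]*[\)\]]")


def search_title(game_name):
    """Drop region/revision tags so only the bare title is searched."""
    return TAG_PATTERN.sub("", game_name).strip()


def games_list_url(game_name):
    """Build the name-based lookup URL for a (possibly tagged) game name."""
    return f"{API_BASE}?{urlencode({'name': search_title(game_name)})}"
```

For example, `games_list_url("halo")` reproduces the URL shown above, and characters such as `&` or spaces in a title are percent-encoded by `urlencode`.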

zach-morris commented 7 years ago

@RobLoach,

Not sure how 'good' the data is, but the databases I've generated for my addon contain metadata and images in an XML format collected from a large number of sources. The XML typically contains either the No-Intro filename or the game's hash (or both) for cross-referencing. You may freely use the data as you wish if it helps fill in the gaps here, although obviously some scripting would be needed to grab the images, check whether they meet requirements, and convert them if they don't.

RobLoach commented 7 years ago

Nice work on the database. Looks to contain lots of information, along with fanart detail too. Must have taken a while to compile that kind of list!

RobLoach commented 1 year ago

Just following up here. Zach made index files for each of the repos. This checker thing is now using those:

https://thumbnails.libretro.com/Atari%20-%20Lynx/.index?sdfaafsd