openzim / zim-requests

Want a new ZIM file? Propose ZIM content improvements or fixes? Here you are!
https://farm.openzim.org

Terra X #272

Open Popolechien opened 4 years ago

Popolechien commented 4 years ago

Files are hosted in a Commons category: do we have a tool to scrape that?

RavanJAltaie commented 1 year ago

@kelson42 @rgaudin Do we have a tool to scrape that?

rgaudin commented 1 year ago

Since it's just a collection of files, a mini scraper that retrieves the list of files and their metadata via the API and produces a nautilus collection JSON might be a good option.
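
A minimal sketch of what such a mini scraper could look like, not a tested tool. It assumes the standard MediaWiki API on Commons (`generator=categorymembers` combined with `prop=imageinfo`). The output keys (`title`, `description`, `authors`, `files` with `url`/`filename`) follow my reading of the nautilus collection format and should be checked against the nautilus README. The function names and the User-Agent string are made up for illustration, and the category title is left as a parameter since it is not given here.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://commons.wikimedia.org/w/api.php"


def build_params(category, cont=None):
    """Query parameters listing the files of a category with their metadata."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",  # makes "pages" a list instead of a dict
        "generator": "categorymembers",
        "gcmtitle": category,
        "gcmtype": "file",
        "gcmlimit": "50",
        "prop": "imageinfo",
        "iiprop": "url|extmetadata",
    }
    # The API returns a "continue" object to merge into the next request.
    params.update(cont or {})
    return params


def page_to_entry(page):
    """Turn one API page into one collection entry (key names are assumed)."""
    info = page["imageinfo"][0]
    meta = info.get("extmetadata", {})
    # Drop the "File:" namespace prefix; fall back to the full title.
    filename = page["title"].partition(":")[2] or page["title"]
    return {
        "title": filename,
        # extmetadata values may contain HTML and would need cleaning.
        "description": meta.get("ImageDescription", {}).get("value", ""),
        "authors": meta.get("Artist", {}).get("value", ""),
        "files": [{"url": info["url"], "filename": filename}],
    }


def fetch_collection(category):
    """Walk the whole category, following API continuation."""
    entries, cont = [], None
    while True:
        query = urllib.parse.urlencode(build_params(category, cont))
        request = urllib.request.Request(
            API_URL + "?" + query,
            headers={"User-Agent": "commons-to-nautilus-sketch/0.1"},  # hypothetical
        )
        with urllib.request.urlopen(request) as response:
            data = json.load(response)
        for page in data.get("query", {}).get("pages", []):
            if page.get("imageinfo"):
                entries.append(page_to_entry(page))
        cont = data.get("continue")
        if not cont:
            return entries
```

Calling `fetch_collection()` with the category title and writing the result out with `json.dump()` would give a candidate collection file. Whether nautilus accepts remote `url` entries for large video files is something to verify before relying on it.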

RavanJAltaie commented 11 months ago

I have started a nautilus recipe: https://farm.openzim.org/recipes/Terra_x_de. I will update the status accordingly.

kelson42 commented 11 months ago

My attempt would be:

If it does not work, then this would be a bug

I don't see how nautilus could do the job without preparatory work (maybe this has been done).

Recipe name does not respect the norm

Popolechien commented 11 months ago

@kelson42 Does WP1 work on Commons?

kelson42 commented 11 months ago

Hmmm, it should IMO. Sorry, on the road again, difficult for me to test right now.

rgaudin commented 11 months ago

> I don't see how nautilus could do the job without preparatory work (maybe this has been done).

It can't, obviously. My comment mentioning nautilus clearly indicated that it required a mini-scraper that would produce the nautilus-friendly data.

Many things are wrong with this recipe. The archive config calls for a URL to a ZIP archive containing all the documents. How can this Commons link be considered a ZIP archive?

The recipe failed… on the favicon because its URL is incorrect.

rgaudin commented 11 months ago

> Hmmm, it should IMO. Sorry, on the road again, difficult for me to test right now.

Commons is not in the list of Projects (neither simple, SPARQL, nor petscan).