openzim / zim-requests

Want a new ZIM file? Propose ZIM content improvements or fixes? Here you are!
https://farm.openzim.org

New Zim File for Bibe Books #640

Open RavanJAltaie opened 1 year ago

RavanJAltaie commented 1 year ago

As part of scouting Grey Box content, we need to create a ZIM file with the details below:

Website URL: https://www.bibebook.com/
License: Creative Commons
Desired ZIM Title: Bibe Books
Desired ZIM Description: 1700 ebooks gratuits
Desired ZIM Icon –png (URL or attach one):
Language (ISO 639-3): fra
Is this a MediaWiki?: no

RavanJAltaie commented 1 year ago

Recipe created: https://farm.openzim.org/recipes/bibebook.com_fr_all

RavanJAltaie commented 1 year ago

@rgaudin the scraper here started 10 days ago, is that normal? Shall I wait longer?

rgaudin commented 1 year ago

You should take a look at the website before creating the recipe.

Search engines should always be excluded (except for client-side search engines) because they generate tons of useless, mostly duplicated pages.
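To illustrate the kind of exclusion meant here, a minimal sketch of a URL filter follows. The regular expression and the function names `is_search_page` and `filter_crawl_queue` are assumptions made for illustration; the real search URLs of a given website have to be checked by hand before writing the pattern.

```python
import re

# Hypothetical pattern: matches common server-side search URLs such as
# "?s=term", "&q=term" or a "/search" path segment. Adjust per website.
SEARCH_URL_PATTERN = re.compile(
    r"[?&](s|q|search|query)=|/search(/|\?|$)",
    re.IGNORECASE,
)


def is_search_page(url: str) -> bool:
    """Return True when the URL looks like a server-side search result page."""
    return SEARCH_URL_PATTERN.search(url) is not None


def filter_crawl_queue(urls):
    """Keep only the URLs that do not look like search result pages."""
    return [url for url in urls if not is_search_page(url)]
```

A pattern of this kind would typically be handed to the crawler's URL-exclusion setting rather than applied in a separate script.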

RavanJAltaie commented 1 year ago

@rgaudin you are correct. It didn't start for 20 days; it will not work out with this scraper. I'll keep an eye out for any websites with a search engine.

benoit74 commented 2 weeks ago

A custom scraper seems pretty easy to write, and the website no longer appears to be updated.

If one goes to http://www.bibebook.com/files/download/zip/packs_classiques/, one can see that the packs have not been updated since 2016.

The whole catalog is available at http://www.bibebook.com/files/download/catalogues/Bibebook-library.uni in XML format.

Scraper can hence simply:

Good way to learn how to write a scraper tbh.
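As a possible starting point for such a catalog-driven scraper, here is a minimal sketch that pulls book links out of an XML document. The catalog's actual schema is not described in this thread, so the code makes no assumption about element names: it walks every element and keeps any attribute value or text that ends with a book-like file extension. The function name `extract_book_links` and the extension list are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Assumption: books are distributed as files with one of these extensions.
BOOK_EXTENSIONS = (".epub", ".pdf", ".mobi")


def extract_book_links(xml_text: str, base_url: str) -> list:
    """Return de-duplicated absolute URLs of book files found in the XML."""
    root = ET.fromstring(xml_text)
    links = []
    seen = set()
    for element in root.iter():
        # Look at attribute values and at the element's own text.
        candidates = list(element.attrib.values())
        if element.text:
            candidates.append(element.text)
        for value in candidates:
            value = value.strip()
            if value.lower().endswith(BOOK_EXTENSIONS):
                absolute = urljoin(base_url, value)
                if absolute not in seen:
                    seen.add(absolute)
                    links.append(absolute)
    return links
```

Once the real structure of the catalog is known, the generic walk could be replaced by targeted lookups on the relevant elements, and each collected link downloaded and added to the ZIM.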