openzim / gutenberg

Scraper for downloading the entire ebook repository of Project Gutenberg
https://download.kiwix.org/zim/gutenberg
GNU General Public License v3.0

New architecture plan #94

Open kelson42 opened 4 years ago

kelson42 commented 4 years ago

This project got its name because it is a Project Gutenberg library importer/exporter. But we are about to make it more generic. This means it won't be only about Gutenberg, but also about other sources/types of books (see #93 for example).

Concretely, we will have:

I see two ways to do that, plus a combination of both:

1 - Finally create a common Python library for all our Python scrapers and put the common part there. Each ebook source (Gutenberg, Wikisource, etc.) would then have its own repo using the common part. We would also have one dedicated repo able to aggregate everything.
2 - Create one big repo (extending this one) able to do all the jobs around ebooks.
3 - We could also do 2 and still create a common library as preparatory work for future, more customised ebook scrapers.
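To make option 1 more concrete, here is a minimal sketch of how the split could look. Everything in it is hypothetical: `Book`, `BookSource`, `GutenbergSource`, `WikisourceSource` and `aggregate` are illustrative names, not existing openZIM code, and the `fetch` methods return placeholder records instead of scraping anything.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Book:
    """Minimal record shared by every source (hypothetical fields)."""
    source: str
    identifier: str
    title: str


class BookSource(ABC):
    """Interface the common library would define (assumption)."""

    name: str

    @abstractmethod
    def fetch(self) -> Iterable[Book]:
        """Yield the books this source provides."""


class GutenbergSource(BookSource):
    """Would live in its own repo and depend on the common library."""
    name = "gutenberg"

    def fetch(self) -> Iterable[Book]:
        # Placeholder record; a real scraper would download and parse a catalog.
        yield Book(self.name, "placeholder-1", "Sample Gutenberg title")


class WikisourceSource(BookSource):
    """Second per-source repo, same interface."""
    name = "wikisource"

    def fetch(self) -> Iterable[Book]:
        # Placeholder record, for illustration only.
        yield Book(self.name, "placeholder-1", "Sample Wikisource title")


def aggregate(sources: Iterable[BookSource]) -> List[Book]:
    """The dedicated aggregating repo, reduced to a single function."""
    return [book for source in sources for book in source.fetch()]


if __name__ == "__main__":
    for book in aggregate([GutenbergSource(), WikisourceSource()]):
        print(book.source, book.title)
```

With option 2, the same classes would simply sit in one repository; the interface boundary is what option 3 would preserve for later.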

Not sure which would be best.

dattaz commented 4 years ago

1 - Finally create a common Python library for all our Python scrapers and put the common part there.

This, but not just for ebook-related scrapers. I think we need a Python lib that is a wrapper around zimwriterfs (at least).
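As a rough idea of what such a wrapper could look like, here is a sketch that only builds the command line and hands it to `subprocess`. The function names are hypothetical, and the option names (`--welcome`, `--favicon`, `--language`, `--title`, `--description`, `--creator`, `--publisher`) are assumed from the zimwriterfs README; they may differ between versions, so check `zimwriterfs --help` before relying on them.

```python
import os
import subprocess
from typing import List, Union

PathLike = Union[str, "os.PathLike[str]"]


def build_zimwriterfs_command(
    html_dir: PathLike,
    zim_path: PathLike,
    *,
    welcome: str,
    favicon: str,
    language: str,
    title: str,
    description: str,
    creator: str,
    publisher: str,
    binary: str = "zimwriterfs",
) -> List[str]:
    """Return the argv list for one zimwriterfs run (flags are assumptions)."""
    return [
        binary,
        f"--welcome={welcome}",
        f"--favicon={favicon}",
        f"--language={language}",
        f"--title={title}",
        f"--description={description}",
        f"--creator={creator}",
        f"--publisher={publisher}",
        os.fspath(html_dir),  # directory holding the scraped HTML
        os.fspath(zim_path),  # ZIM file to create
    ]


def write_zim(html_dir: PathLike, zim_path: PathLike, **metadata: str) -> None:
    """Run zimwriterfs and raise CalledProcessError if it exits non-zero."""
    subprocess.run(
        build_zimwriterfs_command(html_dir, zim_path, **metadata),
        check=True,
    )
```

Keeping the command construction separate from the call makes the wrapper testable without zimwriterfs installed, and passing a list rather than a shell string avoids quoting problems in titles and descriptions.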

kelson42 commented 4 years ago

This, but not just for ebook-related scrapers.

Definitely, video management is an even better candidate, as we already have two different scrapers needing it.

I think we need a Python lib that is a wrapper around zimwriterfs (at least).

"Yes" and "no"... If someone could, at the same time, code python-libzim (like node-libzim) so that we can write ZIM files on the fly (that is, without going through the filesystem), that would be even better.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.
