openzim / gutenberg

Scraper for downloading the entire ebooks repository of project Gutenberg
https://download.kiwix.org/zim/gutenberg
GNU General Public License v3.0
126 stars 37 forks source link

Use zimscraperlib (in place of zimwriterfs) + upgrade to Python 3.11 + dependencies #157

Closed benoit74 closed 1 year ago

benoit74 commented 1 year ago

Fix #136 Fix #156

benoit74 commented 1 year ago

Some questions / remarks:

kelson42 commented 1 year ago
  • I did not achieved to find the documentation regarding the standard minimal Python code guideline (black + isort + flake with 88 chars max length if I did it appropriately in iFixit)

It’s here https://github.com/openzim/overview/wiki but clearly not complete.

benoit74 commented 1 year ago

Thank you for all these comments, I will work on it later this week, seems pretty easy to fix.

benoit74 commented 1 year ago

@rgaudin, I fixed all inline comments except the one regarding the SQL timeout, see proposition above.

I also fixed following issues (well spotted 👏):

I don't achieve to reproduce this issue (could you provide more info?):

Regarding the link to the PDF which should not be displayed, I confirm this seems not linked to current PR. Issue is present in many places: cover article (gutenbergtozim/templates/cover_article.html), HTML version of a book (gutenbergtozim/templates/book_infobox.html) and bookshelves (2 times in gutenbergtozim/templates/js/tools.js).

kelson42 commented 1 year ago

@benoit74 Any chance to fix the CodeFactor issues? Usually the only one which are tolerated are the "complex method" ones (which would need a massive revamping and often are older than the PR itself).

rgaudin commented 1 year ago

@benoit74 Any chance to fix the CodeFactor issues? Usually the only one which are tolerated are the "complex method" ones (which would need a massive revamping and often are older than the PR itself).

@kelson42 the codefactor issues are from the legacy code. They show up now because the file was touched (to fix the links) but were not introduced now.

benoit74 commented 1 year ago

@benoit74 Any chance to fix the CodeFactor issues? Usually the only one which are tolerated are the "complex method" ones (which would need a massive revamping and often are older than the PR itself).

@kelson42 the codefactor issues are from the legacy code. They show up now because the file was touched (to fix the links) but were not introduced now.

Yep, I opened an issue to fix this, I would clearly prefer to have this PR merged first. This PR has changed many parts of the code, so delaying it too much is taking a risk of merge conflicts later. And I'm clearly not an expert in JS code, and this tools.js script seems a bit scary to fix without spending some time to understand all its purpose and confirm it is still working as expected afterwards.

rgaudin commented 1 year ago

Good to merge ;I'll do it tomorrow; thanks

benoit74 commented 1 year ago

BTW, I also opened this issue as discussed: https://github.com/openzim/gutenberg/issues/159

benoit74 commented 1 year ago

Good to merge ;I'll do it tomorrow; thanks

I can do it if you prefer, @kelson42 granted me sufficient rights. Do you mind if I squash all commits? Are you ok with fast-forward merges? (these are my preferences, but I clearly don't mind to follow other conventions)

rgaudin commented 1 year ago

Please do. Our convention is to use a merge commit though