openzim / python-scraperlib

Collection of Python code to re-use across Python-based scrapers
GNU General Public License v3.0
17 stars 16 forks source link

Add utility to validate ZIM folder and filename are OK #169

Closed benoit74 closed 2 weeks ago

benoit74 commented 2 weeks ago

Fix #163

codecov[bot] commented 2 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 100.00%. Comparing base (baec9c1) to head (f4176f0).

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #169 +/- ## ========================================= Coverage 100.00% 100.00% ========================================= Files 32 32 Lines 1393 1423 +30 Branches 240 243 +3 ========================================= + Hits 1393 1423 +30 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

benoit74 commented 2 weeks ago

It's OK except some naming for which I suggested alternatives.

I've started to build a cheatsheet to stop forgetting about these namings for exc, fpath, ... : https://github.com/openzim/overview/wiki/Python-naming-conventions

Do not hesitate to contribute, you're probably way more aware than I am about all these

Is this all necessary? I'm fine with it but I would have been satisfied with a single Exception and custom messages. This would fix the issue but with a smaller API extension.

This was "necessary" because in warc2zim we have different exit codes depending on the problem encountered. I tend to consider that multiple exception is not a high price to pay in term of development or maintenance, and that it is important to keep open all potential use cases open in a shared library.

benoit74 commented 2 weeks ago

I forgot to commit my changes, so none of your conversations are fixed in fact. Sorry, I've opened https://github.com/openzim/python-scraperlib/pull/171 with all fixes ...

rgaudin commented 2 weeks ago

I've started to build a cheatsheet to stop forgetting about these namings for exc, fpath, ... : https://github.com/openzim/overview/wiki/Python-naming-conventions

Do not hesitate to contribute, you're probably way more aware than I am about all these

I will, but I only see them when it's not aligned 😉

Is this all necessary? I'm fine with it but I would have been satisfied with a single Exception and custom messages. This would fix the issue but with a smaller API extension.

This was "necessary" because in warc2zim we have different exit codes depending on the problem encountered.

Great.

I tend to consider that multiple exception is not a high price to pay in term of development or maintenance, and that it is important to keep open all potential use cases open in a shared library.

It's the little things that piles up and end-up slowing down the whole project. We've already decided long ago, from experience, that we only implement what's needed. There's always the possibility to extend and modify when need grows or change ; including for such libraries. This is just a general feedback on your general last sentence of course 😉