openzim / cms

ZIM file Publishing Platform
https://cms.openzim.org
GNU General Public License v3.0

Title reconciliation script #61

Closed fadiga closed 2 years ago

fadiga commented 2 years ago

Fixes #38

codecov[bot] commented 2 years ago

Codecov Report

Merging #61 (5eb9967) into main (660b043) will not change coverage. The diff coverage is 100.00%.

Warning: current head 5eb9967 differs from the pull request's most recent head a5323c0. Consider uploading reports for commit a5323c0 to get more accurate results.

@@            Coverage Diff             @@
##              main       #61    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files            8        10     +2     
  Lines          244       402   +158     
==========================================
+ Hits           244       402   +158     

| Impacted Files | Coverage Δ |
|---|---|
| backend/src/backend/constants.py | 100.00% <100.00%> (ø) |
| backend/src/backend/main.py | 100.00% <100.00%> (ø) |
| backend/src/backend/models.py | 100.00% <100.00%> (ø) |
| backend/src/backend/routes/books.py | 100.00% <100.00%> (ø) |
| backend/src/backend/routes/languages.py | 100.00% <100.00%> (ø) |
| backend/src/backend/routes/tags.py | 100.00% <100.00%> (ø) |
| backend/src/backend/routes/titles.py | 100.00% <100.00%> (ø) |
| backend/src/backend/schemas.py | 100.00% <100.00%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 660b043...a5323c0.

rgaudin commented 2 years ago

There is no detail about how the script should be run or what to expect from running it.

When I run the script, I get a lot of lines like `Name is not remove list.remove(x): x not in list`. Is this expected? What does it mean?
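
For context, the second half of that message matches the text of the `ValueError` that Python's `list.remove` raises when the value is absent, so the script is presumably catching that exception and printing it. A minimal sketch of handling the case explicitly, where `remove_if_present` and the sample names are hypothetical and not taken from the PR:

```python
def remove_if_present(items, value):
    """Remove value from items in place.

    Returns True if the value was removed and False if it was absent,
    instead of letting list.remove raise ValueError.
    """
    try:
        items.remove(value)
    except ValueError:
        # list.remove raises ValueError when the value is not in the list
        return False
    return True


# Hypothetical sample data, for illustration only
names = ["wikipedia_en_all", "wiktionary_fr_all"]
was_removed = remove_if_present(names, "gutenberg_de_all")
```

Returning a boolean lets the caller decide whether a missing name deserves a log line, a counter in the stats, or nothing at all.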

Please use more descriptive names in the stats table, since it is important to understand exactly what each number means.

I see the CSV files and those look useful. Two comments:

kelson42 commented 2 years ago

I guess you need this PCRE regexp https://github.com/kiwix/maintenance/blob/master/library-docker/bin/manageContentRepository.pl#L109

You should not use it to extract info the way the Perl script does, but using it to validate the filename is OK.
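
A rough sketch of that distinction in Python: the pattern is used only as a yes/no check on the filename, and no capture groups are read back. The pattern below is a simplified, hypothetical stand-in and not the regexp from the linked Perl script; `is_safe_filename` is likewise an illustrative name.

```python
import re

# Hypothetical, simplified pattern (an assumption, NOT the linked Perl regexp):
# lowercase name parts, then a _YYYY-MM period suffix and the .zim extension.
SAFE_FILENAME = re.compile(r"[a-z0-9][a-z0-9._-]*_\d{4}-\d{2}\.zim")


def is_safe_filename(filename):
    """Return True if the whole filename matches the expected shape.

    The regexp only validates; nothing is extracted from the match.
    """
    return SAFE_FILENAME.fullmatch(filename) is not None
```

With this pattern, `is_safe_filename("wikipedia_en_all_2022-05.zim")` passes, while a path such as `../etc/passwd` or a name without the period suffix is rejected.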