openzim / youtube

Create a ZIM file from a Youtube channel/username/playlist
GNU General Public License v3.0

Add items to the ZIM "on the fly" #262

Closed dan-niles closed 1 month ago

dan-niles commented 1 month ago

Close #204 Close #209

Changes in this PR:

codecov[bot] commented 1 month ago

Codecov Report

Attention: Patch coverage is 0% with 115 lines in your changes missing coverage. Please review.

Project coverage is 1.58%. Comparing base (5ae8960) to head (1815f28).

| Files | Patch % | Lines |
|---|---|---|
| scraper/src/youtube2zim/scraper.py | 0.00% | 113 Missing :warning: |
| scraper/src/youtube2zim/schemas.py | 0.00% | 2 Missing :warning: |
Additional details and impacted files

```diff
@@           Coverage Diff            @@
##            main    #262      +/-   ##
========================================
- Coverage   1.63%   1.58%   -0.05%
========================================
  Files         11      11
  Lines       1040    1070      +30
  Branches     156     160       +4
========================================
  Hits          17      17
- Misses      1023    1053      +30
```

benoit74 commented 1 month ago

Note, so that it is clear for our future selves: we removed the --no-zim argument because files are now added to the ZIM on the fly, so it no longer makes sense to ask not to create the ZIM.
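
To illustrate the "on the fly" idea for future readers, here is a minimal sketch. It is not the scraper's code: it uses the standard-library `zipfile` as a stand-in for the ZIM creator, and `fake_download`, `build_archive`, and the `videos/` path prefix are hypothetical names. Deleting each local file right after it is added is also an assumption about how the temporary data is handled.

```python
import tempfile
import zipfile
from pathlib import Path


def fake_download(video_id: str, tmp_dir: Path) -> Path:
    """Hypothetical stand-in for a real download: writes a small file."""
    path = tmp_dir / f"{video_id}.webm"
    path.write_bytes(f"video data for {video_id}".encode())
    return path


def build_archive(video_ids, archive_path: Path, tmp_dir: Path) -> None:
    # The archive is opened once and stays open while items are produced.
    with zipfile.ZipFile(archive_path, "w") as archive:
        for video_id in video_ids:
            local_file = fake_download(video_id, tmp_dir)
            # Add the item as soon as it exists...
            archive.write(local_file, arcname=f"videos/{local_file.name}")
            # ...then drop the local copy (assumed), so nothing piles up.
            local_file.unlink()


with tempfile.TemporaryDirectory() as work:
    work_dir = Path(work)
    tmp_dir = work_dir / "tmp"
    tmp_dir.mkdir()
    archive_path = work_dir / "out.zip"
    build_archive(["a", "b"], archive_path, tmp_dir)
    with zipfile.ZipFile(archive_path) as archive:
        names = sorted(archive.namelist())
    leftovers = list(tmp_dir.iterdir())
```

Because the output file is being written throughout the run, there is no separate "build the archive" phase left to skip, which is why an option like --no-zim stops making sense in this model.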

benoit74 commented 1 month ago

Oh, and by the way, we should also remove keep_build_dir / --keep: the build_dir content is no longer worth keeping.

And the documentation for the --tmp-dir argument should be revisited; it is not really accurate anymore.

dan-niles commented 1 month ago

> I recommend moving as many steps as possible before the long download_video_files step. I think that download_authors_branding, add_main_channel_branding_to_zim and add_zimui are good candidates.

I have moved add_main_channel_branding_to_zim and add_zimui to run before the download step in 869b499. However, download_authors_branding cannot be moved, since it depends on get_videos_authors_info running before it.

get_videos_authors_info fetches the information about all video authors, and download_authors_branding then downloads the branding for those authors.
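
The ordering constraints discussed here can be sketched as a small dependency graph. The step names are the ones from this thread; the graph itself is an assumption made for illustration and is not taken from scraper.py. In particular, the edges into download_video_files encode an ordering preference (run the cheap steps first), not a data dependency.

```python
from graphlib import TopologicalSorter

# Maps each step to the steps that must run before it (assumed edges).
deps = {
    # Real dependency stated in the thread: the authors must be known
    # before their branding can be downloaded.
    "download_authors_branding": {"get_videos_authors_info"},
    # Ordering preference only: quick steps go before the long download.
    "download_video_files": {
        "add_main_channel_branding_to_zim",
        "add_zimui",
    },
}

# static_order() yields every step after all of its predecessors.
order = list(TopologicalSorter(deps).static_order())
```

Any order produced this way keeps get_videos_authors_info ahead of download_authors_branding, which is the reason that step could not simply be moved to the front.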

dan-niles commented 1 month ago

The --keep CLI argument was removed in 57f5a5d. I also updated the documentation for --tmp-dir in the same commit.
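
For reference, the effect of dropping the flag on the command line can be sketched with `argparse`. This is an illustrative parser only, not the project's entrypoint: the `prog` name, the default, and the help wording are assumptions; only the flag names --tmp-dir, --keep and --no-zim come from this thread.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="scraper-sketch")
    # Help text is illustrative, not the project's documentation.
    parser.add_argument(
        "--tmp-dir",
        dest="tmp_dir",
        default=None,
        help="directory for temporary files (illustrative wording)",
    )
    # No --keep and no --no-zim: both are removed in this PR.
    return parser


args = build_parser().parse_args(["--tmp-dir", "/data/tmp"])

try:
    build_parser().parse_args(["--keep"])
    keep_rejected = False
except SystemExit:
    # argparse exits with a usage error on unknown options.
    keep_rejected = True
```

With a parser shaped like this, existing invocations that still pass --keep fail fast with a usage error instead of being silently ignored.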