openzim / ted

Provide the best of TED.com for offline usage!
https://download.kiwix.org/zim/ted/
GNU General Public License v3.0

add support for grabbing all videos if languages is not set #174

Closed elfkuzco closed 3 months ago

elfkuzco commented 3 months ago

Fix #171

Rationale

When --languages is omitted from the command-line arguments, the scraper should look up every language in which each video is available and add all of them to the videos list.
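The intended behaviour can be sketched roughly as follows. This is only an illustration, not the code in src/ted2zim/scraper.py: the function names `select_languages` and `build_videos_list` and the shape of the video dictionaries are hypothetical, chosen just to show the "no filter means all languages" rule.

```python
from typing import Iterable, Optional


def select_languages(available: Iterable[str], requested: Optional[list]) -> list:
    """Return the language codes to keep for one video (hypothetical helper)."""
    available = list(available)
    if not requested:
        # --languages omitted: keep every language the video is available in
        return available
    wanted = set(requested)
    # --languages given: keep only the requested languages the video offers
    return [code for code in available if code in wanted]


def build_videos_list(videos: list, requested: Optional[list] = None) -> list:
    """Build one entry per (video, language) pair (assumed data shape)."""
    entries = []
    for video in videos:
        for code in select_languages(video["languages"], requested):
            entries.append({"id": video["id"], "language": code})
    return entries
```

With this sketch, calling `build_videos_list(videos)` without a second argument yields an entry for every available language of every video, while passing a list such as `["fr"]` restricts the result to that language.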

Changes

codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 0%, with 78 lines in your changes missing coverage. Please review.

Project coverage is 0.00%. Comparing base (d57d160) to head (fd17a7c). Report is 3 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| src/ted2zim/scraper.py | 0.00% | 78 Missing :warning: |
Additional details and impacted files

```diff
@@           Coverage Diff           @@
##            main    #174    +/-   ##
=====================================
  Coverage   0.00%   0.00%
=====================================
  Files          7       7
  Lines       1011    1055    +44
  Branches     215     228    +13
=====================================
- Misses      1011    1055    +44
```


elfkuzco commented 3 months ago

@benoit74, I pulled the recent changes from the upstream branch, and when I try to start a new shell with hatch shell, I get a dependency resolution conflict. I am using Python 3.12+ because other versions are not compatible when I run hatch shell. (Screenshot: Screenshot_20240326_135622)

elfkuzco commented 3 months ago

I think the issue with hatch shell not resolving dependencies has to do with the Python version, because everything was working fine with 3.11+.

benoit74 commented 3 months ago

CI is passing, so the configuration is OK and the problem is definitely linked to your local environment.

It happens to me as well sometimes. The solution I'm using for now is to clean up the hatch venv (delete the whole venv) and restart from a fresh one.

elfkuzco commented 3 months ago

Here are also JSON files of the videos produced with different flags supplied. The names of the files describe the flags supplied:

- playlist_without_languages.json
- playlist_with_languages_subtitles_all.json
- playlist_with_languages.json
- topics_without_languages.json
- topics_with_languages_subtitles_all.json
- topics_with_languages.json

benoit74 commented 3 months ago

Thank you, looks promising. Unfortunately I won't be able to review it today: I have other matters to finish, and starting tomorrow I'm on holiday until Thursday next week. Do not expect quick feedback, sorry about that. I promise I will have a look at it next week.

benoit74 commented 3 months ago

I forgot to force-push my changes with the squash and CHANGELOG ... I've canceled the merge manually with a force-push and opened https://github.com/openzim/ted/pull/176 to merge properly ...

elfkuzco commented 3 months ago

Okay