Closed benoit74 closed 1 week ago
LGTM but it needs to be clearly identified as The TED Zim To Bind Them All or somesuch so that people don't download that along with the rest.
@RavanJAltaie thoughts?
I propose:
I agree with @benoit74's suggestion, but truly I don't know why we should do this. This will be a very big file with ultimately few people who will want/be able to download it. So what's the point?
> I would personally prefer to use such a ZIM (even if quite big, probably in the order of 100GB) rather than choosing which topic I prefer and having videos duplicated between the ZIMs
I am not convinced either but then the cost is minimal and in a long-tail scenario I guess we'll always have users going for one or the other.
> This will be a very big file with ultimately few people who will want/be able to download it. So what's the point?
I don't think it will be that rarely used. Let's say that I'm a target user, interested in science, technology, society and global issues. I'm fairly sure (ready to bet 10 bucks 😅) that the sum of the sizes of the science, technology, society and global issues ZIMs is going to be larger than the size of this "all" ZIM, because lots of videos are redundant. This gets even worse if I want a few more topics, because there are some niche topics which are not totally covered by the topics already mentioned but still interest me, like innovation and climate change. I'm not even speaking about the fact that having all these "small" ZIMs is a pain in terms of search / video navigation compared to one single big ZIM.
Recipe created and requested: https://farm.openzim.org/recipes/ted_topic_all
Task failed due to https://github.com/openzim/ted/issues/213
I passed the list of all topics "manually".
It made me realize there are 5 new topics (generosity, wildlife, reproductive health, artificial intelligence, tech). I've created the corresponding recipes and they've started automatically.
> It made me realize
I take it you realized it because these names struck you as new, but there is no automatized way to know when a new topic appears?
> I take it you realized it because these names struck you as new, but there is no automatized way to know when a new topic appears?
Just the total number of recipes was displayed and it said "360" where I knew it was supposed to be "355".
Re-creating the missing recipes is mostly automated; I "just" need to run a script on my machine. I still prefer not to fully automate this, since it is quite important to check before actually requesting the new topics, in case the tool goes wild for instance.
I propose to set up a workflow that creates a quarterly issue to update the list of recipes. Running this takes me between 5 minutes (if nothing needs to be done) and 15 minutes (if new recipes need to be created and checked). Would that be OK?
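For illustration, the check that precedes re-creating recipes (which topics exist upstream but have no recipe yet) can be sketched as a set difference. This is not the script mentioned in this thread: `find_new_topics` and the sample topic lists are hypothetical, and how the two lists are actually fetched is left out.

```python
def find_new_topics(site_topics, recipe_topics):
    """Return topics that exist on the site but have no recipe yet, sorted."""
    # Normalize case and whitespace so "Science" and "science" compare equal.
    known = {t.strip().lower() for t in recipe_topics}
    return sorted({t.strip().lower() for t in site_topics} - known)


# Hypothetical sample data, for illustration only.
recipe_topics = ["science", "technology", "society"]
site_topics = ["Science", "technology", "society", "wildlife", "generosity"]

new_topics = find_new_topics(site_topics, recipe_topics)
print(new_topics)
```

The result would still need a manual review before any recipe is requested, which matches the preference above for not fully automating this step.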
LGTM
LGTM as well.
ZIM ready at https://library.kiwix.org/#lang=&q=all+ted
File size is a little bit less than 79GB. Interesting since total size of all other TED ZIMs is 419GiB.
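As a rough sanity check of that comparison, the two figures can be put on the same unit before dividing. The sketch below assumes "79GB" is a decimal figure and "419GiB" a binary one, as written; if both were in fact reported in the same unit, the ratio would shift slightly.

```python
GB = 10**9    # decimal gigabyte
GIB = 2**30   # binary gibibyte

all_zim_bytes = 79 * GB        # "a little bit less than 79GB", used as an upper bound
topic_zims_bytes = 419 * GIB   # total size of all other TED ZIMs

# Share of the combined per-topic size that the single "all" ZIM represents.
ratio = all_zim_bytes / topic_zims_bytes
print(f"'all' ZIM is at most about {ratio:.1%} of the combined per-topic ZIMs")
```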
> File size is a little bit less than 79GB
Do we know how many videos are in there? I suspect the reduced size tells us a lot about how many duplicates are used over several topical lists.
Ah, I see 165 pages with 40 videos each = 6600 TEDs
This can also be checked with the "Counter" ZIM metadata
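A minimal sketch of reading such a value, assuming the "Counter" metadata is a string of `mimetype=count` pairs separated by `;` (the layout used by the openZIM metadata convention). The function name and the sample string are made up for illustration and are not taken from the actual TED ZIM; mimetypes that themselves contain `;` parameters are not handled.

```python
def parse_counter(counter):
    """Parse a Counter string such as 'text/html=10;video/webm=3' into a dict."""
    counts = {}
    for item in counter.split(";"):
        if not item:
            continue  # tolerate empty input or a trailing ';'
        # Split on the last '=' so the count is always the trailing part.
        mimetype, _, count = item.rpartition("=")
        counts[mimetype] = int(count)
    return counts


# Made-up value for illustration only.
sample = "text/html=120;video/webm=40;image/webp=45"
counts = parse_counter(sample)

# Sum the entries whose mimetype is a video type.
video_entries = sum(n for mime, n in counts.items() if mime.startswith("video/"))
print(video_entries)
```

Note that the number of video entries is not necessarily the number of talks, for instance if a scraper stores several encodings per video.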
We are close to publishing many TED ZIMs with videos filtered per topic (i.e. we will have one ZIM per TED topic).
What about creating one "all TED videos" ZIM, where we would fetch TED videos of all topics, no matter the associated topic?
I would personally prefer to use such a ZIM (even if quite big, probably in the order of 100GB) rather than choosing which topic I prefer and having videos duplicated between the ZIMs (because the topics I like are in fact close to one another and multiple videos are hence present in multiple topics / ZIMs I've chosen).