kiwix / operations

Kiwix Kubernetes Cluster
http://charts.k8s.kiwix.org/

How should we run librechef? #262

Open benoit74 opened 1 month ago

benoit74 commented 1 month ago

This is a very open question: how do we expect to run librechef?

librechef is the ricecooker software responsible for creating and updating Kolibri channels for libretexts.

Currently librechef is a Python project, not released, not published to PyPI, not published to Docker.

Since it is now almost clear that it will become a part of our toolchain to update libretext ZIMs, how do we expect to run this?

The "difficulty" is that we probably want to run librechef just before running the kolibri scraper, so should we package both in a single Docker image? I feel like that would be a bit weird since they are two very different projects.

The same difficulty applies to other Kolibri channels, although those are supposed to be updated by Learning Equality, even if it does not seem to be done regularly at all (e.g. we have been waiting for African Story Books channel updates for months, and it is still not done AFAIK).

kelson42 commented 1 month ago

The approach we follow with SE (downloading the archives in a dedicated and totally different process) seems the most appropriate to me.

benoit74 commented 1 month ago

IMO this is very different because for SE we are only updating a cache of SE dumps. The process is run daily and is mostly quick (usually very quick because there is no update; otherwise it only moves files, so it is a matter of a few hours, limited only by bandwidth, with mostly no CPU / memory impact and limited disk impact), and it only targets refreshing the cache.

Here we are talking about a content update, which cannot be run daily, needs to be done "in sync" with the kolibri recipe on the farm, and will consume CPU, memory and disk. So I don't really see how comparable this is.

Somehow I can rephrase my question as "how do we sync librechef to update kolibri channel with kolibri to update the ZIM?".
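
One way to picture the synchronization being asked about is a small wrapper that runs the channel update and only then the ZIM build, each as its own process, so the two projects would not need to share a Docker image. This is a minimal sketch, assuming both tools can be invoked as commands. `run_in_order` is a hypothetical helper, and the real librechef and kolibri scraper invocations are not specified in this thread, so none are shown:

```python
import subprocess
import sys
from typing import Sequence


def run_in_order(steps: Sequence[Sequence[str]]) -> int:
    """Run each command in order and stop at the first failure.

    Returns the number of steps that completed successfully.
    `steps` is a list of argv-style commands (hypothetical here: the
    actual librechef / kolibri scraper commands are not known).
    """
    completed = 0
    for command in steps:
        result = subprocess.run(list(command))
        if result.returncode != 0:
            # A failed channel update must not be followed by a ZIM build.
            break
        completed += 1
    return completed


# Placeholder steps standing in for "update the channel" then "build the ZIM".
EXAMPLE_STEPS = [
    [sys.executable, "-c", "print('step 1: update the Kolibri channel')"],
    [sys.executable, "-c", "print('step 2: build the ZIM')"],
]
```

Calling `run_in_order(EXAMPLE_STEPS)` runs the two placeholder steps one after the other. Whether such a wrapper would live in a recipe, a separate job, or somewhere else is exactly the open question here.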

Anyway we might not need to solve this issue in the end since we are currently following the path of creating a custom scraper, dropping librechef.