Open lidel opened 5 years ago
@lidel For the updates, we are starting to advertise and use our OPDS feed (which works like an Atom feed). I would recommend using that in the future. See https://wiki.kiwix.org/wiki/OPDS (still in beta).
@kelson42 that sounds very useful! What would be a valid query to return the latest snapshot of the English or Turkish wiki?
Tried https://library.kiwix.org/catalog/search?lang=en&tag=wikipedia
but it points at an old snapshot: wikipedia_en_wp1-0.8_orig_2010-12.zim
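A minimal sketch of how such a query could be built and its result ordered, assuming the feed is plain Atom as described above. The endpoint and the `lang` and `tag` parameters come from the URL in this comment; the helper names and the sample feed are hypothetical, and the real catalog output may carry different fields.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # standard Atom namespace
SEARCH_URL = "https://library.kiwix.org/catalog/search"  # endpoint quoted in the thread


def build_query(**filters):
    """Return the catalog search URL for the given filters (e.g. lang, tag)."""
    return f"{SEARCH_URL}?{urlencode(filters)}"


def entries_newest_first(feed_xml):
    """Parse an Atom feed and return (title, updated) pairs, newest first."""
    root = ET.fromstring(feed_xml)
    entries = [
        (e.findtext(f"{ATOM}title"), e.findtext(f"{ATOM}updated"))
        for e in root.iter(f"{ATOM}entry")
    ]
    # Timestamps in one consistent ISO 8601 format sort correctly as strings;
    # entries without <updated> sort last.
    return sorted(entries, key=lambda pair: pair[1] or "", reverse=True)


# Hypothetical sample, not real catalog output.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>Old snapshot</title><updated>2010-12-01T00:00:00Z</updated></entry>
  <entry><title>New snapshot</title><updated>2020-10-01T00:00:00Z</updated></entry>
</feed>"""

print(build_query(lang="en", tag="wikipedia"))
print(entries_newest_first(SAMPLE_FEED)[0])
```

Sorting on the client side only helps if the feed actually contains the recent snapshot, which is the open question here.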
@lidel This feed delivers the most recent ZIM files... but a few of them have simply not been regenerated recently. Let me know if you find a recent file which is not in it.
@kelson42 I think things like https://github.com/kiwix/kiwix-tools/issues/231 and https://github.com/kiwix/kiwix-tools/issues/316 need to land before we can use the OPDS feed.
Right now, I was unable to come up with filters to get the latest English Wikipedia with pictures and without video (wikipedia_en_all_novid)
wikipedia_en_wp1-0.8_orig_2010-12
Looking at https://download.kiwix.org/zim/wikipedia/ directly sounds like a more robust solution atm.
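As a rough sketch of what picking the latest snapshot from that directory could look like, assuming file names follow the `<prefix>_<YYYY>-<MM>.zim` pattern seen in this thread. The function name and the sample listing (including its dates) are hypothetical; fetching and scraping the actual directory index is left out.

```python
import re

# Matches "<prefix>_<YYYY>-<MM>.zim", the naming seen in this thread.
SNAPSHOT_RE = re.compile(r"^(?P<prefix>.+)_(?P<year>\d{4})-(?P<month>\d{2})\.zim$")


def latest_snapshot(filenames, prefix):
    """Return the newest file name for `prefix`, or None if nothing matches."""
    dated = []
    for name in filenames:
        m = SNAPSHOT_RE.match(name)
        if m and m.group("prefix") == prefix:
            # Compare by (year, month) so that ordering is numeric.
            dated.append(((int(m.group("year")), int(m.group("month"))), name))
    return max(dated)[1] if dated else None


# Hypothetical listing; the dates are made up for illustration.
LISTING = [
    "wikipedia_en_all_novid_2019-12.zim",
    "wikipedia_en_all_novid_2020-03.zim",
    "wikipedia_en_wp1-0.8_orig_2010-12.zim",
    "wikipedia_tr_all_novid_2020-01.zim",
]

print(latest_snapshot(LISTING, "wikipedia_en_all_novid"))
```

Matching on the exact prefix keeps unrelated builds such as wikipedia_en_wp1-0.8_orig out of the result.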
Right now, I was unable to come up with filters to get the latest English wikipedia with pictures and without video (wikipedia_en_all_novid)
In my solution I'm using a dynamic parser, which should solve that
@lidel Looks like you have pretty well identified what needs to be done. An alternative would be to rely on https://download.kiwix.org/library/library_zim.xml (it is not dynamic like the OPDS feed, but easier to parse than HTML)... and more robust.
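A possible way to read that file, sketched under the assumption that it is a flat list of `<book>` elements with `name`, `flavour`, `date` and `url` attributes. That layout is an assumption and should be checked against the real file; the sample below and its URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt; the attribute names are an assumption about the
# file's layout, not taken from the real library_zim.xml.
SAMPLE_LIBRARY = """<library>
  <book name="wikipedia_en_all" flavour="novid" date="2019-12-01" url="https://example.org/a.zim.meta4"/>
  <book name="wikipedia_en_all" flavour="novid" date="2020-03-01" url="https://example.org/b.zim.meta4"/>
  <book name="wikipedia_tr_all" flavour="novid" date="2020-01-01" url="https://example.org/c.zim.meta4"/>
</library>"""


def newest_book(library_xml, name, flavour):
    """Return the attributes of the newest matching <book>, or None."""
    root = ET.fromstring(library_xml)
    matches = [
        book.attrib
        for book in root.iter("book")
        if book.get("name") == name and book.get("flavour") == flavour
    ]
    # Dates in YYYY-MM-DD form compare correctly as plain strings.
    return max(matches, key=lambda b: b.get("date", ""), default=None)


book = newest_book(SAMPLE_LIBRARY, "wikipedia_en_all", "novid")
print(book["url"] if book else "no match")
```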
We need to be working off MWDumper.pl and the XML bz2 dataset from Wikipedia ... I will do an export to static HTML and collect the required code again, it's "known working".
I'd like to see more functionality here, we need "search and editing". Afaik there is not yet a good marriage of git or wiki and IPFS and it should be core to ... us.
In the long run, we want to introduce CI/CD automation that does something along these lines:
[...] master with the new CID.
Then, a maintainer would review the PR and merge it. Updating the manifest in master would trigger an update of DNSLink under <lang>.wikipedia-on-ipfs.org, propagating the change to the collaborative cluster etc.
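The DNSLink step could be sketched roughly as below. DNSLink itself is a TXT record on the `_dnslink.` subdomain whose value is `dnslink=/ipfs/<cid>`; everything else here is an assumption: the manifest is modelled as a plain mapping from language code to CID, the CIDs are placeholders, and pushing the records to a DNS provider is not shown.

```python
BASE_DOMAIN = "wikipedia-on-ipfs.org"  # from the <lang>.wikipedia-on-ipfs.org pattern


def dnslink_record(domain, cid):
    """Return the (name, value) pair of the DNSLink TXT record for `domain`."""
    return (f"_dnslink.{domain}", f"dnslink=/ipfs/{cid}")


def records_for_manifest(manifest):
    """Build one DNSLink TXT record per language in a {lang: cid} manifest."""
    return [
        dnslink_record(f"{lang}.{BASE_DOMAIN}", cid)
        for lang, cid in sorted(manifest.items())
    ]


# Hypothetical manifest with placeholder CIDs, not real content identifiers.
MANIFEST = {"tr": "bafyEXAMPLEtr", "en": "bafyEXAMPLEen"}

for name, value in records_for_manifest(MANIFEST):
    print(name, "TXT", value)
```

A CI job triggered by a merge to master could compute these records from the updated manifest and hand them to whatever DNS API the project uses.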