Open desb42 opened 5 years ago
Yeah, that's pretty much correct. I only pull down an image once, and if it changes, I have to pull it down again. Part of the problem is that I didn't extract the image date in the early builds. I think I can do it now, but it's already too late (I have 4+ million images which aren't dated).
Don't know of an easy way except to redownload images. And last time, that took about 30 days (if I remember correctly)...
It could be done by giving all the unknown timestamps a timestamp of now. Any future build then checks the File: timestamp and, if it is later, does an update (pull). (This does not guarantee that the image has changed, only that some edit has occurred to the File: page.)
Retrospectively, over time a refresh of the images could be done
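To make the suggestion concrete, here is a minimal sketch in Python (not xowa's actual code). The names `backfill_unknown` and `needs_pull`, and the idea of holding timestamps in a dict keyed by file name, are assumptions for illustration only.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def backfill_unknown(
    stored: Dict[str, Optional[datetime]], now: datetime
) -> Dict[str, datetime]:
    """Give every image with an unknown (None) timestamp a timestamp of `now`."""
    return {name: (ts if ts is not None else now) for name, ts in stored.items()}


def needs_pull(stored_ts: datetime, file_page_ts: datetime) -> bool:
    """True when the File: page was edited after the stored timestamp.

    This only says that some edit happened to the File: page,
    not that the image itself changed.
    """
    return file_page_ts > stored_ts


if __name__ == "__main__":
    now = datetime(2020, 1, 1, tzinfo=timezone.utc)
    stored = backfill_unknown({"Ishi.jpg": None}, now)
    later_edit = datetime(2021, 6, 1, tzinfo=timezone.utc)
    print(needs_pull(stored["Ishi.jpg"], later_edit))
```

The trade-off is the one already noted: images that changed before the backfill date stay stale until a separate refresh catches them.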
Yup, agreed here too. Added it to #422
I was comparing en.wikipedia.org/wiki/Ursula_K._Le_Guin#Life (xowa http_server and mediawiki)
and was struck by the significant visual difference between the images. Looking on commons.wikimedia.org/wiki/File:Ishi.jpg, there are two versions of the file.
I suspect that xowa is using the 2005 version
I also suspect that given the desire to reuse the images already downloaded during a refresh build, there is no check to see if there is a newer version.
I have tried rummaging around in http://dumps.wikimedia.your.org/commonswiki to see if I can find some specific dump file that might help. The closest I got was stub-meta-history, which gives all the revisions to all the pages. It does not directly show whether a file has changed, just that a change has occurred.
I know the API could do it, but hitting that interface with 1,000,000 requests seems OTT
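For what it's worth, the request count could probably be cut by batching: as far as I know, the MediaWiki `action=query` module accepts several titles per request (up to 50 for ordinary clients), and `prop=imageinfo` with `iiprop=timestamp|sha1` returns the upload timestamp and hash of the current version. A rough sketch that only builds the request URLs and sends nothing; `build_imageinfo_urls` and the batch size are my own assumptions, not anything xowa does:

```python
from typing import List
from urllib.parse import urlencode

API_URL = "https://commons.wikimedia.org/w/api.php"


def build_imageinfo_urls(titles: List[str], batch_size: int = 50) -> List[str]:
    """Build one imageinfo query URL per batch of File: titles."""
    urls = []
    for start in range(0, len(titles), batch_size):
        batch = titles[start:start + batch_size]
        params = {
            "action": "query",
            "prop": "imageinfo",
            "iiprop": "timestamp|sha1",  # upload time and hash of the current version
            "titles": "|".join(batch),   # several titles in a single request
            "format": "json",
        }
        urls.append(API_URL + "?" + urlencode(params))
    return urls


if __name__ == "__main__":
    # Hypothetical file names, only to show the batching.
    titles = ["File:Img%d.jpg" % i for i in range(120)]
    print(len(build_imageinfo_urls(titles)))
```

At 50 titles per request, 1,000,000 files would come to roughly 20,000 requests, which is still a lot but far less OTT.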