hugolpz closed 2 years ago
@kanasimi posted:
Hi. You may be interested in:
const file_data_list = await wiki.download('Category:name', { directory: './' });
These ideas are implemented now, and I am using a generator. The code makes as few API calls as possible. You may try it yourself; the code is at https://github.com/kanasimi/CeJS/blob/master/application/net/wiki/page.js#L3893
.download() benchmark
Update attempt:
@kanasimi : Local comparisons without api.php calls are 200+ times faster ! 🙀 😮 👩🏼🚀 🚀 UPDATE IS DAMN FASTER INDEED !!! 🎆
Well, I think this issue is solved.
200 times faster 😻🙉🤤
Timestamp
The API's timestamp property could be compared with the existing local file's timestamp. If the API timestamp is smaller (older) than the local file's timestamp, skip the download. The imageinfo's
"timestamp": "2021-04-25T15:49:00Z"
indeed matches what the file description page indicates. After verification, files with several uploads return by default the timestamp of the last upload (default: 1 revision, the latest).
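The comparison described above could be sketched as follows. This is only an illustration, not the code used in CeJS: the function name `canSkipDownload` and its parameters are hypothetical, and it assumes the local timestamp is the file's modification time in milliseconds (for example from Node's `fs.statSync(path).mtimeMs`).

```javascript
// Hypothetical helper (not part of CeJS): decide whether a local copy can be kept.
// apiTimestamp: ISO 8601 string from imageinfo, e.g. "2021-04-25T15:49:00Z".
// localMtimeMs: local file modification time in milliseconds,
//               or null/undefined when the file does not exist locally.
function canSkipDownload(apiTimestamp, localMtimeMs) {
  // No local copy: the file must be downloaded.
  if (localMtimeMs === null || localMtimeMs === undefined) return false;

  const remoteMs = Date.parse(apiTimestamp);
  // Unparsable timestamp: download again to be safe.
  if (Number.isNaN(remoteMs)) return false;

  // Skip only when the last upload is older than the local copy.
  return remoteMs < localMtimeMs;
}
```

With a real file, the second argument would come from the file system, and a missing file would be passed as `null`. The check relies on the local clock being roughly correct, since the local timestamp is the time of the previous download.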
Q4-related (✅ #51)
When a file already exists locally, it could be skipped faster. Given
a, the time per download,
x, the number of files to download,
b, the initial categorymember query time, estimated at b = 60 sec,
we could get the second attempt (update) duration down to something like 2.7*14+60 = 97.8 sec instead of 540 sec.
Q5-related (✅ #51)
@Poslovitch pointed out that filenames are not enough: some versioning check is needed so that files recently updated on Commons are indeed re-downloaded. (Discord server invitation) Ciencia-Al-Poder pointed out "First of all, you should avoid redownload files that you downloaded on a previous run. The api will return you the file modification/creation time. Use it to check if the file has been updated." (Discord link)
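For reference, the upload time mentioned above can be requested from the MediaWiki API with `prop=imageinfo` and `iiprop=timestamp`. The sketch below only builds the request URL. The function name `buildTimestampQuery` is hypothetical, the Commons endpoint is assumed as the default, and this is not necessarily how `.download()` queries the API internally.

```javascript
// Hypothetical helper: build an api.php URL asking for the last upload
// timestamp of several files in a single request.
// The default endpoint is an assumption (Wikimedia Commons).
function buildTimestampQuery(titles, endpoint = 'https://commons.wikimedia.org/w/api.php') {
  const params = new URLSearchParams({
    action: 'query',
    prop: 'imageinfo',
    iiprop: 'timestamp',      // only the upload timestamp is needed
    titles: titles.join('|'), // the API accepts several titles separated by "|"
    format: 'json',
  });
  return `${endpoint}?${params.toString()}`;
}

const url = buildTimestampQuery(['File:Example.svg', 'File:Example.png']);
console.log(url);
```

Batching several titles into one request keeps the number of api.php calls low, which is the point raised earlier in the thread. The API caps the number of titles per request, so long lists would need to be split into chunks.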