Open nicolas-raoul opened 6 years ago
I have this problem often.
Using imker-gui.jar (also v16.09.13) for the first time, I can confirm this observation. To ease replication and bug-fixing, this was my procedure:
SVG Deutsche Einheitskurzschrift
close to the bottom here, which is part of a collection of 20.6k entries (root entry)

    at wiki.Wiki.fetch(Unknown Source)
    at wiki.Wiki.getImage(Unknown Source)
    at wiki.Wiki.getImage(Unknown Source)
    at app.ImkerBase$1.fetch(Unknown Source)
    at app.App.attemptFetch(Unknown Source)
    at app.ImkerBase.downloadLoop(Unknown Source)
    at app.ImkerGUI$4.doInBackground(Unknown Source)
    at app.ImkerGUI$4.doInBackground(Unknown Source)
    at java.desktop/javax.swing.SwingWorker$1.call(SwingWorker.java:304)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.desktop/javax.swing.SwingWorker.run(SwingWorker.java:343)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)

Thus, a feature suggestion: let Imker write a persistent list of the files to download, which (a) the program may use to resume if, for whatever reason, the batch was not completed, and which (b) the user may explicitly point to later, e.g. a quarter of a year on, to collect only the media added to the same category since the last run, lowering the traffic necessary.
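To make the suggestion concrete, here is a minimal sketch in Python (not Imker's actual code, which is Java). The manifest format (a JSON object mapping file title to a "done" flag) and all function names are assumptions made up for illustration.

```python
import json
from pathlib import Path


def load_manifest(path):
    """Return the saved {title: done} mapping, or an empty one if no file exists."""
    p = Path(path)
    if not p.exists():
        return {}
    return json.loads(p.read_text(encoding="utf-8"))


def save_manifest(path, manifest):
    """Write the mapping as UTF-8 JSON so umlauts stay readable in the file."""
    text = json.dumps(manifest, ensure_ascii=False, indent=2, sort_keys=True)
    Path(path).write_text(text, encoding="utf-8")


def plan_downloads(manifest, current_titles):
    """Titles still to fetch: new in the category, or listed but not yet done."""
    return [t for t in current_titles if not manifest.get(t, False)]


def mark_done(manifest, title):
    """Record a completed download (hypothetical helper)."""
    manifest[title] = True
```

Case (a), resuming an interrupted batch, and case (b), a later incremental run, both reduce to calling `plan_downloads` with the saved manifest and the category's current listing; only titles that are not yet marked done would be fetched.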
added:
With Wikimedia's own list generator, such a listing can be created (and even split into multiple files). Character encoding (e.g., umlauts) may occasionally be an issue; the files Imker downloaded did not show one, though.
@nicolas-raoul Translating «Kategorie» to "category" and «Anzahl der Listen» to "number of lists to generate" is one thing. While unlikely to be exhaustive, the short list mentioned taught me the following substitution rules between the «safe for internet / pure ASCII (maybe even 7 bit?)» form and the special characters uploaders may use in file names.
|-------------------------------+-----------------------------------------|
| code -> substitute (keyed as) | example |
|-------------------------------+-----------------------------------------|
| %C3%A4 -> ä ("a) | Kläranlage ([water] purification plant) |
| %C3%B6 -> ö ("o) | öffentlich (public, adjective) |
| %C3%BC -> ü ("u) | Bürger (citizen) |
| %C3%9F -> ß ("s, or Alt + s) | Kuß (kiss, noun) |
| %C3%AE -> î (^i) | maître (master, noun) |
| %C3%A9 -> é ('e) | école (school) |
|-------------------------------+-----------------------------------------|
| %C3%84 -> Ä ("A)              | Ärmelkanal (the English Channel)        |
| %C3%96 -> Ö ("O) | Öffentlichkeit (public, noun) |
| %C3%9C -> Ü ("U) | Überraschung (surprise, noun) |
|-------------------------------+-----------------------------------------|
| %2C -> , | (comma) |
| %21 -> ! | (exclamation mark) |
| %27 -> ' | (apostrophe) |
| %28 -> ( | (opening parenthesis) |
| %29 -> ) | (closing parenthesis) |
|-------------------------------+-----------------------------------------|
This gives a good reason to watch out for proper character encoding. And the third group (again, comme le 3e groupe) is the trickier one; I did not expect those characters to be permitted in file names.
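The substitutions in the table are what percent-encoding of UTF-8 bytes produces, so they need not be applied by hand. A small Python sketch using the standard library's `urllib.parse`; the sample file name is made up for illustration:

```python
from urllib.parse import quote, unquote

# Hypothetical file name combining several rows of the table above.
encoded = "Kl%C3%A4ranlage%2C_%28B%C3%BCrger%29%21.svg"

# unquote() decodes the %XX bytes as UTF-8 by default.
decoded = unquote(encoded)
print(decoded)  # Kläranlage,_(Bürger)!.svg

# Decoding with the wrong charset shows the typical umlaut damage.
print(unquote("B%C3%BCrger", encoding="latin-1"))  # BÃ¼rger

# quote() goes the other way: non-ASCII characters become UTF-8 %XX sequences.
print(quote("Kuß"))  # Ku%C3%9F
```

Each two-part code such as `%C3%A4` is one character stored as two UTF-8 bytes, which is why decoding it byte by byte in a single-byte charset yields two wrong characters instead of one umlaut.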
Version: v16.09.13 Stack trace:
That happened after 1117 files (out of many more) had been downloaded. A 3-second lag does not sound like a problem serious enough to require aborting the whole category download.
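One way the download loop could tolerate such a lag is to retry a file a few times and, if it still fails, skip it and carry on. The following Python sketch is only an illustration of that idea, not Imker's code; it assumes the failure surfaces as a timeout, and `fetch`, `fetch_with_retry`, and `download_all` are hypothetical names.

```python
import time


def fetch_with_retry(fetch, title, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(title); on a timeout, wait and retry instead of aborting."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch(title)
        except TimeoutError:
            if attempt == attempts:
                raise  # retries exhausted, let the caller decide
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))


def download_all(fetch, titles, **retry_options):
    """Fetch every title; collect the ones that failed and continue with the rest."""
    failed = []
    for title in titles:
        try:
            fetch_with_retry(fetch, title, **retry_options)
        except TimeoutError:
            failed.append(title)
    return failed
```

The list returned by `download_all` could be reported to the user or retried at the end, so a single slow response would not cost the remaining files of the category.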