zotero / translators

Zotero Translators
http://www.zotero.org/support/dev/translators

Wikimedia Commons: Image file doesn't attach #3136

Open adam3smith opened 1 year ago

adam3smith commented 1 year ago

Example: https://commons.wikimedia.org/wiki/File:Baryon_decuplet.png
Reported: https://forums.zotero.org/discussion/107766/unable-to-download-file-from-wikiedia-commons#latest

To be clear: the metadata imports fine, as does the snapshot, but the translator is supposed to also download the image on the page, which it no longer does.

I haven't looked at this in any more detail, but it wouldn't be surprising if Commons changed how image files are made available.

AbeJellinek commented 1 year ago

@zoe-translates: Apparently this translator still uses FW - any interest in rewriting?

zoe-translates commented 1 year ago

This is the most recently-updated translator that still relies on FW. I'm going to stop that :)

One question about downloading the images themselves -

Some files hosted by the Wikimedia Commons could be super large. Should we impose an upper-limit on the attachment's file size, above which the file is saved as a link (snapshot: false) rather than a download?

AbeJellinek commented 1 year ago

> Should we impose an upper-limit on the attachment's file size, above which the file is saved as a link (snapshot: false) rather than a download?

Sure. 10 MB?
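
For illustration, the size cap discussed above could look roughly like the sketch below. This is not the translator's actual code: the function name, the constant, the attachment title, and the choice to treat an unknown size as "too large" are all assumptions, and 10 MB is read here as 10 * 1024 * 1024 bytes. Only the `snapshot: false` idea and the 10 MB figure come from this thread.

```javascript
// Hypothetical threshold, based on the "10 MB?" suggestion in this thread.
const MAX_DOWNLOAD_BYTES = 10 * 1024 * 1024;

/**
 * Build an attachment descriptor for an image file (hypothetical helper).
 * Files at or below the limit are left as regular downloads; larger files,
 * or files whose size is unknown, are marked `snapshot: false` so that
 * they are saved as links instead.
 */
function buildImageAttachment(url, mimeType, sizeInBytes) {
	const attachment = {
		title: "Wikimedia Commons Image", // assumed title, for illustration only
		url: url,
		mimeType: mimeType
	};
	const smallEnough = Number.isFinite(sizeInBytes)
		&& sizeInBytes <= MAX_DOWNLOAD_BYTES;
	if (!smallEnough) {
		// Save as a link rather than downloading the file.
		attachment.snapshot = false;
	}
	return attachment;
}
```

How the file size would be obtained (for example from the file description page or from a response header) is left open here.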

zoe-translates commented 1 year ago

In fact, the reported problem was caused by the xpath here (and its second appearance on line 121):

https://github.com/zotero/translators/blob/8e5c648bb1e2ec58eccbc5b04ab5d1e1e656afdc/Wikimedia%20Commons.js#L85-L88

The xpath //div[@class="fullMedia"]//a[@class="internal"] would have matched the a tag.
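
As a rough sketch of how that xpath might be evaluated with the standard DOM `evaluate` API (the helper name and constants below are made up for illustration and are not taken from the translator):

```javascript
// The xpath quoted in the comment above.
const FULL_MEDIA_LINK_XPATH = '//div[@class="fullMedia"]//a[@class="internal"]';

// Numeric value of XPathResult.FIRST_ORDERED_NODE_TYPE in the DOM standard.
const FIRST_ORDERED_NODE_TYPE = 9;

/**
 * Return the href of the first link matched by the xpath, or null if
 * nothing matches. `doc` is expected to be a DOM Document.
 * Hypothetical helper, for illustration only.
 */
function getFullMediaUrl(doc) {
	const result = doc.evaluate(
		FULL_MEDIA_LINK_XPATH, doc, null, FIRST_ORDERED_NODE_TYPE, null
	);
	const link = result.singleNodeValue;
	return link ? link.href : null;
}
```

Note that `@class="internal"` is an exact string comparison, so it only matches when the `class` attribute is exactly `internal`; an element carrying additional classes would not match.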

However, there are so many other problems with this translator that I'm unwilling to submit just this change of xpath as a hotfix (it won't actually "fix" anything, because the other metadata fields are still broken, and the attachment itself is missing its mimeType).

So I think this is an opportunity to re-write the translator.

zoe-translates commented 1 year ago

Todo: