hatnote / montage

📷 Photo evaluation tool for and by Wiki Loves competitions
https://commons.wikimedia.org/wiki/Commons:Montage
BSD 3-Clause "New" or "Revised" License

All jpg files were ignored during campaign file upload round one creation #234

Open geertivp opened 1 year ago

geertivp commented 1 year ago

The default file type for images on Wikimedia Commons is jpg.

When using the Montage file interface to create the first round of a campaign, all files in the campaign were ignored, because only jpeg (not jpg) is registered in the module montage/rdb.py:

DEFAULT_ALLOWED_FILETYPES = ['jpeg', 'png', 'gif', 'svg', 'tiff', 'xcf', 'webp']

Problem: This is a blocking error when using the file interface.

Solution:

This problem did not occur when using a Category upload.

More context:

mahmoud commented 1 year ago

Hey @geertivp! Thanks for this. The DEFAULT_ALLOWED_FILETYPES are not extensions, but actually MIME types (just the minor type, since image is presumed), as used by Commons. See the highlight in this screenshot:

Screenshot from 2023-10-06 10-09-44
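
To illustrate the distinction between extensions and MIME minor types, here is a minimal sketch. It is not Montage's actual code: Montage takes the MIME type from Commons, whereas this sketch guesses it from the filename with Python's standard mimetypes module, and the helper names minor_mime_type and is_allowed are hypothetical. Only the DEFAULT_ALLOWED_FILETYPES list is taken from the report above.

```python
import mimetypes

# List as quoted in the issue report (montage/rdb.py).
DEFAULT_ALLOWED_FILETYPES = ['jpeg', 'png', 'gif', 'svg', 'tiff', 'xcf', 'webp']


def minor_mime_type(filename):
    """Hypothetical helper: return the part after 'image/' of the guessed MIME type.

    Assumption: the type is guessed from the filename; Montage itself
    relies on the MIME type reported by Commons.
    """
    mime, _encoding = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        return None
    return mime.split("/", 1)[1]


def is_allowed(filename, allowed=DEFAULT_ALLOWED_FILETYPES):
    """Hypothetical check: compare the minor MIME type, not the extension."""
    return minor_mime_type(filename) in allowed


# A .jpg file has the MIME type image/jpeg, so its minor type is 'jpeg'.
print(minor_mime_type("Example_photo.jpg"))
print(is_allowed("Example_photo.jpg"))
```

Under these assumptions a .jpg file passes the check, which is consistent with the jpg extension not being the cause of the skipped entries.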

I was able to load the images when I loaded them as a "File List", but got failures when trying to load them as a Google Sheet and as a CSV. The UI error says that the disqualifications were due to round settings, but the server logs show that the entries simply weren't loaded (see screenshot of logs below), so there's something else going on.

Screenshot from 2023-10-06 10-35-44

Since you've worked around via Category import, I'll dig into it as time allows. Thanks again for the report!

mahmoud commented 1 year ago

Ah, it just occurred to me, img_name. Another workaround.

If you export your file list as a CSV (basically, put quotes around all image filenames; the easiest and best way is to export from Excel or Google Sheets), make the first row filename (no quotes) instead of img_name, upload the file to https://gist.github.com, and use the "Raw URL" (it should be a gist.githubusercontent.com URL), then the load will work. (example)

This is very finicky, and we'll have to improve this in the next version, but for now there's another workaround for folks with spreadsheets / long file lists. Thanks again!
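
For anyone without a spreadsheet at hand, the CSV described above could be produced with a short script. This is only a sketch of the format as described in this comment (an unquoted filename header, then one quoted filename per row); the function name build_file_list_csv is hypothetical, and the exact quoting Montage requires has not been verified beyond that description.

```python
import csv
import io


def build_file_list_csv(names):
    """Hypothetical helper: build CSV text with an unquoted 'filename' header
    followed by one fully quoted image filename per row."""
    buf = io.StringIO()
    # The header row is written by hand so that it stays unquoted.
    buf.write("filename\n")
    # QUOTE_ALL wraps every data field in double quotes.
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for name in names:
        writer.writerow([name])
    return buf.getvalue()


# Placeholder filenames for illustration only.
csv_text = build_file_list_csv(["First example.jpg", "Second_example.jpg"])
print(csv_text)
```

The resulting text can then be pasted into a gist and loaded through its "Raw URL" as described above.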

geertivp commented 1 year ago

Actually, I was wrong in the file contents sample above. When I initially encountered the problem, I actually loaded a file URL from a webserver with the following content:

img_name
17e-eeuws_Statenjacht_van_Utrecht_Museumschip_Tordino_Plassendale_26-07-2023_11-42-06.jpg
Amel,_Kirche_Sankt_Huberrtus_oeg31027_IMG_7647_2023-08-28_14.48.jpg
Amel-Iveldingen,_de_Sankt_Barbara_Kapelle_oeg31007_IMG_7720_2023-08-29_10.48.jpg
...

(with "_" in the image filenames instead of spaces...)