Open antoine2711 opened 6 months ago
Also, I get this error message, which is not relayed to the front-end. Even if it were, I don't understand what the problem with the file name is…
16:32:31.558 [..ting.EditBatchProcessor] MediaWiki error while editing [Warning]: The file upload action returned the 'Warning' error code. Warnings are: {badfilename="ML_Carton_d'invitation_pour_«_Bastien_et_Bastienne_».jpg"} (69690ms)
16:32:42.236 [..ting.EditBatchProcessor] MediaWiki error while editing [Warning]: The file upload action returned the 'Warning' error code. Warnings are: {badfilename="ML_Marionnettes_de_«_La_Boîte_à_joujoux_»_dans_les_décors.jpg"} (10678ms)
16:32:53.609 [..ting.EditBatchProcessor] MediaWiki error while editing [Warning]: The file upload action returned the 'Warning' error code. Warnings are: {badfilename="ML_Programme_de_«_Bastien_et_Bastienne_»_de_Jacques_Chesnais.jpg"} (11373ms)
16:33:00.627 [..ting.EditBatchProcessor] MediaWiki error while editing [Warning]: The file upload action returned the 'Warning' error code. Warnings are: {badfilename="ML_Programme_de_«_Les_Comédiens_de_bois_»_de_Jacques_Chesnais,_argument_de_la_pièce.jpg"} (7018ms)
16:33:03.326 [..ting.EditBatchProcessor] MediaWiki error while editing [Warning]: The file upload action returned the 'Warning' error code. Warnings are: {badfilename="ML_Programme_de_«_Les_Comédiens_de_bois_»_de_Jacques_Chesnais,_distribution_des_rôles.jpg"} (2699ms)
16:33:04.473 [..ting.EditBatchProcessor] MediaWiki error while editing [Warning]: The file upload action returned the 'Warning' error code. Warnings are: {badfilename="ML_Manipulateurs_de_«_La_Boîte_à_joujoux_».jpg"} (1147ms)
16:33:19.697 [..ting.EditBatchProcessor] MediaWiki error while editing [Warning]: The file upload action returned the 'Warning' error code. Warnings are: {badfilename="ML_Affiche_de_«_Tintin_et_le_Temple_du_Soleil_»_en_anglais.jpg"} (15224ms)
16:33:41.356 [..ting.EditBatchProcessor] MediaWiki error while editing [Warning]: The file upload action returned the 'Warning' error code. Warnings are: {badfilename="ML_«_Tintin_et_le_Temple_du_Soleil_»_de_Micheline_Legendre,_pour_Marionnettes_en_vitrine.jpg"} (21659ms)
16:33:57.972 [..ting.EditBatchProcessor] MediaWiki error while editing [Warning]: The file upload action returned the 'Warning' error code. Warnings are: {badfilename="ML_«_Tintin_et_le_Temple_du_Soleil_»_de_Micheline_Legendre_pour_Marionnettes_en_vitrine_(2).jpg"} (16616ms)
16:34:09.691 [..ting.EditBatchProcessor] MediaWiki error while editing [Warning]: The file upload action returned the 'Warning' error code. Warnings are: {badfilename="ML_Micheline_Legendre_avec_la_rose_du_«_Petit_Prince_».png"} (11719ms)
16:34:13.316 [..ting.EditBatchProcessor] Requesting documents (3625ms)
16:34:14.478 [..ting.EditBatchProcessor] MediaWiki error while editing [Warning]: The file upload action returned the 'Warning' error code. Warnings are: {badfilename="ML_Le_Théâtre_«_Tintin_»_au_Parc_Lafontaine.jpg"} (1162ms)
16:34:15.432 [..ting.EditBatchProcessor] MediaWiki error while editing [Warning]: The file upload action returned the 'Warning' error code. Warnings are: {badfilename="ML_Vitrine_de_Noël_de_«_Tintin_au_Tibet_»_1964.jpg"} (954ms)
16:34:19.638 [..ting.EditBatchProcessor] MediaWiki error while editing [Warning]: The file upload action returned the 'Warning' error code. Warnings are: {badfilename="ML_«_Tintin_au_Tibet_»_au_Jardin_des_merveilles.jpg"} (4206ms)
16:34:30.142 [..ting.EditBatchProcessor] MediaWiki error while editing [Warning]: The file upload action returned the 'Warning' error code. Warnings are: {badfilename="ML_«_Hansel_et_Gretel_»_à_Stratford.jpg"} (10504ms)
That being said, intuitively, Wikimedia Commons doesn't seem to like chevrons (« & ») in the name… ;-)
Regards, Antoine
So, for the second problem, I figured it out: it was the non-breaking spaces that are often used in French. I think OR should warn the user about that…
Regards, Antoine
Commons disallows non-printing characters in the filename. IIRC the validation of this is all handled in FileNameScrutinizer which reflects the default values of wgLegalTitleChars. [but it's not a very transparent regexp]
That doesn't undermine the problem raised here of surfacing the errors in the frontend =)
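For illustration, here is a minimal Python sketch of how the characters discussed above (non-breaking spaces, non-printing characters) could be located in a filename before upload. The set of "suspicious" Unicode categories is an assumption made for this example; it is not the rule set that Commons or OpenRefine's FileNameScrutinizer actually applies. The sample filename is modelled on one from the log, and the position of the non-breaking spaces in it is assumed.

```python
import unicodedata

# Assumption: these Unicode general categories are treated as "suspicious"
# for illustration only. Cc = control, Cf = format, Zs/Zl/Zp = separators.
SUSPICIOUS_CATEGORIES = {"Cc", "Cf", "Zs", "Zl", "Zp"}


def find_suspicious_characters(filename):
    """Return (index, code point, Unicode name) for each suspicious character."""
    findings = []
    for index, char in enumerate(filename):
        if char == " ":
            continue  # a plain ASCII space is not flagged
        if unicodedata.category(char) in SUSPICIOUS_CATEGORIES:
            # Control characters such as tab have no Unicode name,
            # so a placeholder is supplied as the default.
            name = unicodedata.name(char, "<unnamed control character>")
            findings.append((index, f"U+{ord(char):04X}", name))
    return findings


# Hypothetical filename with non-breaking spaces (U+00A0) inside the chevrons.
example = "ML «\u00a0Tintin au Tibet\u00a0» 1964.jpg"
for finding in find_suspicious_characters(example):
    print(finding)
```

Reporting the code point and the Unicode name (here `NO-BREAK SPACE`) gives the user something they can search for, which the raw `badfilename` warning does not.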
The problem of surfacing the errors in the frontend should be addressed by https://github.com/OpenRefine/OpenRefine/pull/6555, although I did not test it specifically for media file uploads. I wonder if @antoine2711 or @Vesihiisi would be interested in trying it out?
@lokal-profil @antoine2711 I have tested my PR https://github.com/OpenRefine/OpenRefine/pull/6555 with Commons upload and made some tweaks to improve the UX there.
See the screenshots there. Any feedback welcome.
> @lokal-profil @antoine2711 I have tested my PR OpenRefine/OpenRefine#6555 with Commons upload and made some tweaks to improve the UX there.
I’m waiting for the next version that can load the Commons extension.
Regards, Antoine
@antoine2711 there is a new release for the Commons extension which should work with OpenRefine 3.8 and the development version of OpenRefine (master branch)
I took the latest OR snapshot (#2442) and tried uploading a file with a tab (0x09) in the name.
Some thoughts on the experience:
I was still able to proceed to the upload stage, even though OR knows the filename won't be accepted by Commons. I was only uploading this one file, so ideally I shouldn't be able to start the upload at all.
I appreciate the new "Wikibase editing results" column, but the content is not helpful for inexperienced users.
[Warning] The file upload action returned the 'Warning' error code. Warnings are: {badfilename="Skövde_stadsbibliotek_interior-01.jpg"}
Again, if I didn't know about non-printable characters, I wouldn't be able to guess the reason for the error. I guess that's the raw error returned by the API.
I think the pain point is the fact that I was allowed to start uploading the file in the first place.
We used to have this warning with the highest severity level ("Critical") which prevents the user from doing the upload, but because our regular expression catching invalid characters had false positives (flagging characters which were actually allowed, https://github.com/OpenRefine/OpenRefine/issues/5656) we changed it to "Warning" so that the user is still able to attempt the upload (https://github.com/OpenRefine/OpenRefine/pull/6227).
We can of course revert this move, or somehow find a more reliable source of information for which characters are allowed in Commons filenames.
Highlighting the special characters (such as your tab character) makes sense in any case.
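As a rough idea of what such highlighting could look like, here is a small Python sketch that replaces hard-to-see characters with a visible `<U+XXXX>` marker. The function name and the choice of which categories to mark are assumptions for this example, not existing OpenRefine behaviour, and the position of the tab in the sample filename is made up.

```python
import unicodedata


def highlight_invisible(filename):
    """Replace hard-to-see characters with a visible <U+XXXX> marker."""
    parts = []
    for char in filename:
        category = unicodedata.category(char)
        # Any space separator other than the plain ASCII space,
        # for example a non-breaking space.
        is_odd_space = category == "Zs" and char != " "
        if category in ("Cc", "Cf") or is_odd_space:
            parts.append(f"<U+{ord(char):04X}>")
        else:
            parts.append(char)
    return "".join(parts)


# Hypothetical filename containing a tab character.
print(highlight_invisible("Skövde\tstadsbibliotek.jpg"))
# prints: Skövde<U+0009>stadsbibliotek.jpg
```

Letters with diacritics are left untouched, so only the characters a user cannot see on screen are called out.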
> I appreciate the new "Wikibase editing results" column, but the content is not helpful for inexperienced users.
The idea is that it's at least something they can include in their report when asking for help (without having to check the server logs). If you have ideas of how to improve it, I am all ears.
We could add some logic to translate specific MediaWiki error messages to a different format so they can be more easily understood by the user, but aiming to cover all possible MediaWiki errors is beyond reach I would say.
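A minimal sketch of that translation idea, in Python for brevity: known warning codes get a friendlier explanation, and anything unrecognised falls back to the raw message. The mapping, the wording, and the function name are hypothetical. The parsing assumes the `Warnings are: {code="value"}` format seen in the logs above, and only `badfilename` is covered.

```python
import re

# Hypothetical mapping from MediaWiki upload warning codes to friendlier text.
# Only "badfilename" is covered; other codes fall through to the raw message.
FRIENDLY_MESSAGES = {
    "badfilename": (
        "Commons rejected the file name {value!r}. It may contain characters "
        "that are not allowed, such as tabs or non-breaking spaces."
    ),
}

# Matches code="value" pairs as they appear in the logged warning text.
WARNING_PATTERN = re.compile(r'(\w[\w-]*)="([^"]*)"')


def explain_upload_warning(raw_message):
    """Return a friendlier explanation, or the raw message if nothing is known."""
    explanations = []
    for code, value in WARNING_PATTERN.findall(raw_message):
        template = FRIENDLY_MESSAGES.get(code)
        if template is not None:
            explanations.append(template.format(value=value))
    if not explanations:
        return raw_message
    # Keep the raw error so it can still be included in a bug report.
    return " ".join(explanations) + " (Raw error: " + raw_message + ")"


raw = (
    "[Warning] The file upload action returned the 'Warning' error code. "
    'Warnings are: {badfilename="A\tB.jpg"}'
)
print(explain_upload_warning(raw))
```

Formatting the file name with `!r` shows escape sequences such as `\t` or `\xa0`, so invisible characters become visible in the message. Keeping the raw error alongside the explanation preserves what users need when asking for help.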
@wetneb Found at the end of https://commons.wikimedia.org/wiki/Commons:File_naming#Language-specific_guidelines
Avoid abusing Unicode. Control characters can be omitted, strange punctuation can be replaced with standard quotes and commas, and symbols such as "♥" are often more natural when spelled out ("heart"), also increasing visibility in search. Furthermore some characters do not render correctly at all in certain operating systems and browsers. It is a good idea to stick to letters, numbers, underscore (space), ASCII hyphen/minus/dash, plus, and period (dot), as these do not have any MediaWiki restrictions. Letters with diacritics and accents are acceptable, but so is omitting diacritics and accents (e.g. "Calderón"/"Calderon", "Erdoğan"/"Erdogan").
Looks like MediaWiki itself has restrictions on filenames as seen in the paragraph above. But hard to find out WHICH and WHERE... Found these as well:
Commons uses the same underlying technology as MediaWiki itself. From what I read, the restrictions can depend on which extension is used to enable a mass-upload API, but more important seems to be the backend database chosen, which, according to the MediaWiki file uploads page, is where the last line of technical filename restrictions lies.
But I think this is close to the right place in their source (someone might have to ask on Telegram): https://github.com/wikimedia/mediawiki/blob/d38689ae1d7a74cda9df88d9e747b455b66653d6/includes/api/ApiUpload.php#L826
But @Vesihiisi is actually getting the badfilename error, which is checked here: https://github.com/wikimedia/mediawiki/blob/d38689ae1d7a74cda9df88d9e747b455b66653d6/includes/upload/UploadBase.php#L806
Using that and poking around some more brought me to https://www.mediawiki.org/wiki/API:Upload where I found this:
badfilename: The file name supplied is not acceptable on this wiki, for instance because it contains forbidden characters
The Gerrit change https://gerrit.wikimedia.org/r/c/mediawiki/core/+/942710 where the configuration options for the forbidden characters were deprecated in MediaWiki 1.41+ has some interesting discussion about illegal file characters. It also links to a wikitech-l mailing list thread that is well worth reading and points to a core problem: https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists.wikimedia.org/thread/ASODV6622T4YUAY3JO5ZVBL3B5ZQDX2U/
I have this error when uploading:
But I get nothing in the front-end.
Regards, Antoine