alexz-enwp / wikitools

Python package for working with MediaWiki wikis

How to fix "The supplied MD5 hash was incorrect" error? #40

Closed: tshrinivasan closed this issue 8 years ago

tshrinivasan commented 8 years ago

I am getting the following error while uploading a PDF file to https://commons.wikimedia.org:

Logged in.
Uploading the file পদ্মাপুরাণ - নারায়ণ দেব.pdf
Traceback (most recent call last):
  File "local.py", line 140, in <module>
    upload_pdf_file(pdf_file)
  File "local.py", line 132, in upload_pdf_file
    page.edit(text=wikidata)
  File "/usr/local/lib/python2.7/dist-packages/wikitools/page.py", line 623, in edit
    result = req.query()
  File "/usr/local/lib/python2.7/dist-packages/wikitools/api.py", line 165, in query
    raise APIError(data['error']['code'], data['error']['info'])
wikitools.api.APIError: (u'badmd5', u'The supplied MD5 hash was incorrect')

It is also reported here: https://github.com/tshrinivasan/tools-for-wiki/issues/17

What may be the reason for this? How can we solve it?

mzmcbride commented 8 years ago

What code are you using?

The "edit" method in page.py has a "skipmd5" argument that you can pass; cf. https://github.com/alexz-enwp/wikitools/blob/63a967b4/wikitools/page.py#L568.
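A minimal sketch of how that argument might be used as a fallback, assuming `page` is a wikitools `page.Page`-like object whose `edit` accepts `text` and `skipmd5`. The helper name `edit_with_md5_fallback` is hypothetical and not part of wikitools. It relies on the traceback above, where the error is raised as `APIError(code, info)`, so the code should be the first element of `args`.

```python
def edit_with_md5_fallback(page, text, **kwargs):
    """Try a normal edit; if the API answers 'badmd5', retry with skipmd5=True.

    Hypothetical helper, not part of wikitools. `page` is assumed to expose
    an edit(text=..., skipmd5=...) method like wikitools' page.Page.
    """
    try:
        return page.edit(text=text, **kwargs)
    except Exception as err:
        # In real code, catching wikitools.api.APIError would be narrower.
        # Exception is used here only so the sketch runs without wikitools.
        if err.args and err.args[0] == 'badmd5':
            # Retry without sending the client-side MD5 hash.
            return page.edit(text=text, skipmd5=True, **kwargs)
        raise
```

With the upload script from the report, the call `page.edit(text=wikidata)` would become `edit_with_md5_fallback(page, wikidata)`. Passing `skipmd5=True` directly on every edit is simpler, but it gives up the integrity check even for text that would have hashed correctly.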

tshrinivasan commented 8 years ago

The code we use is here: https://github.com/tshrinivasan/tools-for-wiki/blob/master/pdf-upload-commons/pdf-djvu-uploader-commons.py

Will check the fix and share the results soon.

alexz-enwp commented 8 years ago

Using skipmd5 will work around the problem here.

I don't think this is a problem with wikitools. It looks like a bug in MediaWiki. The issue is the character য়. Python/wikitools correctly encodes it in UTF-8 as e0a79f. But by the time it reaches the MD5 check in the MW API, it has been converted to য + ়, which is equivalent but is two separate characters (e0a6af + e0a6bc), so it hashes differently. The character appears to be correctly encoded in the PHP $_POST variable, so the conversion happens somewhere inside MediaWiki.
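The byte values quoted above can be reproduced with the standard library alone. The sketch below compares the single code point U+09DF with the two-code-point sequence U+09AF U+09BC and shows that their MD5 hashes differ. It also shows that Unicode NFC normalization turns the first form into the second, which is one plausible way the conversion could happen; the thread itself does not say that normalization is the cause, so treat that part as an assumption.

```python
import binascii
import hashlib
import unicodedata

precomposed = u"\u09df"        # BENGALI LETTER YYA, one code point
decomposed = u"\u09af\u09bc"   # BENGALI LETTER YA + BENGALI SIGN NUKTA


def utf8_hex(s):
    """Return the UTF-8 encoding of s as a lowercase hex string."""
    return binascii.hexlify(s.encode("utf-8")).decode("ascii")


def md5_hex(s):
    """Return the MD5 hex digest of the UTF-8 encoding of s."""
    return hashlib.md5(s.encode("utf-8")).hexdigest()


print(utf8_hex(precomposed))   # e0a79f
print(utf8_hex(decomposed))    # e0a6afe0a6bc

# Different bytes, therefore different MD5 hashes.
print(md5_hex(precomposed) != md5_hex(decomposed))   # True

# U+09DF is a composition exclusion, so NFC leaves it in decomposed form.
print(unicodedata.normalize("NFC", precomposed) == decomposed)   # True
```

If normalization is indeed what changes the text on the server side, hashing `unicodedata.normalize("NFC", text)` on the client would produce a matching MD5, but that is untested against MediaWiki here.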

Filed upstream - https://phabricator.wikimedia.org/T144071

alexz-enwp commented 8 years ago

The change made in 8f7a4f2decba16e89c4b5d9f0a8a7e50d39debc0 should fix this issue.