tshrinivasan / OCR4wikisource

OCR for WikiSource using Google Drive OCR
GNU General Public License v2.0

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128) #24

Closed jayantanth closed 8 years ago

jayantanth commented 8 years ago

mediawiki_uploader_2016-01-05-21-30-04_log.txt

jayanta@jayanta-Inspiron-3541:~/OCR$ python mediawiki_uploader.py
INFO:main:Running mediawiki_uploader.py Version 1.33
INFO:root:Operating system = "Ubuntu 12.04.5 LTS"
INFO:main:URL = https://upload.wikimedia.org/wikisource/bn/2/2f/Testocrbengali.pdf
INFO:main:Columns = 1
INFO:main:Wiki Username = jayantanth
INFO:main:Wiki Password = Not logging the password
INFO:main:Wiki Source Language Code = bn
INFO:main:File Name = Testocrbengali.pdf
INFO:main:File Type = pdf
INFO:main:Original URL = https://upload.wikimedia.org/wikisource/bn/2/2f/Testocrbengali.pdf
INFO:main:Wiki URL = https://bn.wikisource.org/w/api.php
INFO:root:Login Status = True
INFO:root:
Logged in to https://bn.wikisource.org
Traceback (most recent call last):
  File "mediawiki_uploader.py", line 180, in <module>
    pagename = filename + "/" + str(convert_to_indic(wikisource_language_code, pageno))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128)
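
For readers hitting the same traceback: the thread does not show what was changed in mediawiki_uploader.py, so the following is only a minimal sketch of the likely failure mode, not the project's actual fix. Byte 0xe0 is the first UTF-8 byte of a Bengali digit (for example ০, U+09E6, encodes as E0 A7 A6), and the error appears when such bytes are decoded with the ASCII codec, which Python 2 does implicitly when a byte string is joined to a unicode string. The names to_bengali_digits, build_pagename, and reproduce_error are hypothetical stand-ins introduced for illustration. The sketch is written for Python 3, where text is Unicode by default and the concatenation is safe.

```python
# Assumption: this table and the helpers below are illustrative only;
# they are not the real convert_to_indic from the repository.
BENGALI_DIGITS = "০১২৩৪৫৬৭৮৯"


def to_bengali_digits(number):
    """Map each decimal digit of a non-negative integer to its Bengali digit."""
    return "".join(BENGALI_DIGITS[int(ch)] for ch in str(number))


def build_pagename(filename, pageno):
    """Build 'file/page' entirely from text strings, so no implicit decode happens."""
    return filename + "/" + to_bengali_digits(pageno)


def reproduce_error():
    """Decode UTF-8 bytes of a Bengali digit as ASCII and return the error message."""
    utf8_bytes = to_bengali_digits(1).encode("utf-8")  # starts with byte 0xe0
    try:
        utf8_bytes.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(build_pagename("Testocrbengali.pdf", 12))
    print(reproduce_error())
```

In Python 2 the equivalent remedy would be to decode byte strings explicitly with .decode("utf-8") before joining them to unicode values, rather than relying on the default ASCII codec.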

tshrinivasan commented 8 years ago

Try now.

jayantanth commented 8 years ago

Hurray, hurray! All the txt files uploaded to Wikisource page by page successfully. :+1: :+1: :+1:

My test file was small. Now I am going to test a big file.

bodhisattwawiki commented 8 years ago

yay

tshrinivasan commented 8 years ago

Closing this. Reopen if you get the same issue.