reallistic / BitcasaFileLister

List and download files in your bitcasa drive via api
https://rose-llc.com/bitcasafilelist/

'ascii' codec can't encode character u'\xe4' in #43

Closed ghost closed 10 years ago

ghost commented 10 years ago

11/10 11:27:16 [Queuer 26][ERROR]: Error. Could not write to d:\temp\bitcasa\skippedfiles.csv. Ending
Traceback (most recent call last):
  File "C:\Users\root\Desktop\BitcasaFileLister-master\includes\helpers\results.py", line 24, in writeSkipped
    myfile.write("%s||%s\n" % (tfd, base64_path))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 11: ordinal not in range(128)

It's a showstopper. After that error, all threads quit.

ghost commented 10 years ago

Exception in thread Download 32:
Traceback (most recent call last):
  File "C:\Python27\lib\threading.py", line 810, in __bootstrap_inner
    self.run()
  File "C:\Python27\lib\threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "BitcasaFileFetcher\threads\download.py", line 60, in download
    filehash = sha1("blob " + str(size_bytes) + "\0" + temp_file)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 71: ordinal not in range(128)

ekoekoazarak commented 10 years ago

When it was using unidecode, all characters were ASCII. But now a name can contain any character from the system locale. And afaik the sha1 function does not handle unicode strings; it only takes byte strings (str in Python 2).
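A minimal sketch of the point above, assuming the value passed to sha1 is text that may contain non-ASCII characters. The helper name `path_fingerprint` is hypothetical and is not taken from the project; the idea is only that the text is encoded to bytes explicitly (UTF-8 is an assumption here) before hashing, instead of relying on an implicit ASCII conversion.

```python
import hashlib


def path_fingerprint(size_bytes, temp_file):
    """Hash a 'blob <size>\\0<path>' string built from text.

    Hypothetical helper: the explicit UTF-8 encode is the point,
    not the exact layout of the hashed string.
    """
    header = u"blob " + str(size_bytes) + u"\0" + temp_file
    # hashlib works on bytes, so encode the text explicitly.
    return hashlib.sha1(header.encode("utf-8")).hexdigest()


# A path containing u'\xfc' (the character from the traceback) hashes fine.
digest = path_fingerprint(1024, u"d:\\temp\\bitcasa\\M\u00fcnchen.part")
```

Note that changing how the string is encoded changes the resulting hash, so values computed before such a change would no longer match.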

reallistic commented 10 years ago

WIP

reallistic commented 10 years ago

This should be fixed by 80182954a4b037e84250349b3536f0144ca00bdd

If not, please ask to reopen, and be sure to include the characters that couldn't be converted. Also, due to the fix for sha1 hashes, any partial downloads will NOT be recognized.

reallistic commented 10 years ago

Just in case, please see the important edit above.

ghost commented 10 years ago

11/10 18:48:34 [Queuer 71][ERROR]: Error. Could not write to d:\temp\bitcasa\skippedfiles.csv. Ending
Traceback (most recent call last):
  File "C:\Users\root\Desktop\BitcasaFileLister-master\includes\helpers\results.py", line 24, in writeSkipped
    myfile.write("%s||%s\n" % (tfd, base64_path))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 11: ordinal not in range(128)
11/10 18:48:34 [Queuer 71][INFO]: Shutting down
11/10 18:48:34 [Queuer 54][INFO]: Shutting down
11/10 18:48:34 [Queuer 47][INFO]: Shutting down
11/10 18:48:34 [Queuer 45][INFO]: Shutting down

Version 0.6.1

reallistic commented 10 years ago

And I specifically tested that character. Weird. Can you provide the filename?

ghost commented 10 years ago

Käpt'n Blaubär Seemannsgarn

The ' seems to be the problem :-|

reallistic commented 10 years ago

Super weird. I will check it out

ghost commented 10 years ago

As far as I know, it's a little bit tricky to escape ' in Python :-D

reallistic commented 10 years ago

Wow. I just had an epiphany. Yeah, I know how to fix that. It's actually already fixed in one place, but I didn't carry it over to the others.

ghost commented 10 years ago

I'm really looking forward to that fix.

The "resume Downloads" option is my last hope of saving all of my files.

reallistic commented 10 years ago

In case anyone is interested in an explanation: Bitcasa is already sending valid UTF-8 (or so I assume). json.loads(response) turns that into unicode (how about that, I was totally unaware). This means I was going in circles, trying to convert what I thought was a bytestring to unicode and then back to a "valid" bytestring, when all along all I had to do was use codecs.open to write the unicode directly to the error/skipped/success files, with no conversion necessary.
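The explanation above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the function name `write_skipped`, the JSON payload, and the temporary path are made up for the example, and it uses `io.open`, which takes an `encoding` argument in the same way as the `codecs.open` call mentioned above, because `io.open` behaves the same on Python 2.7 and Python 3.

```python
import io
import json
import os
import tempfile


def write_skipped(path, tfd, name):
    """Append one 'folder||name' record (hypothetical helper)."""
    # Text mode with an explicit encoding: unicode goes in, UTF-8 bytes
    # land on disk, and no manual encode/decode step is needed.
    with io.open(path, "a", encoding="utf-8") as myfile:
        myfile.write(u"%s||%s\n" % (tfd, name))


# Made-up payload standing in for an API response.
payload = u'{"name": "K\\u00e4pt\'n Blaub\\u00e4r Seemannsgarn"}'
name = json.loads(payload)["name"]  # already text (unicode), not bytes

tmp_dir = tempfile.mkdtemp()
skipped_path = os.path.join(tmp_dir, "skippedfiles.csv")
write_skipped(skipped_path, u"d:\\temp\\bitcasa", name)

# Read the record back with the same encoding to confirm the round trip.
with io.open(skipped_path, encoding="utf-8") as f:
    written = f.read()
```

Opening the file with an explicit encoding keeps the conversion in one place, which is why the apostrophe and the umlauts in the file name need no special handling here.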

Update coming soon

reallistic commented 10 years ago

@cappa83 @traumender @SoySoy22 Have any of you had issues with results.py writing the success/skipped/failure files in this version?

ghost commented 10 years ago

Nope, for me it works flawlessly...

reallistic commented 10 years ago

Great! I'll call this one closed as fixed by d5f7407623dfa19b90a34665233595b68890ac85.