salimoha / googlecl

Automatically exported from code.google.com/p/googlecl

UnicodeDecodeError: 'utf8' codec / 'ascii' codec can't decode byte(s) #195

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
*What steps will reproduce the problem?*
1. Python.exe google youtube post -n "текст заголовка" -s "текст описания" -t "tag1" -c News file.mp4

*What is the expected output? What do you see instead?*

I tried to use non-Latin characters with the YouTube post task on Windows XP.

Instead of a successful upload, I see this error:

Loading file.mp4
Traceback (most recent call last):
  File "google", line 463, in <module>
    main()
  File "google", line 457, in main
    run_once(options, args)
  File "google", line 356, in run_once
    task.run(client, options, args)
  File "C:\Python26\lib\site-packages\googlecl\youtube\service.py", line 217, in _run_post
    tags=options.tags, category=options.category)
  File "C:\Python26\lib\site-packages\googlecl\youtube\service.py", line 129, in post_videos
    self.InsertVideoEntry(video_entry, path)
  File "C:\Python26\lib\site-packages\gdata\youtube\service.py", line 654, in InsertVideoEntry
    converter=gdata.youtube.YouTubeVideoEntryFromString)
  File "C:\Python26\lib\site-packages\gdata\service.py", line 1236, in Post
    media_source=media_source, converter=converter)
  File "C:\Python26\lib\site-packages\gdata\service.py", line 1286, in PostOrPut
    data_str = str(data)
  File "C:\Python26\lib\site-packages\atom\__init__.py", line 377, in __str__
    return self.ToString()
  File "C:\Python26\lib\site-packages\atom\__init__.py", line 374, in ToString
    return ElementTree.tostring(self._ToElementTree(), encoding=string_encoding)
  File "C:\Python26\lib\site-packages\atom\__init__.py", line 369, in _ToElementTree
    self._AddMembersToElementTree(new_tree)
  File "C:\Python26\lib\site-packages\atom\__init__.py", line 331, in _AddMembersToElementTree
    member._BecomeChildElement(tree)
  File "C:\Python26\lib\site-packages\atom\__init__.py", line 357, in _BecomeChildElement
    self._AddMembersToElementTree(new_child)
  File "C:\Python26\lib\site-packages\atom\__init__.py", line 331, in _AddMembersToElementTree
    member._BecomeChildElement(tree)
  File "C:\Python26\lib\site-packages\atom\__init__.py", line 357, in _BecomeChildElement
    self._AddMembersToElementTree(new_child)
  File "C:\Python26\lib\site-packages\atom\__init__.py", line 342, in _AddMembersToElementTree
    ExtensionContainer._AddMembersToElementTree(self, tree)
  File "C:\Python26\lib\site-packages\atom\__init__.py", line 224, in _AddMembersToElementTree
    tree.text = self.text.decode(MEMBER_STRING_ENCODING)
  File "C:\Python26\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-3: invalid data

*What version of the product are you using? On what operating system? What
version of gdata-python-client (aka python-gdata)?*

googlecl-0.9.7.tar.gz
python-2.6.5.msi
gdata-2.0.10.zip
Windows XP SP3

Original issue reported on code.google.com by spv60582 on 30 Jun 2010 at 6:19

GoogleCodeExporter commented 9 years ago
Issue 188 has been merged into this issue.

Original comment by tom.h.mi...@gmail.com on 24 Jul 2010 at 2:06

GoogleCodeExporter commented 9 years ago
You mentioned that this worked in Cygwin in Issue 188, so the issue seems to be 
tied to the shell. I'm not sure how to solve this, but I don't think additional 
code in GoogleCL can help.

I'll leave this issue here as open in case someone has the same problem and 
figures out how to fix it.

Original comment by tom.h.mi...@gmail.com on 24 Jul 2010 at 2:11

GoogleCodeExporter commented 9 years ago
Same problem in my configuration ...

C:\Dokumente und Einstellungen\Sascha.Gibson\Eigene Dateien\Eigene Bilder>google picasa post -n "testest" Lesekoenig.jpg
Loading file Lesekoenig.jpg to album testest

C:\Dokumente und Einstellungen\Sascha.Gibson\Eigene Dateien\Eigene Bilder>google picasa post -n "testest" Lesekönig.jpg
Loading file Lesek÷nig.jpg to album testest
Traceback (most recent call last):
  File "google", line 536, in <module>
  File "google", line 530, in main
  File "google", line 408, in run_once
  File "googlecl\picasa\service.pyo", line 333, in _run_post
  File "googlecl\picasa\service.pyo", line 206, in insert_photo_list
  File "gdata\photos\service.pyo", line 469, in InsertPhotoSimple
  File "gdata\photos\service.pyo", line 425, in InsertPhoto
  File "gdata\service.pyo", line 1236, in Post
  File "gdata\service.pyo", line 1286, in PostOrPut
  File "atom\__init__.pyo", line 377, in __str__
  File "atom\__init__.pyo", line 374, in ToString
  File "atom\__init__.pyo", line 369, in _ToElementTree
  File "atom\__init__.pyo", line 331, in _AddMembersToElementTree
  File "atom\__init__.pyo", line 357, in _BecomeChildElement
  File "atom\__init__.pyo", line 342, in _AddMembersToElementTree
  File "atom\__init__.pyo", line 224, in _AddMembersToElementTree
  File "encodings\utf_8.pyo", line 16, in decode
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 5-8: invalid data

C:\Dokumente und Einstellungen\Sascha.Gibson\Eigene Dateien\Eigene Bilder>

Original comment by sascha.g...@gmail.com on 26 Jul 2010 at 1:00
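
The garbled echo `Lesek÷nig.jpg` and the error position 5 are consistent with a code page mismatch. The following is a minimal sketch, not taken from GoogleCL, written for Python 3 so that bytes and text are explicit. It assumes the file name reached Python as Windows ANSI (cp1252) bytes while the console rendered them with the OEM code page cp850; neither code page is stated in this thread.

```python
# Assumption: the argument arrived as cp1252 bytes, and the console
# displayed those same bytes using cp850.
raw_name = "Lesekönig.jpg".encode("cp1252")  # "ö" becomes the single byte 0xF6

# Rendering the bytes with the console code page reproduces the garbled echo.
as_console_shows = raw_name.decode("cp850")

# Decoding the same bytes as UTF-8 fails at the byte for "ö" (index 5).
try:
    raw_name.decode("utf-8")
    utf8_error_position = None
except UnicodeDecodeError as exc:
    utf8_error_position = exc.start

# Decoding with the encoding that produced the bytes recovers the name.
recovered = raw_name.decode("cp1252")

print(as_console_shows, utf8_error_position, recovered)
```

If the assumption holds, the bytes are neither valid UTF-8 nor what the console displays, which would explain both symptoms at once.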

GoogleCodeExporter commented 9 years ago
Issue 272 has been merged into this issue.

Original comment by tom.h.mi...@gmail.com on 1 Sep 2010 at 7:16

GoogleCodeExporter commented 9 years ago
I have a hunch. Try applying the attached patch, and let me know if the error 
disappears/changes.

The patch is (theoretically) decoding everything you enter, which (should) 
allow the atom module to decode with utf-8. I don't know if this is safe, so 
I'm not sure if this patch will make it into the trunk or not.
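
The attached patch is not reproduced in this thread, so the following is only a rough sketch of the idea, written for Python 3 so that bytes and text are explicit. The helper name `decode_argument` is hypothetical, and cp1251 is an assumed code page for the Russian example in the original report (its first byte, 0xF2, looks like the start of a 4-byte UTF-8 sequence, which fits "position 0-3").

```python
import locale


def decode_argument(raw, encoding=None):
    """Hypothetical helper: turn one raw command-line argument into text.

    Text input is returned unchanged; bytes are decoded with the given
    encoding, or with the locale's preferred encoding if none is given.
    """
    if not isinstance(raw, bytes):
        return raw
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    return raw.decode(encoding)


# Assumption: the shell delivered the title as cp1251 bytes.
title_bytes = "текст заголовка".encode("cp1251")
title = decode_argument(title_bytes, "cp1251")

# Once the argument is text, it can be re-encoded as UTF-8 without errors.
utf8_bytes = title.encode("utf-8")
print(title)
```

The design point is to decode once, at the boundary where arguments enter the program, so that everything downstream handles text rather than bytes in an unknown encoding.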

I am 99.99% sure that if you use a Unicode-friendly shell / terminal program 
(try Cygwin for Windows), this problem will go away without using the patch.

Please report back here with details on how the patch or another shell worked. 
Or, for mega brownie points, try both. Thanks a lot!

Original comment by tom.h.mi...@gmail.com on 1 Sep 2010 at 8:01

GoogleCodeExporter commented 9 years ago
FYI, this patch should apply successfully to 0.9.9 and the version in the 
trunk. Not sure about <=0.9.8

Original comment by tom.h.mi...@gmail.com on 1 Sep 2010 at 8:01

GoogleCodeExporter commented 9 years ago
I applied the patch and now it takes unicode data with urxvt.

A slightly off-topic question: in Italian, sentences like "tomorrow at noon" or 
"monday at 9 pm" do not work (the event is created `now`), and neither do the local 
translations "domani a mezzogiorno" or "lunedì alle 21". Where can I find 
a reference for recognized words?

Original comment by neur...@gmail.com on 1 Sep 2010 at 9:40

GoogleCodeExporter commented 9 years ago
Now, after decoding user input you should encode output...

>>> sys.stdout.encoding
'ISO-8859-15'

Thank you

Original comment by neur...@gmail.com on 2 Sep 2010 at 8:09

GoogleCodeExporter commented 9 years ago
The problem with adding events in Italian seems to be rooted in the calendar 
service itself. See Issue 211. I'm not sure if there are any resources on 
quick-add (which is what GoogleCL uses) in languages other than English.

I've asked the python mailing list if it's safe to blanket-decode command line 
arguments. The more I think about it, the safer it seems. But yes, encoding 
output is the next step for this issue.
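
As a rough illustration of that next step (a sketch only, not the code that later shipped), output text can be encoded with the terminal's reported encoding, substituting characters the terminal cannot represent. The helper name `encode_for_terminal` is hypothetical, and the example is written for Python 3.

```python
import sys


def encode_for_terminal(text, encoding=None, errors="replace"):
    """Hypothetical helper: encode text for the terminal.

    Falls back to UTF-8 when stdout reports no encoding (for example when
    output is piped), and replaces unencodable characters with "?".
    """
    if encoding is None:
        encoding = getattr(sys.stdout, "encoding", None) or "utf-8"
    return text.encode(encoding, errors)


# ISO-8859-15 is the stdout encoding reported earlier in this thread.
latin9 = encode_for_terminal("lunedì alle 21", "iso-8859-15")

# Cyrillic has no representation in ISO-8859-15, so each letter becomes "?".
fallback = encode_for_terminal("текст", "iso-8859-15")
print(latin9, fallback)
```

Using `errors="replace"` trades fidelity for robustness: the listing stays readable instead of ending in a traceback.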

Thanks for reporting back!

Original comment by tom.h.mi...@gmail.com on 2 Sep 2010 at 1:28

GoogleCodeExporter commented 9 years ago
OK, I've read Issue 211, and it really does seem that quick add is disabled if the 
Calendar language is not English. I restored it and now, for example, "lunch with tony 
saturday at 1 pm" (or "at 13") works.

Thank you and keep up the good work!

Original comment by neur...@gmail.com on 2 Sep 2010 at 1:40

GoogleCodeExporter commented 9 years ago
Non-Latin input and output should be working in 0.9.10. Report back on this 
thread if you find any UnicodeEncodeErrors or UnicodeDecodeErrors.

Original comment by tom.h.mi...@gmail.com on 3 Sep 2010 at 9:33

GoogleCodeExporter commented 9 years ago
I still have problems with v0.9.10. I had the non-ASCII character ß in an 
appointment today, and because of this "google calendar today" fails with this 
output:

Traceback (most recent call last):
  File "/usr/bin/google", line 676, in <module>
    main()
  File "/usr/bin/google", line 662, in main
    run_once(options, args)
  File "/usr/bin/google", line 504, in run_once
    task.run(client, options, args)
  File "/usr/lib/pymodules/python2.6/googlecl/calendar/service.py", line 495, in _run_list_today
    _list(client, options, args, date)
  File "/usr/lib/pymodules/python2.6/googlecl/calendar/service.py", line 470, in _list
    delimiter=options.delimiter)
  File "/usr/lib/pymodules/python2.6/googlecl/base.py", line 603, in compile_entry_string
    return_string += val.replace(delimiter, ' ') + delimiter
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128)

Original comment by robin.ru...@gmail.com on 8 Sep 2010 at 1:07
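
In Python 2, combining a byte string with a unicode string makes the interpreter decode the bytes with the ascii codec, which is where the 'ascii' in this traceback comes from. The sketch below, written for Python 3, shows one way to avoid that by converting every value to text before joining. The names `to_text` and `join_fields` are hypothetical, `join_fields` is only a simplified stand-in for the loop at `googlecl/base.py` line 603, and the title "Fußball" is an invented example chosen because its ß starts at byte 2.

```python
def to_text(value, encoding="utf-8"):
    """Hypothetical helper: decode bytes to text, pass text through."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def join_fields(values, delimiter=","):
    """Simplified stand-in for the joining loop in compile_entry_string."""
    return delimiter.join(
        to_text(value).replace(delimiter, " ") for value in values
    )


# An invented event title: UTF-8 encodes "ß" as the bytes 0xC3 0x9F.
raw_title = "Fußball".encode("utf-8")

# The ascii codec rejects the first non-ASCII byte, at index 2.
try:
    raw_title.decode("ascii")
    ascii_error_position = None
except UnicodeDecodeError as exc:
    ascii_error_position = exc.start

# Mixed bytes and text join cleanly once everything is text.
row = join_fields([raw_title, "2010-09-08", "a,b"])
print(ascii_error_position, row)
```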

GoogleCodeExporter commented 9 years ago
Can you try the patch in Issue 279? That should clear up the issue.

Original comment by tom.h.mi...@gmail.com on 8 Sep 2010 at 1:20

GoogleCodeExporter commented 9 years ago
Issue 279 has been merged into this issue.

Original comment by tom.h.mi...@gmail.com on 8 Sep 2010 at 1:20

GoogleCodeExporter commented 9 years ago
Everyone: the patch in Issue 279, applied to 0.9.10, should solve most 
(hopefully all)  of these decode / encode errors. But if not, let me know 
through this issue.

Original comment by tom.h.mi...@gmail.com on 8 Sep 2010 at 1:33

GoogleCodeExporter commented 9 years ago
I've installed the 0.9.10 release on my ubuntu 10.04 and I'm still having 
encoding/decoding problems.
Note: My contacts' names contain accented characters.

Traceback (most recent call last):
  File "/usr/bin/google", line 681, in <module>
    main()
  File "/usr/bin/google", line 667, in main
    run_once(options, args)
  File "/usr/bin/google", line 509, in run_once
    task.run(client, options, args)
  File "/usr/lib/pymodules/python2.6/googlecl/contacts/base.py", line 203, in _run_list
    delimiter=options.delimiter)
  File "/usr/lib/pymodules/python2.6/googlecl/base.py", line 603, in compile_entry_string
    return_string += val.replace(delimiter, ' ') + delimiter
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

I've applied the provided patch but without success.

Original comment by jeremy.c...@gmail.com on 11 Sep 2010 at 2:32

GoogleCodeExporter commented 9 years ago
I also have the same problem.

$ google calendar list

[ludovic.rousseau@gmail.com]
Traceback (most recent call last):
  File "/usr/local/bin/google", line 5, in <module>
    pkg_resources.run_script('googlecl==0.9.10', 'google')
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/pkg_resources.py", line 442, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/pkg_resources.py", line 1167, in run_script
    exec script_code in namespace, namespace
  File "/Library/Python/2.6/site-packages/googlecl-0.9.10-py2.6.egg/EGG-INFO/scripts/google", line 676, in <module>
  File "/Library/Python/2.6/site-packages/googlecl-0.9.10-py2.6.egg/EGG-INFO/scripts/google", line 662, in main
  File "/Library/Python/2.6/site-packages/googlecl-0.9.10-py2.6.egg/EGG-INFO/scripts/google", line 504, in run_once
  File "build/bdist.macosx-10.6-universal/egg/googlecl/calendar/service.py", line 490, in _run_list
  File "build/bdist.macosx-10.6-universal/egg/googlecl/calendar/service.py", line 470, in _list
  File "build/bdist.macosx-10.6-universal/egg/googlecl/base.py", line 603, in compile_entry_string
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

I do not have the problem with googlecl-0.9.9. So it is a regression in 0.9.10 
for me.

The bug is triggered by an event named "Férié", which contains non-ASCII characters.

Original comment by ludovic....@gmail.com on 12 Sep 2010 at 6:30

GoogleCodeExporter commented 9 years ago
Jeremy and Ludovic, you've definitely, successfully, applied the patch in Issue 
279? Because there should be absolutely no mention of the ascii codec if the 
patch was applied.

Original comment by tom.h.mi...@gmail.com on 12 Sep 2010 at 7:11

GoogleCodeExporter commented 9 years ago
Issue 298 has been merged into this issue.

Original comment by tom.h.mi...@gmail.com on 28 Sep 2010 at 8:34

GoogleCodeExporter commented 9 years ago
Issue 305 has been merged into this issue.

Original comment by thmil...@google.com on 4 Oct 2010 at 7:46

GoogleCodeExporter commented 9 years ago
Alright, 0.9.11 should fix this issue for good (meaning, only a few strange 
edge cases left to cover). Let me know if problems keep cropping up even after 
the upgrade.

Original comment by thmil...@google.com on 9 Oct 2010 at 8:18

GoogleCodeExporter commented 9 years ago
I'm getting a UnicodeDecodeError with 0.9.11 on Ubuntu 10.04.1 LTS attempting 
to upload a file to Docs:

$ google docs upload --no-convert --folder "Netværk" pdf/contex-allerød.pdf 
No supported filetype found for extension pdf
Uploading as text/plain
Loading pdf/contex-allerød.pdf
Traceback (most recent call last):
  File "/usr/bin/google", line 812, in <module>
    main()
  File "/usr/bin/google", line 798, in main
    run_once(options, args)
  File "/usr/bin/google", line 577, in run_once
    task.run(client, options, args)
  File "/usr/lib/pymodules/python2.6/googlecl/docs/base.py", line 494, in _run_upload
    file_ext=options.format, convert=options.convert)
  File "/usr/lib/pymodules/python2.6/googlecl/docs/base.py", line 299, in upload_docs
    **kwargs)
  File "/usr/lib/pymodules/python2.6/googlecl/docs/service.py", line 345, in upload_single_doc
    converter=gdata.docs.DocumentListEntryFromString)
  File "/usr/lib/pymodules/python2.6/gdata/service.py", line 1146, in Post
    media_source=media_source, converter=converter)
  File "/usr/lib/pymodules/python2.6/gdata/service.py", line 1214, in PostOrPut
    multipart[2]], headers=extra_headers)
  File "/usr/lib/pymodules/python2.6/atom/service.py", line 175, in request
    data=data, headers=all_headers)
  File "/usr/lib/pymodules/python2.6/gdata/auth.py", line 845, in perform_request
    return http_client.request(operation, url, data=data, headers=headers)
  File "/usr/lib/pymodules/python2.6/atom/http.py", line 135, in request
    connection.endheaders()
  File "/usr/lib/python2.6/httplib.py", line 904, in endheaders
    self._send_output()
  File "/usr/lib/python2.6/httplib.py", line 776, in _send_output
    self.send(msg)
  File "/usr/lib/python2.6/httplib.py", line 755, in send
    self.sock.sendall(str)
  File "/usr/lib/python2.6/ssl.py", line 203, in sendall
    v = self.send(data[count:])
  File "/usr/lib/python2.6/ssl.py", line 94, in <lambda>
    self.send = lambda data, flags=0: SSLSocket.send(self, data, flags)
  File "/usr/lib/python2.6/ssl.py", line 174, in send
    v = self._sslobj.write(data)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf8' in position 302: ordinal not in range(128)

$ google --version
google 0.9.11

Original comment by tais.han...@gmail.com on 19 Oct 2010 at 12:01
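
The traceback ends with text containing u'\xf8' (ø) reaching the socket layer, where Python 2 tried to encode it with the ascii codec. As a hedged illustration of the general remedy, written for Python 3 and not a description of what any gdata release does, each part of a request can be encoded to bytes explicitly before it is sent. The helper `as_request_bytes` and the `<title>` fragment are hypothetical.

```python
def as_request_bytes(part, encoding="utf-8"):
    """Hypothetical helper: encode text parts, pass bytes through."""
    if isinstance(part, str):
        return part.encode(encoding)
    return part


# Invented metadata fragment carrying the file name from the report.
metadata = "<title>contex-allerød.pdf</title>"

# The ascii codec cannot represent "ø", which mirrors the reported error.
try:
    metadata.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding explicitly yields bytes that are safe to hand to a socket.
payload = b"".join(as_request_bytes(part) for part in [metadata, b"%PDF-1.4"])
print(ascii_failed, payload)
```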

GoogleCodeExporter commented 9 years ago
Well that's obnoxious. But it seems to be fixed by upgrading to gdata 2.0.12. 
Could you try upgrading and see if that works?

After you upgrade to 2.0.12, you'll have to run the command with --force-auth 
to reload the token from Google.

Original comment by tom.h.mi...@gmail.com on 19 Oct 2010 at 4:43

GoogleCodeExporter commented 9 years ago
I found gdata-2.0.8 released for Ubuntu Maverick a few days ago and rebuilt the 
package for Lucid. gdata-2.0.8 solves the issues I experienced.

Original comment by tais.han...@gmail.com on 19 Oct 2010 at 8:49