Open dyanos opened 10 years ago
@hoedic any thoughts here
I am not sure I understand where the MultipartFormdataEncoder class from the code snippet comes from...
In any case, the code in the _post_multipart function is the old one, which indeed does not support unicode. @dyanos, can you try the code from the last commit (https://github.com/okfn/ckanclient/commit/2cd7096f1f9b5aa859281c899d8d5eda821762b9)? Hopefully it will work in your case. If not, please post the error trace that you get.
@Hoedic should we be pushing a new release of ckanclient with your fixes in?
@Hoedic: I got the source code of "MultipartFormdataEncoder" from an answer to a Stack Overflow question (http://stackoverflow.com/questions/1270518/python-standard-library-to-post-multipart-form-data-encoded-data/1270548#1270548).
I tried the code from the last commit and got the following error message:
Traceback (most recent call last):
  File "sample.py", line 51, in <module>
    main()
  File "sample.py", line 40, in main
    location, tmp = ckan.upload_file(resource_info['@file'])
  File "C:\Users\Sim\Documents\Projects\regdb\regdb\ckanclient\__init__.py", line 584, in upload_file
    errcode, body = self._post_multipart(auth_dict['action'].encode('ascii'), fields, files)
  File "C:\Users\Sim\Documents\Projects\regdb\regdb\ckanclient\__init__.py", line 490, in _post_multipart
    content_type, body = self._encode_multipart_formdata(fields, files)
  File "C:\Users\Sim\Documents\Projects\regdb\regdb\ckanclient\__init__.py", line 537, in _encode_multipart_formdata
    body = CRLF.join(L)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xec in position 1285: ordinal not in range(128)
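For context, this UnicodeDecodeError is typical of Python 2 joining a list that mixes unicode objects with byte strings: the byte strings are implicitly decoded as ASCII, which fails on the first non-ASCII byte (0xec is one of the lead bytes of UTF-8 encoded Hangul). The sketch below imitates that implicit step explicitly so that it can be run on Python 3. The function name and the sample lines are hypothetical and are not taken from ckanclient.

```python
CRLF = "\r\n"

def join_like_python2(parts):
    """Join text and byte strings the way Python 2 did: bytes are decoded as ASCII first."""
    decoded = []
    for part in parts:
        if isinstance(part, bytes):
            part = part.decode("ascii")  # raises UnicodeDecodeError on bytes >= 0x80
        decoded.append(part)
    return CRLF.join(decoded)

# Hypothetical multipart lines: a text header plus UTF-8 encoded Korean content.
lines = ['Content-Disposition: form-data; name="file"', "\uc804\uccb4".encode("utf-8")]

message = None
try:
    join_like_python2(lines)
except UnicodeDecodeError as exc:
    message = str(exc)
print(message)
```

Pure ASCII byte strings pass through this step unchanged, which may be why the problem only shows up with documents that contain non-ASCII text.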
@dyanos: did you pull the latest master branch? Since the last pull request (15 days ago), the _encode_multipart_formdata function has been removed and the _post_multipart function uses pycurl to build the message: https://github.com/okfn/ckanclient/blob/master/ckanclient/__init__.py#L479
@rgrp: Before pushing a new version, it would be great to have a few more people testing this code. On top of that, I hope to integrate and test the python-requests lib around the end of the year.
@Hoedic: I'm sorry, I was using the code from the wrong branch. I fixed that and retried, but I got the following error message:
Traceback (most recent call last):
  File "sample.py", line 51, in <module>
    main()
  File "sample.py", line 40, in main
    location, tmp = ckan.upload_file(resource_info['@file'])
  File "/home/dyanos/rdf/ckanclient/__init__.py", line 570, in upload_file
    errcode, body = self._post_multipart(auth_dict['action'].encode('ascii'), fields, files)
  File "/home/dyanos/rdf/ckanclient/__init__.py", line 502, in _post_multipart
    c.setopt(c.URL, url)
TypeError: invalid arguments to setopt
I know this message occurs when the 'url' variable holds a unicode string. So I converted it to a non-unicode string (using str() for testing), and then got the following message:
Traceback (most recent call last):
  File "sample.py", line 51, in <module>
    main()
  File "sample.py", line 40, in main
    location, tmp = ckan.upload_file(resource_info['@file'])
  File "/home/dyanos/rdf/ckanclient/__init__.py", line 570, in upload_file
    errcode, body = self._post_multipart(auth_dict['action'].encode('ascii'), fields, files)
  File "/home/dyanos/rdf/ckanclient/__init__.py", line 508, in _post_multipart
    'Accept-Encoding: identity'
TypeError: list items must be string objects
I'm handling unicode strings in my Python source code and documents, and my Linux 'LANG' variable is 'en_US.UTF-8'. Could these be related?
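Both TypeErrors above look consistent with pycurl on Python 2 rejecting unicode objects: setopt appears to expect byte strings for the URL and for every item of the header list. One possible workaround is to coerce the values before handing them to pycurl. The sketch below assumes that approach; to_bytes and prepare_curl_args are hypothetical helpers, not part of ckanclient or pycurl, and the sketch does not call pycurl itself.

```python
def to_bytes(value, encoding="utf-8"):
    """Return value as a byte string, encoding text if necessary."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)

def prepare_curl_args(url, headers):
    """Coerce a URL and a list of header lines to ASCII byte strings."""
    return to_bytes(url, "ascii"), [to_bytes(h, "ascii") for h in headers]

# Values taken from the thread; both start out as text (unicode) strings.
url, headers = prepare_curl_args(
    u"http://data.datahub.kr/storage/upload_handle",
    [u"Accept-Encoding: identity"],
)
print(type(url).__name__, [type(h).__name__ for h in headers])
```

Encoding with "ascii" fails loudly if a URL or header unexpectedly contains non-ASCII characters, which is usually preferable to sending a silently mangled request.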
Well, we are making progress: we have a new error message!
My piece of code forces the url to ASCII encoding (auth_dict['action'].encode('ascii')) and, surprisingly, that does not seem to be the issue. However, it really seems that the type of the url is incorrect. Can you try to print the url value before it is used at line 502? Or is your code available somewhere so that I can have a look?
Hi @Hoedic, sorry for the late reply. The printed value of url is:
http://data.datahub.kr/storage/upload_handle
Do you have the actual code calling the CKAN client? I see a sample.py, can I see this code? Or at least know what (type, value) is passed to the upload function: resource_info['@file']
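To report the (type, value) pair asked for here, something like the following could be dropped in just before the failing call. describe is a hypothetical helper, not part of ckanclient; on Python 2 a unicode URL would be reported with type=unicode.

```python
def describe(name, value):
    """Return a one-line summary of a variable's type and repr."""
    return "%s: type=%s value=%r" % (name, type(value).__name__, value)

print(describe("url", "http://data.datahub.kr/storage/upload_handle"))
```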
@Hoedic : I uploaded the source code of 'sample.py' here.
import ckanclient
import os,string,sys,json

# http://chanik.egloos.com/3685653
def usage():
    print "%s <json file of description of package>" % (sys.argv[0])
    print
    print "Reference Site: "
    print "The format to register a package : <http://docs.ckan.org/en/latest/api.html#ckan.logic.action.create.package_create>"
    print "The format to register a package resource : <http://docs.ckan.org/en/latest/api.html#ckan.logic.action.create.resource_create>"
    sys.exit(-1)

def filtering(raw_data):
    data = {}
    for key in filter(lambda x: not x.startswith('@'), raw_data.keys()):
        data[key] = raw_data[key]
    print data
    return data

def main():
    configFilename = sys.argv[1]
    if not os.path.exists(configFilename):
        print "file not exists : %s" % (configFilename)
        sys.exit(-1)
    data = json.loads(open(configFilename).read())
    ckan = ckanclient.CkanClient(base_location=data['endpoint'], api_key=data['api_key'])
    for package in data['packages']:
        print "-"*80
        properties = filtering(package)
        ckan.package_register_post(properties)
        for resource_info in package['@resource']:
            resource_properties = filtering(resource_info)
            location = ''
            if resource_info.has_key('@file'):
                location, tmp = ckan.upload_file(resource_info['@file'])
            elif resource_info.has_key('@url'):
                location = resource_info['@url']
            print location
            ckan.add_package_resource(package_name=properties['name'], file_path_or_url='http://data.datahub.kr'+location, **resource_properties)

if __name__ == '__main__':
    if len(sys.argv) == 1:
        usage()
    main()
resource_info['@file'] is Subway-Line-0523.rdf, and at line 199 of __init__.py the url is http://data.datahub.kr/api/storage/auth/form/2013-12-11T100031/Subway-Line-0523.rdf.
This is the full output, including the error message:
{u'maintainer': u'OKFN Korea', u'name': u'korea-street-name-code2', u'author': u'OKFN Korea', u'author_email': u'okfn.korea@gmail.com', u'notes': u'\ub300\ud55c\ubbfc\uad6d \ub3c4\ub85c\uba85 \ucf54\ub4dc \uc628\ud1a8\ub85c\uc9c0 \ub370\uc774\ud130', u'title': u'\ub300\ud55c\ubbfc\uad6d \ub3c4\ub85c\uba85 \ucf54\ub4dc \ub370\uc774\ud130(test)', u'maintainer_email': u'okfn.korea@gmail.com'}
http://data.datahub.kr/api/rest/package
{u'name': u'\uc804\uccb4 \ub3c4\ub85c\uba85 \ucf54\ub4dc \ub370\uc774\ud130', u'license': u'CC0', u'created': u'2013-11-10 16:00:00', u'format': u'rdf', u'resource_type': u'data', u'description': u'\ub300\ud55c\ubbfc\uad6d \ub3c4\ub85c\uba85 \ucf54\ub4dc \uc628\ud1a8\ub85c\uc9c0 \ub370\uc774\ud130'}
Subway-Line-0523.rdf
http://data.datahub.kr/api/storage/auth/form/2013-12-11T100031/Subway-Line-0523.rdf
http://data.datahub.kr/storage/upload_handle
Traceback (most recent call last):
  File "sample.py", line 52, in <module>
    main()
  File "sample.py", line 41, in main
    location, tmp = ckan.upload_file(resource_info['@file'])
  File "/home/dyanos/rdf/ckanclient/__init__.py", line 572, in upload_file
    errcode, body = self._post_multipart(auth_dict['action'].encode('ascii'), fields, files)
  File "/home/dyanos/rdf/ckanclient/__init__.py", line 504, in _post_multipart
    c.setopt(c.URL, url)
TypeError: invalid arguments to setopt
In the ckanclient code (_encode_multipart_formdata in __init__.py), building the multipart-form body fails for documents that contain unicode characters: Python raises an error while converting the unicode document to ASCII to build the body.
To handle documents that contain unicode characters, the multipart-form code needs to be modified so that it can process unicode.
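One way to make the multipart code unicode-safe is to build the body entirely from byte strings, encoding every text value as UTF-8 before joining. The following is a minimal Python 3 sketch under that assumption, not the ckanclient implementation: the function signature, the boundary handling, and the fixed application/octet-stream content type are all illustrative choices, and non-ASCII filenames are handled only naively. The Korean sample strings are taken from the output shown earlier in this thread.

```python
import uuid

CRLF = b"\r\n"

def encode_multipart_formdata(fields, files, boundary=None):
    """Build a multipart/form-data body as bytes.

    fields: iterable of (name, value) text pairs.
    files: iterable of (name, filename, content), where content is bytes.
    Returns (content_type, body).
    """
    if boundary is None:
        boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    parts = []
    for name, value in fields:
        header = 'Content-Disposition: form-data; name="%s"' % name
        parts += [delimiter, header.encode("utf-8"), b"", value.encode("utf-8")]
    for name, filename, content in files:
        header = 'Content-Disposition: form-data; name="%s"; filename="%s"' % (name, filename)
        parts += [
            delimiter,
            header.encode("utf-8"),
            b"Content-Type: application/octet-stream",
            b"",
            content,  # already bytes, so no implicit decoding happens
        ]
    parts += [delimiter + b"--", b""]
    body = CRLF.join(parts)  # every item is bytes, so the join cannot raise a codec error
    content_type = "multipart/form-data; boundary=%s" % boundary
    return content_type, body

content_type, body = encode_multipart_formdata(
    [("title", "\uc804\uccb4 \ub3c4\ub85c\uba85 \ucf54\ub4dc \ub370\uc774\ud130")],
    [("file", "Subway-Line-0523.rdf", "\ub300\ud55c\ubbfc\uad6d".encode("utf-8"))],
    boundary="example-boundary",
)
print(content_type, len(body))
```

Keeping text and bytes strictly separate until the final join is the design choice that matters here; the same idea applies on Python 2 by encoding unicode objects to str before building the list.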
ckanclient/__init__.py