osiell / odoorpc

/!\ WARNING /!\ ODOORPC MOVED TO https://github.com/OCA/odoorpc/
GNU Lesser General Public License v3.0

Issues sending binary data to the server #23

Closed The-Compiler closed 8 years ago

The-Compiler commented 8 years ago

As described in #20/#21, I'm having some issues importing data into a binary field for the base_import.import model. Unfortunately I'm out of ideas, so I'd really appreciate some help here. I'm still not sure if this is something I'm doing wrong, an OdooRPC bug, or even an Odoo server bug.

Test script

I wrote this little test script which tries to import ascii data (which works) and to import non-ascii data in various ways.

All those examples are Python 3, i.e. 'tést' is a python3 str (python2 unicode) and .encode() gives us a python3 bytes.

test 2: utf-8 data as plaintext (bytes)

This tries to send 'id,name\nimport_test.test2,tést'.encode('utf-8') which causes:

Traceback (most recent call last):
  File "encoding_test.py", line 44, in main
    create_id(env, 'id,name\nimport_test.test2,tést'.encode('utf-8'))
  File "encoding_test.py", line 15, in create_id
    import_id = env.create(payload)
  File "/home/florian/proj/odoo/.venv/lib/python3.4/site-packages/odoorpc/models.py", line 72, in rpc_method
    cls._name, method, args, kwargs)
  File "/home/florian/proj/odoo/.venv/lib/python3.4/site-packages/odoorpc/odoo.py", line 468, in execute_kw
    'args': args_to_send})
  File "/home/florian/proj/odoo/.venv/lib/python3.4/site-packages/odoorpc/odoo.py", line 263, in json
    data = self._connector.proxy_json(url, params)
  File "/home/florian/proj/odoo/.venv/lib/python3.4/site-packages/odoorpc/rpc/jsonrpclib.py", line 84, in __call__
    "id": random.randint(0, 1000000000),
  File "/usr/lib/python3.4/json/__init__.py", line 230, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python3.4/json/encoder.py", line 192, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.4/json/encoder.py", line 250, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.4/json/encoder.py", line 173, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: b'id,name\nimport_test.test2,t\xc3\xa9st' is not JSON serializable
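
The client-side failure can be reproduced without a server, since it comes from the standard json module rather than from Odoo. A minimal sketch (the {'file': ...} dict is only a hypothetical stand-in for whatever OdooRPC actually serializes):

```python
import json

csv_text = 'id,name\nimport_test.test2,tést'


def try_dump(value):
    """Return the JSON text, or the TypeError message if serialization fails."""
    try:
        return json.dumps({'file': value})
    except TypeError as exc:
        return 'TypeError: {}'.format(exc)


# bytes cannot be represented in JSON, so this reports a TypeError ...
bytes_result = try_dump(csv_text.encode('utf-8'))
# ... while a str (Python 3 unicode string) serializes fine.
str_result = try_dump(csv_text)

print(bytes_result)
print(str_result)
```

So any value sent through the JSON-RPC connector has to be a str by the time it reaches json.dumps.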

test 3: utf-8 data as plaintext (string)

This tries to simply send a unicode object (python3 str), which fails on the server:

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/openerp/http.py", line 537, in _handle_exception
    return super(JsonRequest, self)._handle_exception(exception)
  File "/usr/lib/python2.7/dist-packages/openerp/http.py", line 574, in dispatch
    result = self._call_function(**self.params)
  File "/usr/lib/python2.7/dist-packages/openerp/http.py", line 310, in _call_function
    return checked_call(self.db, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/openerp/service/model.py", line 113, in wrapper
    return f(dbname, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/openerp/http.py", line 307, in checked_call
    return self.endpoint(*a, **kw)
  File "/usr/lib/python2.7/dist-packages/openerp/http.py", line 803, in __call__
    return self.method(*args, **kw)
  File "/usr/lib/python2.7/dist-packages/openerp/http.py", line 403, in response_wrap
    response = f(*args, **kw)
  File "/usr/lib/python2.7/dist-packages/openerp/http.py", line 1595, in jsonrpc
    return dispatch_rpc(service, method, args)
  File "/usr/lib/python2.7/dist-packages/openerp/http.py", line 115, in dispatch_rpc
    result = dispatch(method, params)
  File "/usr/lib/python2.7/dist-packages/openerp/service/model.py", line 37, in dispatch
    res = fn(db, uid, *params)
  File "/usr/lib/python2.7/dist-packages/openerp/service/model.py", line 162, in execute_kw
    return execute(db, uid, obj, method, *args, **kw or {})
  File "/usr/lib/python2.7/dist-packages/openerp/service/model.py", line 113, in wrapper
    return f(dbname, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/openerp/service/model.py", line 170, in execute
    res = execute_cr(cr, uid, obj, method, *args, **kw)
  File "/usr/lib/python2.7/dist-packages/openerp/service/model.py", line 159, in execute_cr
    return getattr(object, method)(cr, uid, *args, **kw)
  File "/usr/lib/python2.7/dist-packages/openerp/api.py", line 241, in wrapper
    return old_api(self, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/openerp/api.py", line 336, in old_api
    result = method(recs, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/openerp/models.py", line 4077, in create
    record = self.browse(self._create(old_vals))
  File "/usr/lib/python2.7/dist-packages/openerp/api.py", line 239, in wrapper
    return new_api(self, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/openerp/api.py", line 463, in new_api
    result = method(self._model, cr, uid, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/openerp/models.py", line 4176, in _create
    updates.append((field, '%s', current_field._symbol_set[1](vals[field])))
  File "/usr/lib/python2.7/dist-packages/openerp/osv/fields.py", line 602, in <lambda>
    _symbol_f = lambda symb: symb and Binary(str(symb)) or None
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 27: ordinal not in range(128)

I investigated the source where the error occurs, where there's this comment:

Binary values may be byte strings (python 2.6 byte array), but the legacy OpenERP convention is to transfer and store binaries as base64-encoded strings. The base64 string may be provided as a unicode in some circumstances, hence the str() cast in symbol_f. This str coercion will only work for pure ASCII unicode strings, on purpose - non base64 data must be passed as a 8bit byte strings.
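
In Python 3 terms, the Python 2 str(symb) call on a unicode value behaves like an implicit ASCII encode. The sketch below only emulates that behaviour under this assumption (py2_str_coercion is a hypothetical helper, not server code):

```python
csv_text = 'id,name\nimport_test.test3,tést'


def py2_str_coercion(value):
    """Emulate Python 2's str(unicode_value), i.e. an implicit ASCII encode."""
    return value.encode('ascii')


try:
    py2_str_coercion(csv_text)
    failed_at = None
except UnicodeEncodeError as exc:
    failed_at = exc.start  # index of the first character that is not ASCII

print(failed_at)

# Pure ASCII text (which includes any base64 string) passes through unchanged.
ascii_ok = py2_str_coercion('id,name\nimport_test.test1,test')
```

The reported index is the position of the 'é', the same position the server traceback complains about.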

So I tried whether base64 would work:

test 4: utf-8 as encoded base64

This tries to send base64.b64encode('id,name\nimport_test.test4,tést'.encode('utf-8')) which fails on the client, like with test 2:

Traceback (most recent call last):
  [...]
  File "/usr/lib/python3.4/json/encoder.py", line 173, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: b'aWQsbmFtZQppbXBvcnRfdGVzdC50ZXN0NCx0w6lzdA==' is not JSON serializable

test 5: utf-8 as base64 str

This seems to work, but it looks like the server actually stores the base64 as data in the database, which causes this response when parse_preview is called:

{'error': 'CSV file seems to have no content', 'preview': 'aWQsbmFtZQppbXBvcnRfdGVzdC50ZXN0NSx0w6lzdA=='}
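
Decoding the preview value from that response suggests that the base64 text itself is what gets parsed as CSV, not the decoded file. A quick check using only the standard library:

```python
import base64

# The 'preview' value copied from the server response above.
preview = 'aWQsbmFtZQppbXBvcnRfdGVzdC50ZXN0NSx0w6lzdA=='

# Undo the base64 layer, then interpret the bytes as UTF-8 text.
decoded = base64.b64decode(preview).decode('utf-8')
print(repr(decoded))
```

The decoded text is the CSV that was meant to be imported, so the data arrived intact but one encoding layer too deep.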

So if I understand correctly, base_import.import doesn't expect its data as base64, but there's no way to transfer non-ASCII data via the JSON-RPC API? Is there something I'm missing?

I also tried adjusting the script and running it with Python 2, with the same outcome.

sebalix commented 8 years ago

I'm not an expert in encoding/decoding, but I think the line in your second test is wrong (the following code is in Python 2):

>>> 'id,name\nimport_test.test2,tést'.encode('utf-8')
# UnicodeDecodeError

The string is not a unicode one, and you state it is encoded in UTF-8. Either the input string is unicode (u'...'):

>>> u'id,name\nimport_test.test2,tést'.encode('utf-8')
'id,name\nimport_test.test2,t\xc3\xa9st'

Or the input string is a UTF-8 one in a bytes object, and you need to decode it:

>>> 'id,name\nimport_test.test2,tést'.decode('utf-8')
u'id,name\nimport_test.test2,t\xe9st'

EDIT: My bad, didn't see you were using Python 3. I need to do some real tests to check what is wrong.

sebalix commented 8 years ago

Have you tested the import through the Web interface and traced the JSON-RPC request parameters? If the Web client works, then the bug is in OdooRPC. There are some helper functions in the code to decode/encode parameters depending on the Python version used; maybe a bug is there...

The-Compiler commented 8 years ago

In Python 3, 'foo' is a unicode string (like u'foo' in python 2) and b'foo' is a byte string (like 'foo' in Python 2), so I think that's fine.
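
To make the distinction concrete, here is a small Python 3 sketch of the two types and how they convert into each other:

```python
text = 'tést'               # str: a sequence of Unicode code points
raw = text.encode('utf-8')  # bytes: the UTF-8 encoding of that text

# 'é' is one code point but two bytes in UTF-8.
print(len(text), len(raw))  # prints: 4 5

# Decoding with the same codec restores the original text.
roundtrip = raw.decode('utf-8')
```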

It seems like the web interface doesn't actually use the JSON-RPC API to upload the file, but does a multipart/form-data POST to /base_import/set_file with something like this:

------WebKitFormBoundary1k67DLRI4R12FQg3
Content-Disposition: form-data; name="session_id"

d69...37b
------WebKitFormBoundary1k67DLRI4R12FQg3
Content-Disposition: form-data; name="import_id"

80
------WebKitFormBoundary1k67DLRI4R12FQg3
Content-Disposition: form-data; name="file"; filename="product.template.csv"
Content-Type: text/csv

id,default_code
GSD_Artikel_64768,PRT.146D.0000

------WebKitFormBoundary1k67DLRI4R12FQg3
Content-Disposition: form-data; name="jsonp"

import_callback_12
------WebKitFormBoundary1k67DLRI4R12FQg3--

(I haven't tried with a non-ascii char yet, but I guess it'll be very similar)

Then it uses JSON-RPC to call parse_preview.

I'm guessing OdooRPC doesn't have anything which helps me with using that API?

sebalix commented 8 years ago

Normally you should be able to reproduce all requests done by the Web client with the low level methods ODOO.json and ODOO.http (to make the upload with the appropriate HTTP headers). Of course, a solution based on JSON-RPC with base_import.import would be better.

The-Compiler commented 8 years ago

It was harder than I hoped it would be, but I finally have a running prototype:

# encoding: utf-8

import io
import uuid

import odoorpc
import odoorpc.error

import email.generator
import email.mime.multipart
import email.message

def create_id(env):
    payload = {
        'res_model': 'product.product',
    }
    import_id = env.create(payload)
    assert isinstance(import_id, int)
    return import_id

def upload_file(odoo, import_id, data):
    login_data = odoo.json(
        '/web/session/authenticate',
        {'db': 'beh', 'login': odoo._login, 'password': odoo._password}
    )
    session_id = login_data['result']['session_id']

    boundary = '----odoo-import-{}'.format(uuid.uuid4())
    mime_msg = email.mime.multipart.MIMEMultipart(boundary=boundary)

    sess_id_msg = email.message.Message()
    sess_id_msg.set_payload(str(session_id))
    sess_id_msg.add_header('Content-Disposition', 'form-data',
                           name='session_id')
    mime_msg.attach(sess_id_msg)

    import_id_msg = email.message.Message()
    import_id_msg.set_payload(str(import_id))
    import_id_msg.add_header('Content-Disposition', 'form-data',
                             name='import_id')
    mime_msg.attach(import_id_msg)

    file_msg = email.message.Message()
    file_msg.set_payload(data)
    file_msg.add_header('Content-Disposition', 'form-data', name='file',
                        filename='test.csv')
    file_msg.add_header('Content-Type', 'text/csv')
    mime_msg.attach(file_msg)

    outio = io.StringIO()
    generator = email.generator.Generator(outio)
    generator.flatten(mime_msg)

    msg = outio.getvalue()
    msg = '\n\n'.join(msg.split('\n\n')[1:])  # Remove headers

    headers = {
        'Content-Type': 'multipart/form-data; boundary="{}"'.format(boundary),
        'MIME-Version': '1.0',
    }
    odoo.http('base_import/set_file', data=msg.encode('utf-8'),
              headers=headers)

def main():
    odoo = odoorpc.ODOO(...)
    odoo.login(...)
    env = odoo.env['base_import.import']

    options = {
        'quoting': '"',
        'separator': ',',
        'encoding': 'utf-8',
        'headers': True
    }
    data = "id,default_code\nGSD_Artikel_64768,tést"

    import_id = create_id(env)
    upload_file(odoo, import_id, data)

    preview = env.parse_preview(import_id, options=options)
    if 'error' in preview:
        raise Exception(preview)

if __name__ == '__main__':
    main()

Unfortunately I currently don't have the time to integrate this feature into OdooRPC properly, sorry...

But it seems it's an Odoo thing after all, as the browser uses the JSON API to get the import ID but supplies the file like this.

Thanks for all your help!

sebalix commented 8 years ago

Thank you for this piece of code. Indeed, if I can't find a way to use the JSON-RPC API, maybe I could add a method to import a CSV file, and your code will help to achieve that for sure.

mistotebe commented 8 years ago

The way to use the API to import data is as follows (however I've yet to successfully load non-ASCII data):

model, filename = ...
fields = [...]
options = {
    'quoting': '"',
    'separator': ',',
    'encoding': 'utf-8',
    'headers': True
}
client.config['timeout'] = 1200 # for large files
bi = client.env['base_import.import']
with open(filename) as f:
    id = bi.create({'res_model': model, 'file_name': filename, 'file_type': 'text/csv', 'file': f.read()})
    bi.do(id, fields=fields, options=options)

The-Compiler commented 8 years ago

@mistotebe using the API that way with non-ASCII data doesn't work, as outlined above. You'll need to post the data as multipart/form-data (like the browser does as well), see my snippet above.

mistotebe commented 8 years ago

Are there any plans to port this to python-requests? That might make this much easier to implement (one would still need to obtain the CSRF token for 9.0 somehow; the admin user has access to enough data to recreate it, but no one else does).

The-Compiler commented 8 years ago

Not from my side - I had various funny problems at first, so I wanted to be sure I can recreate exactly the multipart/form-data payload the browser sends via JS, byte for byte.

I agree it'd be better to use requests if that works as well, but I'm afraid I don't have the time to do so (and I have a solution which works).

mistotebe commented 8 years ago

Yeah, I've done an import recently where I needed that and doing what you propose above with requests is way easier:

import requests

# get sid and csrf from element with class "oe_import" on import page
sid = ...
csrf_token = ...

# open the file in a with block so the handle is closed after the upload
with open(filename, 'rb') as f:
    r = requests.post(url + 'base_import/set_file',
                      files={'file': (filename, f)},
                      data={'csrf_token': csrf_token, 'import_id': import_id},
                      cookies={'session_id': sid})

KurtHaselwimmer commented 8 years ago

In v9 the base_import/set_file endpoint seems to require a valid CSRF token. As a test I tried copying a CSRF token and session_id from a browser upload page and pasting them into a multipart/form-data upload POST submission as shown above. Whatever I try, I get a bad CSRF error. Is there a way to avoid the CSRF token requirement, e.g. by supplying the normal login parameters that you would for any normal XML-RPC/JSON-RPC request, or to get a CSRF token issued with a programmatic call?

CGenie commented 8 years ago

CSRF tokens are specifically designed to work with the frontend and not with backend RPC calls (they require a form to be rendered beforehand). Basically, XML-RPC is the way to go here for uploading files. Try openerp-client-lib, for example. When sending the file, wrap it in xmlrpclib.Binary (https://docs.python.org/2/library/xmlrpclib.html#xmlrpclib.Binary) -- this will handle base64 encoding/decoding transparently. One drawback of XML-RPC here is that the parse_preview method won't work -- it returns a dict whose keys are integers, which isn't supported by XML-RPC. So for previews, one has to use JSON-RPC :)

Sample code:

    xml_odoo = openerplib.get_connection(
        hostname=ODOO_SERVER['hostname'],
        port=ODOO_SERVER['port'],
        database=ODOO_SERVER['database'],
        login=ODOO_SERVER['login'],
        password=ODOO_SERVER['password']
    )

    imp_xml_obj = xml_odoo.get_model('base_import.import')
    imp_obj = odoo.env['base_import.import']

    c = codecs.open(fpath, 'rb', 'utf-8').read().encode('utf-8')

    imp_id = imp_xml_obj.create({
        'res_model': res_model,
        'file': xmlrpclib.Binary(c),
        'file_type': 'text/csv',
        'file_name': file_name,
    })
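
Both points can be checked offline with Python 3's xmlrpc.client, the successor of xmlrpclib. This is only a sketch of the marshalling behaviour: no server is contacted, and the int_keys_supported helper is hypothetical. The second half assumes the parse_preview limitation comes from XML-RPC structs only allowing string keys.

```python
import xmlrpc.client

csv_bytes = 'id,name\nimport_test.test6,tést'.encode('utf-8')

# A Binary value is written into the XML payload as a <base64> element ...
payload = xmlrpc.client.dumps((xmlrpc.client.Binary(csv_bytes),))
# ... and is decoded back into the original bytes when the payload is loaded.
params, _method = xmlrpc.client.loads(payload)
restored = params[0].data


def int_keys_supported():
    """Check whether a dict with integer keys can be marshalled as XML-RPC."""
    try:
        xmlrpc.client.dumps(({0: 'id'},))
        return True
    except TypeError:
        return False


print('<base64>' in payload, restored == csv_bytes, int_keys_supported())
```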

martintamare commented 7 years ago

I'm trying to upload an ir.attachment. Has anyone succeeded in doing that?

Currently I'm trying to post using http ('/web/binary/upload_attachment') but I'm unable to get it to work properly.

sebalix commented 7 years ago

@martintamare You should be able to do that directly with the create method on ir.attachment.

attachment_model = odoo.env['ir.attachment']
attachment_id = attachment_model.create({'datas': B64_ENCODED_VALUE, ...})

martintamare commented 7 years ago

This was my first guess, but

  File "/usr/local/lib/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'bytes' is not JSON serializable

I encoded my data using :

data = some_file_handler.read()
base64.b64encode(data)

martintamare commented 7 years ago

OK, fixed by calling .decode('utf-8') on the b64encode() result afterwards, so that a str is sent instead of bytes. I hate dealing with encoding stuff :)

Thanks for the quick reply !
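
For later readers, the fix boils down to sending the base64 data as text rather than bytes. A minimal sketch, assuming the values go through JSON-RPC as in the create() call above (attachment_values is a hypothetical helper, and the 'name' key is an assumption about which other fields are set):

```python
import base64
import json


def attachment_values(name, data):
    """Build ir.attachment values with 'datas' as base64 text (str, not bytes)."""
    return {
        'name': name,
        # b64encode returns bytes; base64 output is pure ASCII, so decoding is safe.
        'datas': base64.b64encode(data).decode('ascii'),
    }


values = attachment_values('test.csv', 'id,name\n1,tést'.encode('utf-8'))
body = json.dumps(values)  # no TypeError, since every value is a str
print(body)
```

The resulting dict could then be passed to attachment_model.create(values).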