robocorp / rpaframework

Collection of open-source libraries and tools for Robotic Process Automation (RPA), designed to be used with both Robot Framework and Python
https://www.rpaframework.org/
Apache License 2.0

RPA.HTTP support for POST that contain files and data #533

Closed orlof closed 2 years ago

orlof commented 2 years ago

I was trying to convert the following Python requests code to RPA.HTTP, but I was not able to combine a streaming file upload with "form-data". It could be a missing feature in RPA.HTTP, or just the lack of a proper example.

import requests
from requests.auth import HTTPBasicAuth

def jobs(filename):
    with open(filename, "rb") as f:
        return requests.post(
            "https://sandbox.zamzar.com/v1/jobs", 
            data={"target_format": "txt"}, 
            files={"source_file": f}, 
            auth=HTTPBasicAuth("ADD_HERE_THE_BASIC_AUTH_STUFF", '')).json()

Here is a link to API docs

mikahanninen commented 2 years ago

This example is from the Robot Framework Requests tests.

Have you tried something like this?

Post Request With Data and File
    [Tags]    post
    &{data}=    Create Dictionary    name=mallikarjunarao    surname=kosuri
    Create File    foobar.txt    content=foobar
    ${file_data}=    Get File    foobar.txt
    &{files}=    Create Dictionary    file=${file_data}
    ${resp}=    Post Request    ${test_session}    /anything    files=${files}    data=${data}
    Should Be Equal As Strings    ${resp.status_code}    200

orlof commented 2 years ago

That syntax creates a request that combines data and files, but it is not streaming (NICE TO HAVE), and the request's file part lacks the expected name (MUST) and filename (NICE TO HAVE) attributes.

Here is the beginning of the file part as produced by the Python requests library:

Content-Disposition: form-data; name="source_file"; filename="test.txt"

...and here it is with the RF syntax:

Content-Disposition: form-data; name="file"; filename="file"

There is probably some way to inject these attributes with the RF syntax as well, but I am wondering why the robotframework-requests library modifies the functionality of python-requests instead of just providing the same functionality.
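
For reference, the difference in the two headers above can be reproduced with plain requests, without sending anything. The sketch below only prepares the request and reads the file part's Content-Disposition line. The URL is a placeholder, and the in-memory objects are stand-ins for a real file (assumptions, not taken from the original report). It suggests that the filename falls back to the field name when the value is plain content (which is what Get File returns), and that a (filename, content) tuple sets it explicitly.

```python
import io
import requests

URL = "https://example.invalid/upload"  # placeholder; the request is never sent


def file_part_header(files):
    """Prepare a multipart POST and return the file part's Content-Disposition line."""
    prepared = requests.Request(
        "POST", URL, data={"target_format": "txt"}, files=files
    ).prepare()
    for line in prepared.body.split(b"\r\n"):
        if line.startswith(b"Content-Disposition") and b"filename=" in line:
            return line.decode()
    return None


# 1) Plain content, as returned by "Get File": no name to guess a filename from,
#    so requests falls back to the field name.
from_content = file_part_header({"file": "foobar"})

# 2) A file-like object with a .name attribute (stand-in for open("test.txt", "rb")).
fake_file = io.BytesIO(b"foobar")
fake_file.name = "test.txt"
from_file_object = file_part_header({"source_file": fake_file})

# 3) An explicit (filename, content) tuple.
from_tuple = file_part_header({"source_file": ("test.txt", b"foobar")})

print(from_content)
print(from_file_object)
print(from_tuple)
```

In Robot Framework terms, this would mean passing either a file object or a filename/content pair as the dictionary value instead of the file's text content. Whether a given keyword forwards such values unchanged is something to verify against the library version in use.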

orlof commented 2 years ago

I expected this to work:

This should work
    @{creds}=  Create List  1234567890123456789012345678901234567890  ${EMPTY}
    Create Session  zamzar  https://sandbox.zamzar.com/v1/  auth=${creds}

    &{data}=  Create Dictionary  target_format=txt

    ${file}=  Get File For Streaming Upload  test.txt
    &{files}=  Create Dictionary  source_file=${file}
    ${resp}=  Post On Session  zamzar  jobs  files=${files}  data=${data}

...but it didn't. In this case robotframework-requests seems to complicate things instead of offering the python-requests API for RF syntax.

cmin764 commented 2 years ago

To add to the problem, here's how requests recommends streaming files through POST: POST Multiple Multipart-Encoded Files

But that's required only if you need more control than the traditional simple {<file_name>: <file_object>} dictionary offers.
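
A minimal sketch of that more explicit form, using plain requests: a list of (field_name, (filename, file_object, content_type)) tuples. The URL, filenames, and in-memory files are hypothetical placeholders, and the request is only prepared, not sent.

```python
import io
import requests

# Hypothetical in-memory stand-ins for files opened with open(path, "rb").
first = io.BytesIO(b"first file")
second = io.BytesIO(b"second file")

# Each entry is (field_name, (filename, file_object, content_type)).
# A list is used instead of a dict so the same field name can appear twice.
files = [
    ("source_file", ("a.txt", first, "text/plain")),
    ("source_file", ("b.txt", second, "text/plain")),
]

prepared = requests.Request(
    "POST", "https://example.invalid/upload", files=files
).prepare()

# Collect the Content-Disposition line of every part in the encoded body.
part_headers = [
    line.decode()
    for line in prepared.body.split(b"\r\n")
    if line.startswith(b"Content-Disposition")
]
for header in part_headers:
    print(header)
```

Note that requests reads each file object fully into memory while building the multipart body, so this form gives control over names and content types rather than true streaming. The requests documentation points to the separate requests-toolbelt package for streamed multipart uploads.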

cmin764 commented 2 years ago

Started investigating it more deeply using this example.


Conclusion

This robot works flawlessly using only our RPA.HTTP library, so maybe the scrambled data above was a false positive. @orlof, can you replicate the problem with this robot? (How do I replicate the Content-Disposition diff?)

I ran various tests against the library and compared them with how pure requests behaves. The two seem identical, even with the POST On Session keyword, since it sends the same data and kwargs no matter which one you use.

This is the diff in the state I checked across both sessions:

Py requests
-----------
>>> url
'https://sandbox.zamzar.com/v1/jobs'
>>> data
{'target_format': 'png'}
>>> json
>>> kwargs
{'files': {'source_file': <_io.BufferedReader name='devdata/portrait.gif'>}}
>>> self.headers
{'User-Agent': 'python-requests/2.27.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
>>> self.verify
True
>>> self.proxies
{}
>>> self.params
{}
>>> self.stream
False
>>> self.adapters
OrderedDict([('https://', <requests.adapters.HTTPAdapter object at 0x1077310a0>), ('http://', <requests.adapters.HTTPAdapter object at 0x107711f40>)])

RPA HTTP
--------
<same as above, but:>

>>> kwargs
{'timeout': None, 'cookies': {}, 'files': {'source_file': <_io.BufferedReader name='devdata/portrait.gif'>}}
>>> self.verify
False
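
One way to sanity-check that the differences listed above do not change the payload is sketched below, using plain requests only. It prepares the same multipart request on two sessions, one with verify=False and an empty cookies dict, and compares the bodies after normalizing the random boundary. The URL and file content are placeholders (assumptions, not from the investigation), and nothing is sent.

```python
import requests

URL = "https://example.invalid/v1/jobs"  # placeholder; the request is never sent


def prepared_body(session, **request_kwargs):
    """Prepare a multipart POST on the session and return its body with the
    random multipart boundary replaced by a fixed token."""
    request = requests.Request(
        "POST",
        URL,
        data={"target_format": "png"},
        files={"source_file": ("portrait.gif", b"GIF89a")},
        **request_kwargs,
    )
    prepared = session.prepare_request(request)
    boundary = prepared.headers["Content-Type"].split("boundary=")[1]
    return prepared.body.replace(boundary.encode(), b"BOUNDARY")


plain = requests.Session()

rpa_like = requests.Session()
rpa_like.verify = False  # only affects TLS certificate checks when sending

# timeout is an argument of send(), not of the prepared request, so it cannot
# alter the body either; cookies={} adds no Cookie header.
same_payload = prepared_body(plain) == prepared_body(rpa_like, cookies={})
print(same_payload)
```

This only compares what would be put on the wire; it says nothing about how the server reacts, and verify=False still disables certificate verification when the request is actually sent.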
cmin764 commented 2 years ago

Closing this, please re-open if there's still an issue with the library.