box / box-python-sdk

Box SDK for Python
http://opensource.box.com/box-python-sdk/
Apache License 2.0

update_contents_with_stream results in a CORS error in an environment that requires CORS Domains settings. #856

Closed tanaga9 closed 11 months ago

tanaga9 commented 11 months ago

Description of the Issue

Part of traceback

File /lib/python3.11/site-packages/boxsdk/util/api_call_decorator.py:63, in APICallWrapper.__get__.<locals>.call(instance, *args, **kwargs)
     60     instance = instance.clone(instance.session.with_default_network_request_kwargs(extra_network_parameters))
     62 method = self._func_that_makes_an_api_call.__get__(instance, owner)
---> 63 return method(*args, **kwargs)

File /lib/python3.11/site-packages/boxsdk/object/file.py:261, in File.update_contents_with_stream(self, file_stream, etag, preflight_check, preflight_expected_size, upload_using_accelerator, file_name, content_modified_at, additional_attributes, sha1)
    259 if not headers:
    260     headers = None
--> 261 file_response = self._session.post(
    262     url,
    263     expect_json_response=False,
    264     data=data,
    265     files=files,
    266     headers=headers,
    267 ).json()
    268 if 'entries' in file_response:
    269     file_response = file_response['entries'][0]

File /lib/python3.11/site-packages/boxsdk/session/session.py:100, in Session.post(self, url, **kwargs)
     94 def post(self, url: str, **kwargs: Any) -> '_BoxResponse':
     95     """Make a POST request to the Box API.
     96 
     97     :param url:
     98         The URL for the request.
     99     """
--> 100     return self.request('POST', url, **kwargs)

File /lib/python3.11/site-packages/boxsdk/session/session.py:138, in Session.request(self, method, url, **kwargs)
    130 def request(self, method: str, url: str, **kwargs: Any) -> '_BoxResponse':
    131     """Make a request to the Box API.
    132 
    133     :param method:
   (...)
    136         The URL for the request.
    137     """
--> 138     response = self._prepare_and_send_request(method, url, **kwargs)
    139     return self.box_response_constructor(response)

File /lib/python3.11/site-packages/boxsdk/session/session.py:348, in Session._prepare_and_send_request(self, method, url, headers, auto_session_renewal, expect_json_response, **kwargs)
    346 raised_exception = None
    347 try:
--> 348     network_response = self._send_request(request, **kwargs)
    349     reauthentication_needed = network_response.status_code == 401
    350 except RequestException as request_exc:

File /lib/python3.11/site-packages/boxsdk/session/session.py:585, in AuthorizedSession._send_request(self, request, **kwargs)
    583 request.headers.update(authorization_header)
    584 kwargs['access_token'] = access_token
--> 585 return super()._send_request(request, **kwargs)

File /lib/python3.11/site-packages/boxsdk/session/session.py:488, in Session._send_request(self, request, **kwargs)
    485 request.access_token = request_kwargs.pop('access_token', None)
    487 # send the request
--> 488 network_response = self._network_layer.request(
    489     request.method,
    490     request.url,
    491     access_token=request.access_token,
    492     headers=request.headers,
    493     log_response_content=request.expect_json_response,
    494     **request_kwargs
    495 )
    497 return network_response

File /lib/python3.11/site-packages/boxsdk/network/default_network.py:44, in DefaultNetwork.request(self, method, url, access_token, **kwargs)
     41 # pylint:disable=abstract-class-instantiated
     42 try:
     43     return self.network_response_constructor(
---> 44         request_response=self._session.request(method, url, **kwargs),
     45         access_token_used=access_token,
     46         log_response_content=log_response_content
     47     )
     48 except Exception:
     49     self._log_exception(method, url, sys.exc_info())

Steps to Reproduce

How to use boxsdk in Pyodide (JupyterLite)

Expected Behavior

Error Message, Including Stack Trace

Access to XMLHttpRequest at 'xxx' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.

Screenshots

Versions Used

Python SDK: 3.9.2 Python: 3.11.2 (Pyodide)

mwwoda commented 11 months ago

Hi @tanaga9,

This problem certainly looks strange, because you can access some endpoints while CORS blocks file transfers. I would need a few more things from you to investigate it further.

I had some problems reproducing this issue with the steps provided. Could you provide a small reproducible sample with the BoxSDK code, preferably as a Jupyter notebook/cell, that can be run from a browser? Could you also post the entire stack trace of the error you're getting? (I think the error itself is missing from the stack trace you posted.)

Have you tried Python distributions other than Pyodide, or tried to reproduce this problem in an environment other than Jupyter? Does the same problem occur? Any additional information would help.

Also, please make sure that CORS is configured correctly on both Box and your app.

tanaga9 commented 11 months ago

@mwwoda Thank you for your reply.

To begin with, CORS errors only occur in very limited execution environments. For boxsdk, it is also important to decide whether or not to support those environments.

Conditions for CORS errors in boxsdk

The steps to reproduce the issue are as follows. You can get the OAuth 2.0 credentials and a Developer Token from the developer console. (The Developer Token must be generated after the CORS domain settings have been correctly configured.)

Run the following in an environment where Pyodide is available, such as JupyterLite:

oauth2_client_id = ""
oauth2_client_secret = ""
developer_token = ""
file_id = "" # An existing file_ID, regardless of its contents.

# ----------------------------------------

import micropip
await micropip.install(["pyodide-http", "boxsdk", "requests"])

import pyodide_http
pyodide_http.patch_all()

import boxsdk
import io

auth = boxsdk.OAuth2(
    client_id=oauth2_client_id,
    client_secret=oauth2_client_secret,
    access_token=developer_token,
)
client = boxsdk.Client(auth)

user = client.user().get() # success (*1)
print(user)

file = client.file(file_id).get() # success
print(file)

client.file(file_id).update_contents_with_stream(io.BytesIO(b"dummy data")) # NetworkError (*2)

*1: If the CORS domain settings are not correct, the following error occurs. (This is expected and is not an issue.)

BoxAPIException: Message: Access denied - Did you forget to whitelist your origin in the CORS config of your app?
Status: 403
Code: cors_origin_not_whitelisted
Request ID: xxxxx
Headers: {'cache-control': 'no-cache, no-store', 'content-type': 'application/json'}
URL: https://api.box.com/2.0/users/me
Method: GET
Context Info: {'origin': 'https://jupyterlite.readthedocs.io'}

*2: However, the CORS error for this case is as follows.

Request "POST https://upload.box.com/api/2.0/files/xxxxx/content" failed with JsException exception: NetworkError: Failed to execute 'send' on 'XMLHttpRequest': Failed to load 'https://upload.box.com/api/2.0/files/xxxxx/content'.
---------------------------------------------------------------------------
JsException                               Traceback (most recent call last)
Cell In[5], line 28
     25 file = client.file(file_id).get() # success
     26 print(file)
---> 28 client.file(file_id).update_contents_with_stream(io.BytesIO(b"dummy data"))

File /lib/python3.11/site-packages/boxsdk/util/api_call_decorator.py:63, in APICallWrapper.__get__.<locals>.call(instance, *args, **kwargs)
     60     instance = instance.clone(instance.session.with_default_network_request_kwargs(extra_network_parameters))
     62 method = self._func_that_makes_an_api_call.__get__(instance, owner)
---> 63 return method(*args, **kwargs)

File /lib/python3.11/site-packages/boxsdk/object/file.py:261, in File.update_contents_with_stream(self, file_stream, etag, preflight_check, preflight_expected_size, upload_using_accelerator, file_name, content_modified_at, additional_attributes, sha1)
    259 if not headers:
    260     headers = None
--> 261 file_response = self._session.post(
    262     url,
    263     expect_json_response=False,
    264     data=data,
    265     files=files,
    266     headers=headers,
    267 ).json()
    268 if 'entries' in file_response:
    269     file_response = file_response['entries'][0]

File /lib/python3.11/site-packages/boxsdk/session/session.py:100, in Session.post(self, url, **kwargs)
     94 def post(self, url: str, **kwargs: Any) -> '_BoxResponse':
     95     """Make a POST request to the Box API.
     96 
     97     :param url:
     98         The URL for the request.
     99     """
--> 100     return self.request('POST', url, **kwargs)

File /lib/python3.11/site-packages/boxsdk/session/session.py:138, in Session.request(self, method, url, **kwargs)
    130 def request(self, method: str, url: str, **kwargs: Any) -> '_BoxResponse':
    131     """Make a request to the Box API.
    132 
    133     :param method:
   (...)
    136         The URL for the request.
    137     """
--> 138     response = self._prepare_and_send_request(method, url, **kwargs)
    139     return self.box_response_constructor(response)

File /lib/python3.11/site-packages/boxsdk/session/session.py:348, in Session._prepare_and_send_request(self, method, url, headers, auto_session_renewal, expect_json_response, **kwargs)
    346 raised_exception = None
    347 try:
--> 348     network_response = self._send_request(request, **kwargs)
    349     reauthentication_needed = network_response.status_code == 401
    350 except RequestException as request_exc:

File /lib/python3.11/site-packages/boxsdk/session/session.py:585, in AuthorizedSession._send_request(self, request, **kwargs)
    583 request.headers.update(authorization_header)
    584 kwargs['access_token'] = access_token
--> 585 return super()._send_request(request, **kwargs)

File /lib/python3.11/site-packages/boxsdk/session/session.py:488, in Session._send_request(self, request, **kwargs)
    485 request.access_token = request_kwargs.pop('access_token', None)
    487 # send the request
--> 488 network_response = self._network_layer.request(
    489     request.method,
    490     request.url,
    491     access_token=request.access_token,
    492     headers=request.headers,
    493     log_response_content=request.expect_json_response,
    494     **request_kwargs
    495 )
    497 return network_response

File /lib/python3.11/site-packages/boxsdk/network/default_network.py:44, in DefaultNetwork.request(self, method, url, access_token, **kwargs)
     41 # pylint:disable=abstract-class-instantiated
     42 try:
     43     return self.network_response_constructor(
---> 44         request_response=self._session.request(method, url, **kwargs),
     45         access_token_used=access_token,
     46         log_response_content=log_response_content
     47     )
     48 except Exception:
     49     self._log_exception(method, url, sys.exc_info())

File /lib/python3.11/site-packages/requests/sessions.py:589, in Session.request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    584 send_kwargs = {
    585     "timeout": timeout,
    586     "allow_redirects": allow_redirects,
    587 }
    588 send_kwargs.update(settings)
--> 589 resp = self.send(prep, **send_kwargs)
    591 return resp

File /lib/python3.11/site-packages/requests/sessions.py:703, in Session.send(self, request, **kwargs)
    700 start = preferred_clock()
    702 # Send the request
--> 703 r = adapter.send(request, **kwargs)
    705 # Total elapsed time of the request (approximately)
    706 elapsed = preferred_clock() - start

File /lib/python3.11/site-packages/pyodide_http/_requests.py:42, in PyodideHTTPAdapter.send(self, request, **kwargs)
     40     pyodide_request.set_body(request.body)
     41 try:
---> 42     resp = send(pyodide_request, stream)
     43 except _StreamingTimeout:
     44     from requests import ConnectTimeout

File /lib/python3.11/site-packages/pyodide_http/_core.py:113, in send(request, stream)
    110 for name, value in request.headers.items():
    111     xhr.setRequestHeader(name, value)
--> 113 xhr.send(to_js(request.body))
    115 headers = dict(Parser().parsestr(xhr.getAllResponseHeaders()))
    117 if _IN_WORKER:

JsException: NetworkError: Failed to execute 'send' on 'XMLHttpRequest': Failed to load 'https://upload.box.com/api/2.0/files/xxxxx/content'.

The upload succeeds if you call requests directly:

import requests
requests.post(
    f'https://upload.box.com/api/2.0/files/{file_id}/content',
    headers={'Authorization': f'Bearer {developer_token}'},
    files={'uploadFile': (file.name, io.BytesIO(b"dummy data"))}
) # success <Response [201]>

There is something I want to try: change the URL in api_config, send the request to my own API server, and check how the headers and body differ from those of the other requests. However, I couldn't figure out how to change the settings.
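One way to carry out that comparison is a small local server that accepts any POST, answers the CORS preflight, and reports what it received. The sketch below uses only the standard library; the handler and `make_server` names are hypothetical, and it is an inspection aid rather than anything provided by boxsdk.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Report the headers and body of each POST so two requests can be compared."""

    def _send_cors_headers(self):
        # Echo the caller's origin and requested headers so a browser accepts the reply.
        self.send_header("Access-Control-Allow-Origin", self.headers.get("Origin", "*"))
        self.send_header("Access-Control-Allow-Methods", "GET, POST, PUT, OPTIONS")
        self.send_header(
            "Access-Control-Allow-Headers",
            self.headers.get("Access-Control-Request-Headers", "*"),
        )

    def do_OPTIONS(self):
        # Preflight request sent by the browser before a non-simple cross-origin call.
        self.send_response(204)
        self._send_cors_headers()
        self.end_headers()

    def do_POST(self):
        length = int(self.headers.get("Content-Length") or 0)
        body = self.rfile.read(length)
        report = {
            "path": self.path,
            "headers": dict(self.headers.items()),
            "body_length": len(body),
            "body_preview": body[:200].decode("latin-1"),
        }
        payload = json.dumps(report, indent=2).encode("utf-8")
        self.send_response(200)
        self._send_cors_headers()
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, fmt, *args):
        pass  # silence the default per-request logging


def make_server(port=0):
    """Create the server on localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), EchoHandler)


# To run it manually: make_server(8000).serve_forever()
```

Pointing the SDK at such a server depends on the installed version. In boxsdk 3.x the upload base URL appears to come from `boxsdk.config.API.UPLOAD_URL`, but treat that as an assumption to verify against the source before relying on it.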

congminh1254 commented 11 months ago

Hi @tanaga9,

Did you add your domain to the CORS Domains setting in the dev portal, as follows?

image

Remember to include the protocol, domain, and port, and to leave off the trailing slash at the end of the URL.
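To make that format concrete, here is a minimal sketch that derives the origin string from a page URL. The helper name `cors_origin` is hypothetical, and dropping default ports is an assumption based on how browsers build the `Origin` header, not something confirmed for the Box console.

```python
from urllib.parse import urlsplit

# Ports that browsers leave out of the Origin header for each scheme.
DEFAULT_PORTS = {"http": 80, "https": 443}


def cors_origin(url: str) -> str:
    """Return scheme://host[:port] for an absolute URL, without path or trailing slash."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.hostname:
        raise ValueError(f"not an absolute URL: {url!r}")
    host = parts.hostname
    if ":" in host:
        host = f"[{host}]"  # IPv6 literals need brackets
    origin = f"{parts.scheme}://{host}"
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(parts.scheme):
        origin += f":{parts.port}"
    return origin


print(cors_origin("http://localhost:8888/lab/"))  # http://localhost:8888
```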

Best, Minh

tanaga9 commented 11 months ago

Hi @congminh1254

Did you add your domain to the CORS Domains setting in the dev portal, as follows?

Sure. All other API calls have been successful.

sc_20231030_223904

By the way, I am using the jupyterlab_box_drive demonstration page.

Demonstration

  • Box Dev Console
  • Create New App
  • Create a Custom App (OAuth 2.0) OAuth 2.0 (User or Client Authentication)
  • Configuration
    • get OAuth 2.0 Credentials
    • set OAuth 2.0 Redirect URI
      • example: https://tanaga9.github.io/jupyterlab-box-drive/extensions/jupyterlab-box-drive/static/assets/auth.html
    • set Application Scopes
      • Read all files and folders stored in Box
      • Write all files and folders stored in Box
    • set CORS Domains
      • example: https://tanaga9.github.io

congminh1254 commented 11 months ago

Hi @tanaga9

Screenshot 2023-10-30 at 15 37 30

I tried the same command with Jupyter Notebook (not JupyterLite) and it works well on my side.

I will let you know if I have any updates.

Best, Minh

congminh1254 commented 11 months ago

Hi @congminh1254

Did you add your domain to the CORS Domains setting in the dev portal, as follows?

Sure. All other API calls have been successful.

sc_20231030_223904

Please send us the full exception stack trace; it will make the issue much easier to identify.

tanaga9 commented 11 months ago

Please send us the full exception stack trace; it will make the issue much easier to identify.

The full exception stack trace is attached: full exception stack trace.txt

I tried the same command with Jupyter Notebook (not JupyterLite) and it works well on my side.

To begin with, CORS errors only occur in very limited execution environments. See "Conditions for CORS errors in boxsdk".

In this context, "Web browser-based Python" does not simply mean that the UI is a web page.

If you use JupyterLab, the kernel is actually on the server side, so CORS errors will never occur.

For boxsdk, it is also important to decide whether or not to support those environments.

tanaga9 commented 11 months ago

There are two main types of Python implementations that run on a web browser:

Those that have a kernel running on the server side

In this type of implementation, Python code is not executed in the web browser. It runs in a kernel on the server side, and only the results are rendered in the browser. Because HTTP requests originate from the server rather than from the browser, the browser's CORS restrictions do not apply.

Some examples of this type of implementation include:

Those that run Python code directly in the web browser

In this type of implementation, Python code is executed directly in the web browser. Because HTTP requests are issued by the browser itself, they are subject to its same-origin policy, so CORS errors are possible.
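The two cases can be told apart at runtime. As a minimal sketch, the check below relies on Pyodide reporting `sys.platform` as `"emscripten"`; the function name is hypothetical and other in-browser runtimes may need a different test.

```python
import sys


def running_in_browser() -> bool:
    """Return True when the interpreter looks like a WebAssembly build such as Pyodide."""
    # Pyodide is compiled with Emscripten and also exposes a "pyodide" module.
    return sys.platform == "emscripten" or "pyodide" in sys.modules


if running_in_browser():
    print("Requests go through the browser, so CORS rules apply.")
else:
    print("Requests go out from a regular Python process; CORS does not apply.")
```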

Some examples of this type of implementation include:

congminh1254 commented 11 months ago

Thanks for your log @tanaga9, I will check it.

In the meantime, you can also take a look at box/box-python-sdk-gen (still in beta) to check whether it works there.

tanaga9 commented 11 months ago

Thanks @congminh1254

Similarly, box_sdk_gen succeeded with other APIs, but file writes resulted in a CORS Error.

sc_20231031_011554

traceback

---------------------------------------------------------------------------
JsException                               Traceback (most recent call last)
Cell In[2], line 5
      1 attrs = UploadFileAttributesArg(
      2     name=file.name,
      3     parent=UploadFileAttributesArgParentField(id='0')
      4 )
----> 5 files = client.uploads.upload_file(attributes=attrs, file=io.BytesIO(b"dummy data"))
      6 file = files.entries[0]
      7 print(f'File uploaded with id {file.id}, name {file.name}')

File /lib/python3.11/site-packages/box_sdk_gen/managers/uploads.py:318, in UploadsManager.upload_file(self, attributes, file, file_file_name, file_content_type, fields, content_md_5, extra_headers)
    314 query_params_map: Dict[str, str] = prepare_params({'fields': to_string(fields)})
    315 headers_map: Dict[str, str] = prepare_params(
    316     {'content-md5': to_string(content_md_5), **extra_headers}
    317 )
--> 318 response: FetchResponse = fetch(
    319     ''.join(['https://upload.box.com/api/2.0/files/content']),
    320     FetchOptions(
    321         method='POST',
    322         params=query_params_map,
    323         headers=headers_map,
    324         multipart_data=[
    325             MultipartItem(
    326                 part_name='attributes',
    327                 body=serialize(request_body['attributes']),
    328             ),
    329             MultipartItem(
    330                 part_name='file',
    331                 file_stream=request_body['file'],
    332                 file_name=request_body['file_file_name'],
    333                 content_type=request_body['file_content_type'],
    334             ),
    335         ],
    336         content_type='multipart/form-data',
    337         response_format='json',
    338         auth=self.auth,
    339         network_session=self.network_session,
    340     ),
    341 )
    342 return deserialize(response.text, Files)

File /lib/python3.11/site-packages/box_sdk_gen/fetch.py:99, in fetch(url, options)
     96 params = options.params or {}
     98 attempt_nr = 1
---> 99 response: APIResponse = __make_request(
    100     session=requests_session,
    101     method=options.method,
    102     url=url,
    103     headers=headers,
    104     body=options.file_stream or options.body,
    105     content_type=options.content_type,
    106     params=params,
    107     multipart_data=options.multipart_data,
    108     attempt_nr=attempt_nr,
    109 )
    111 while attempt_nr < max_attempts:
    112     if response.ok:

File /lib/python3.11/site-packages/box_sdk_gen/fetch.py:212, in __make_request(session, method, url, headers, body, content_type, params, multipart_data, attempt_nr)
    210 raised_exception = None
    211 try:
--> 212     network_response = session.request(
    213         method=method,
    214         url=url,
    215         headers=headers,
    216         data=body,
    217         params=params,
    218         stream=True,
    219     )
    220     reauthentication_needed = network_response.status_code == 401
    221 except RequestException as request_exc:

File /lib/python3.11/site-packages/requests/sessions.py:589, in Session.request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    584 send_kwargs = {
    585     "timeout": timeout,
    586     "allow_redirects": allow_redirects,
    587 }
    588 send_kwargs.update(settings)
--> 589 resp = self.send(prep, **send_kwargs)
    591 return resp

File /lib/python3.11/site-packages/requests/sessions.py:703, in Session.send(self, request, **kwargs)
    700 start = preferred_clock()
    702 # Send the request
--> 703 r = adapter.send(request, **kwargs)
    705 # Total elapsed time of the request (approximately)
    706 elapsed = preferred_clock() - start

File /lib/python3.11/site-packages/pyodide_http/_requests.py:42, in PyodideHTTPAdapter.send(self, request, **kwargs)
     40     pyodide_request.set_body(request.body)
     41 try:
---> 42     resp = send(pyodide_request, stream)
     43 except _StreamingTimeout:
     44     from requests import ConnectTimeout

File /lib/python3.11/site-packages/pyodide_http/_core.py:113, in send(request, stream)
    110 for name, value in request.headers.items():
    111     xhr.setRequestHeader(name, value)
--> 113 xhr.send(to_js(request.body))
    115 headers = dict(Parser().parsestr(xhr.getAllResponseHeaders()))
    117 if _IN_WORKER:

JsException: NetworkError: Failed to execute 'send' on 'XMLHttpRequest': Failed to load 'https://upload.box.com/api/2.0/files/content'.

It is possible that this is a problem with pyodide_http.

congminh1254 commented 11 months ago

I think it could be this issue: koenvo/pyodide-http#38

Here is the request body I captured from the Box SDK with pyodide_http.patch_all() applied:

<MultipartEncoder: OrderedDict([('attributes', '{"name": null, "content_modified_at": null}'), ('file', ('unused', <_io.BytesIO object at 0x2ba4e80>))])>

tanaga9 commented 11 months ago

I thought about that too, but at least I was able to successfully upload files using the following method.

There may be some complex conditions that need to be met.

I'll also ask on koenvo/pyodide-http#38.

import requests
requests.post(
    f'https://upload.box.com/api/2.0/files/{file_id}/content',
    headers={'Authorization': f'Bearer {developer_token}'},
    files={'uploadFile': (file.name, io.BytesIO(b"dummy data"))}
) # success <Response [201]>

congminh1254 commented 11 months ago

Hi @tanaga9

As you can see from my comment above, the body sent by the SDK is wrapped in a MultipartEncoder, and the pyodide-http library may not support this (yet).

So I think this is not an issue with the SDK but with the pyodide-http library, as it is still in beta.
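To illustrate the suspected difference, the sketch below reads a file-like request body fully into bytes before it would be handed to a transport that only accepts in-memory data. `body_to_bytes` and `StreamingBody` are hypothetical names; `StreamingBody` only stands in for an encoder object that exposes `read()`. This is not pyodide-http's code and not a confirmed fix.

```python
import io


def body_to_bytes(body):
    """Return a request body as bytes, draining file-like objects via read()."""
    if body is None or isinstance(body, bytes):
        return body
    if isinstance(body, (bytearray, memoryview)):
        return bytes(body)
    if isinstance(body, str):
        return body.encode("utf-8")
    if hasattr(body, "read"):
        data = body.read()  # consume the whole stream in one call
        return data.encode("utf-8") if isinstance(data, str) else bytes(data)
    raise TypeError(f"unsupported body type: {type(body).__name__}")


class StreamingBody:
    """Hypothetical stand-in for a streaming multipart encoder."""

    def __init__(self, payload: bytes):
        self._buffer = io.BytesIO(payload)

    def read(self, size=-1):
        return self._buffer.read(size)


raw = body_to_bytes(StreamingBody(b"--boundary\r\ndummy data\r\n--boundary--\r\n"))
print(type(raw).__name__, len(raw))
```

Buffering the whole body trades memory for compatibility, which is acceptable for small uploads like the one in this thread but not for large files.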

I will close this issue for now. If you have any other updates or issues, feel free to reopen it.

Best, Minh