koenvo / pyodide-http

Provides patches for widely used http libraries to make them work in Pyodide environments like JupyterLite
MIT License

pyodide_http patch_requests breaks COI expectations #40

Open WebReflection opened 11 months ago

WebReflection commented 11 months ago

This was originally opened by mistake at https://github.com/pyodide/pyodide/issues/4191

🐛 Bug

While testing/demoing one of our apps in PSDC, we noticed that Chrome/ium managed to load a 3rd party spreadsheet, but both Firefox and Safari failed completely on header and permission checks.

We run the code from a worker that requires SharedArrayBuffer, and while we managed to enable it, all requests were blocked by the browsers.

To Reproduce

import requests
from typing import Union, Optional

from xlrd import Book
from xlrd.sheet import Sheet

# Sync Calls
from pyodide_http import patch_requests

def extract(content: bytes, sheet_name: Optional[str] = None):
    """ do stuff """

def sync_load(data_url: str, sheet_name: Optional[str] = None) -> Optional[Union[Book, Sheet]]:
    """"""
    patch_requests()  # patch requests and 

    r = requests.get(data_url)
    if r.status_code != 200:  # Not OK
        return None
    return extract(r.content, sheet_name=sheet_name)

Safari reports errors about refused headers and a failed preflight:

[Error] Refused to set unsafe header "Accept-Encoding"
[Error] Refused to set unsafe header "Connection"
[Error] Preflight response is not successful. Status code: 403
[Error] Failed to load resource: Preflight response is not successful. Status code: 403 (sample_workbook.xls, line 0)
[Error] XMLHttpRequest cannot load https://raw.githubusercontent.com/XXX/sample_workbook.xls due to access control checks.
[Error] Failed to load resource: Preflight response is not successful. Status code: 403 (sample_workbook.xls, line 0)

which surfaces in Pyodide as "A network error occurred."

Expected behavior

If we change the code to use XHR directly, everything works without issues and no network warning is ever shown:


import js  # Pyodide's bridge to the JavaScript global scope

def sync_load(data_url: str, sheet_name: Optional[str] = None) -> Optional[Union[Book, Sheet]]:
    """"""
    xhr = js.XMLHttpRequest.new()
    xhr.open("GET", data_url, False)  # False = synchronous request
    xhr.responseType = "arraybuffer"  # no custom request headers are set
    xhr.send(None)
    content = bytes(xhr.response.to_py())
    return extract(content, sheet_name=sheet_name)

I suspect the error is somewhere in here: https://github.com/koenvo/pyodide-http/blob/main/pyodide_http/_core.py#L75

There is a lot of header manipulation there, but in some cases browsers really don't like user-land code touching security-related headers. Overriding the MIME type, for example, can be considered insecure, as can setting any header that is not already among the predefined ones.

I therefore suggest allowing something like patch_requests(ignore_headers=True) so that nothing is changed. I am also not sure why a non-worker environment should change anything about MIME type expectations, although I think that value is True in our case.

Environment

koenvo commented 10 months ago

Seems to be related to https://github.com/pyodide/pyodide/issues/4068

Can you do:

import pyodide_http
print(pyodide_http.__version__)

If it prints 0.2.0, then that version is broken in Firefox/Safari due to the User-Agent header. This is fixed in pyodide_http 0.2.1.

zmoon commented 1 month ago

I am getting many Refused to set unsafe header "Accept-Encoding" and Refused to set unsafe header "Connection" with version 0.2.1 in a Panel app in which I use requests.get(), and then afterwards the app fails to fully load.

koenvo commented 1 month ago

Thanks for reporting this. What browser are you using?

In the end it would be good to have a way to retrieve a list of “unsafe header” names, as those may differ per browser, and probably also over time.
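
As a rough illustration of what such a list could look like, here is a minimal sketch that filters headers before they are handed to the browser. The names are a snapshot of the forbidden request-header names in the WHATWG Fetch standard; the function names (`is_forbidden_header`, `strip_forbidden_headers`) and the `extra` parameter are hypothetical and not part of pyodide-http. Browsers may enforce a different set, which is why `extra` lets callers add names such as User-Agent, which is not on the standard list but is reported as problematic earlier in this thread.

```python
# Snapshot of forbidden request-header names from the WHATWG Fetch standard.
# Assumption: individual browsers may differ, so treat this as a starting point.
FORBIDDEN_HEADER_NAMES = frozenset({
    "accept-charset", "accept-encoding", "access-control-request-headers",
    "access-control-request-method", "connection", "content-length",
    "cookie", "cookie2", "date", "dnt", "expect", "host", "keep-alive",
    "origin", "referer", "set-cookie", "te", "trailer", "transfer-encoding",
    "upgrade", "via",
})
# Any header starting with one of these prefixes is forbidden as well.
FORBIDDEN_HEADER_PREFIXES = ("proxy-", "sec-")


def is_forbidden_header(name, extra=()):
    """Return True if the browser is expected to refuse this header name.

    `extra` is a hypothetical hook for additional, browser-specific names.
    """
    lowered = name.strip().lower()
    if lowered in FORBIDDEN_HEADER_NAMES:
        return True
    if lowered in {e.strip().lower() for e in extra}:
        return True
    return lowered.startswith(FORBIDDEN_HEADER_PREFIXES)


def strip_forbidden_headers(headers, extra=()):
    """Return a copy of `headers` without the names the browser would refuse."""
    return {k: v for k, v in headers.items() if not is_forbidden_header(k, extra)}


# Headers resembling what requests sends by default (values are illustrative).
example = {
    "User-Agent": "python-requests/x.y",
    "Accept-Encoding": "gzip, deflate",
    "Accept": "*/*",
    "Connection": "keep-alive",
}
print(strip_forbidden_headers(example, extra=("User-Agent",)))
# {'Accept': '*/*'}
```

Filtering on the Python side would avoid the "Refused to set unsafe header" console messages, since the refused names would never reach XMLHttpRequest.setRequestHeader. Whether that alone resolves the failed preflight reported above is not established in this thread.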

zmoon commented 1 month ago

> Thanks for reporting this. What browser are you using?

Tried Chrome, Edge, Firefox, same behavior.

The HTML generated by Panel v1.4.4 loads Pyodide v0.25.0 (and specifies pyodide-http==0.2.1).