encode / starlette

The little ASGI framework that shines. 🌟
https://www.starlette.io/
BSD 3-Clause "New" or "Revised" License
10.31k stars 948 forks source link

Mismatch Between ASGI Specification and httpx's `raw_path` Definition in TestClient #2692

Closed raptium closed 1 week ago

raptium commented 2 months ago

According to the ASGI specification 1, the raw_path field is defined as:

raw_path (byte string) – The original HTTP path component, excluding any query string, unmodified from the bytes that were received by the web server.

In Starlette's TestClient, the raw_path is directly taken from the httpx request without modification:

https://github.com/encode/starlette/blob/2d0dde8defe9e3d4df0baacdb20faf273152f1cb/starlette/testclient.py#L307 https://github.com/encode/starlette/blob/2d0dde8defe9e3d4df0baacdb20faf273152f1cb/starlette/testclient.py#L254

However, httpx's definition of raw_path includes the query string, which differs from the ASGI specification. This discrepancy was previously noted in a related issue on httpx 2.

Proposed Solution:

To align with the ASGI specification, the raw_path in Starlette's TestClient should be adjusted to exclude the query string before being passed to the ASGI application.

References:

Kludex commented 2 months ago

Do you have an MRE?

raptium commented 2 months ago

I've added a test case in test_testclient.py to reproduce this issue. Currently it fails because query string is included in raw_path.

def test_raw_path_with_querystring(test_client_factory: TestClientFactory) -> None:
    async def app(scope: Scope, receive: Receive, send: Send) -> None:
        raw_path = scope.get("raw_path")
        assert raw_path is not None
        response = PlainTextResponse(f"raw_path: {raw_path}")
        await response(scope, receive, send)

    client = test_client_factory(app)
    response = client.get("/hello world", params={"foo": "bar"})
    assert response.text == "raw_path: b'/hello%20world'"

This issue specifically affects TestClient and does not impact the Starlette server itself.

I'm planning to submit a PR with a one-line fix along with the above test case.

Additionally, it's worth noting that the raw_path for the WebSocket protocol in the ASGI specification 1 appears to differ slightly from that of the HTTP protocol. The spec does not explicitly mention whether the query string should be included in raw_path.

Given this ambiguity, I've left raw_path unchanged for the WebSocket case.

Kludex commented 2 months ago

Can you create an issue in the asgiref repository about this divergence?

raptium commented 2 months ago

https://github.com/django/asgiref/issues/468 is created about the divergence between HTTP and WebSocket. I guess we can wait until this gets clarified before making changes to the TestClient.

Kludex commented 2 months ago

I saw it. Thanks.

Kludex commented 2 months ago

Do you know if uvicorn needs to be fixed as well?

Also, PR is welcome.

raptium commented 1 month ago

PR submitted. Uvicorn is good and no fix is needed.