encode / httpx

A next generation HTTP client for Python. 🦋
https://www.python-httpx.org/
BSD 3-Clause "New" or "Revised" License

Custom JSON library #717

Open victoraugustolls opened 4 years ago

victoraugustolls commented 4 years ago

Hi!

Is there a way to use an alternative JSON library to decode the request? Like orjson for example?

Thanks!

tomchristie commented 4 years ago

You'd need to do that explicitly. I think it'd look like this to encode the request...

httpx.post(headers={'Content-Type': 'application/json'}, data=orjson.dumps(...))

...and like this, to decode the response:

orjson.loads(response.text)

florimondmanca commented 4 years ago

Alternatively, for a more automated solution, you could probably get away with a sys.modules hack? 😅

Here's an example — it uses a wrapper module to add verification-only print statements, but you can skip it and just use sys.modules["json"] = orjson.

# spy_orjson.py
import orjson

def loads(text):
    print("It works! Loading...")
    return orjson.loads(text)

def dumps(text):
    print("It works! Dumping...")
    return orjson.dumps(text)

# main.py
import sys
import spy_orjson

sys.modules["json"] = spy_orjson

import httpx

request = httpx.Request("GET", "https://example.org")
content = b'{"message": "Hello, world"}'
response = httpx.Response(
    200, content=content, headers={"Content-Type": "application/json"}, request=request
)

print(response.json())

Output:

$ python main.py
It works! Loading...
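
To see why the swap works and why it is process-wide, here is a minimal standard-library sketch. The `spy` module object and the `calls` list are illustrative stand-ins for spy_orjson.py, and the real `json` module does the actual work so nothing needs installing.

```python
import sys
import types

import json as real_json  # keep a handle on the real module so it can be restored

calls = []

# Hypothetical stand-in for spy_orjson: a module object exposing only loads/dumps.
spy = types.ModuleType("spy_json")


def _loads(text):
    calls.append("loads")
    return real_json.loads(text)


def _dumps(obj):
    calls.append("dumps")
    return real_json.dumps(obj)


spy.loads = _loads
spy.dumps = _dumps

sys.modules["json"] = spy
try:
    import json as patched  # any import after the swap resolves to the wrapper

    decoded = patched.loads('{"message": "Hello, world"}')
    swapped = patched is spy
    # The minimal wrapper lacks everything else the real module provides.
    has_decode_error = hasattr(patched, "JSONDecodeError")
finally:
    sys.modules["json"] = real_json  # always undo the process-wide change
```

Note that every module imported after the swap sees the wrapper, not just httpx, and any code that reaches for `json.JSONDecodeError` or `json.JSONEncoder` would fail against a wrapper this small.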
victoraugustolls commented 4 years ago

That's great! Thanks!!

dmig commented 4 years ago

@florimondmanca such an ugly hack...

Why not implement this feature? Looking at orjson and simdjson libraries, this may be used to improve performance a lot.

I'll try to implement this.

tomchristie commented 4 years ago

I'd be okay with us providing an easy way to patch this in, if it follows something similar to how requests allows for this... https://github.com/psf/requests/issues/1595

dmig-alarstudios commented 4 years ago

I'm looking at the code currently, don't see an easy way...

Probably I'll create a httpx.jsonlib with loads and dumps, which may be overridden later. Not the cleanest solution, but will allow to use e.g.:

Kludex commented 1 year ago

There were two PRs closed because they were stale, so I'm going to just reopen this one for us to have a conclusion.

What about adding a new parameter to the client? Something like json_lib? Was this discarded already?

import httpx
import orjson

httpx.Client(json_lib=orjson)

zanieb commented 1 year ago

Maybe it'd be best to be able to specify dumps/loads separately both for user control and to avoid doing getattr to get the dumps/loads methods? Perhaps:

httpx.Client(json_encoder=orjson.dumps, json_decoder=orjson.loads)

islam-aymann commented 1 year ago

Maybe it'd be best to be able to specify dumps/loads separately both for user control and to avoid doing getattr to get the dumps/loads methods? Perhaps:

httpx.Client(json_encoder=orjson.dumps, json_decoder=orjson.loads)

It would be great to use the same names of Pydantic

httpx.Client(json_loads=orjson_loads, json_dumps=orjson_dumps)

rikroe commented 1 year ago

If somebody else comes across this: to be compatible with mypy and to have correct typing, one has to use content instead of data as suggested originally:

httpx.post(headers={'Content-Type': 'application/json'}, content=orjson.dumps(...))
xbeastx commented 1 year ago

3 years later... There was even a pull request implementing this (https://github.com/encode/httpx/pull/1352), so why was it closed?

zanieb commented 1 year ago

@xbeastx I think it's quite clearly articulated at https://github.com/encode/httpx/pull/1352#issuecomment-845817581 why that pull request went stale.

There is some additional helpful context at https://github.com/encode/httpx/pull/1730#issuecomment-874011751 and discussion at https://github.com/encode/httpx/discussions/1740

tomchristie commented 1 year ago

I've not been sufficiently happy with any of the API proposals so far, and I've essentially been veto'ing them.

Let me nudge something here that could be viable(?)...

client = httpx.Client(request_class=..., response_class=...)

I can explain why I (potentially) like that if needed. Perhaps the design sense will speak for itself.


Edit 8th Sept 2023:

That API would allow for this kind of customization...

import httpx
import orjson

class APIRequest(httpx.Request):
    def __init__(self, *args, **kwargs):
        if 'json' in kwargs:
            # Encode with orjson and hand the bytes over as raw content.
            content = orjson.dumps(kwargs.pop('json'))
            headers = dict(kwargs.get('headers') or {})
            headers['Content-Type'] = 'application/json'
            headers['Content-Length'] = str(len(content))  # header values must be strings
            kwargs['content'] = content
            kwargs['headers'] = headers
        super().__init__(*args, **kwargs)

class APIResponse(httpx.Response):
    def json(self):
        return orjson.loads(self.content)

# Defined last so that the classes it refers to already exist.
class APIClient(httpx.Client):
    request_class = APIRequest
    response_class = APIResponse
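
To make the design idea concrete without relying on attributes httpx does not have today, here is a toy model of the class-attribute hook. `ToyRequest`, `ToyResponse`, `ToyClient` and `CompactRequest` are invented stand-ins, not httpx classes, and the compact `json.dumps` call only plays the role a faster encoder such as orjson would.

```python
import json as stdlib_json


class ToyRequest:
    """Toy stand-in for a request that encodes a JSON payload to bytes."""

    def __init__(self, method, url, json=None):
        self.method, self.url = method, url
        self.content = b"" if json is None else self.dumps(json)

    def dumps(self, obj):
        return stdlib_json.dumps(obj).encode("utf-8")


class ToyResponse:
    """Toy stand-in for a response that decodes its body on demand."""

    def __init__(self, content):
        self.content = content

    def json(self):
        return stdlib_json.loads(self.content)


class ToyClient:
    # The hook: subclasses swap these attributes, nothing global changes.
    request_class = ToyRequest
    response_class = ToyResponse

    def post(self, url, json=None):
        request = self.request_class("POST", url, json=json)
        # Echo the body back instead of doing network I/O.
        return self.response_class(request.content)


class CompactRequest(ToyRequest):
    def dumps(self, obj):
        # Stand-in for a different encoder such as orjson.dumps.
        return stdlib_json.dumps(obj, separators=(",", ":")).encode("utf-8")


class APIClient(ToyClient):
    request_class = CompactRequest


default_body = ToyClient().post("https://example.org", json={"a": 1}).content
custom_body = APIClient().post("https://example.org", json={"a": 1}).content
```

The customization is visible where the client class is defined, and the default client keeps its behaviour.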
T-256 commented 1 year ago

Here perhaps we need custom models: Headers, Request, Response, Cookies (in _models.py)

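# _models.py:
from dataclasses import dataclass

@dataclass
class ClientModels:
    # Each field holds a class (not an instance), hence type[...].
    headers: type[Headers] = Headers
    request: type[Request] = Request
    response: type[Response] = Response
    cookies: type[Cookies] = Cookies

DEFAULT_MODELS = ClientModels()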

Then we pass our custom models to top level client instance:

class OrjsonResponse(httpx.Response):
    def json(self, **kwargs):
        # orjson.loads takes no keyword options, so kwargs are not forwarded.
        return orjson.loads(self.content)

models = httpx.ClientModels(response=OrjsonResponse)

with httpx.Client(models=models) as c:
    resp = c.get("https://example.org")
    fast_json = resp.json()

DeadWisdom commented 1 year ago

I'm hitting this issue as I type, and yikes, this is so complicated. 95% of use-cases would be solved if you could just do something like httpx.set_json_handlers(loads=orjson.loads, dumps=orjson.dumps). I'm not doing this on a per Client basis. If I'm using orjson, I'm using orjson everywhere. Also, if I need to be fancy, I can wrap it up in another function.

tomchristie commented 1 year ago

95% of use-cases would be solved if you could just do something like httpx.set_json_handlers(loads=orjson.loads, dumps=orjson.dumps)

I do see that. The issue with that approach is that you introduce subtly different JSON handling at a distance. Installing a new dependency in your project could end up altering the behaviour of an API client without that being visible anywhere obvious in the project codebase.

I'm not doing this on a per Client basis.

Do you have more than one client instance across the codebase?
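
A small standard-library sketch of the "at a distance" concern. `set_json_handlers` here is the hypothetical function from the comment above, not an httpx API, and `GlobalClient`/`ScopedClient` are toy classes invented for the comparison.

```python
import json

# Module-level (global) registry, as a set_json_handlers-style API would need.
_handlers = {"dumps": json.dumps}


def set_json_handlers(dumps):
    _handlers["dumps"] = dumps


class GlobalClient:
    """Looks the encoder up in the global registry on every call."""

    def encode(self, obj):
        return _handlers["dumps"](obj)


class ScopedClient:
    """Carries its own encoder, fixed where the client is created."""

    def __init__(self, dumps=json.dumps):
        self._dumps = dumps

    def encode(self, obj):
        return self._dumps(obj)


def sorted_dumps(obj):
    return json.dumps(obj, sort_keys=True)


payload = {"b": 1, "a": 2}

mine = GlobalClient()
before = mine.encode(payload)
# Imagine this call happening inside some dependency's import-time code.
set_json_handlers(sorted_dumps)
after = mine.encode(payload)  # same client, same call, different bytes

scoped_default = ScopedClient().encode(payload)
scoped_custom = ScopedClient(dumps=sorted_dumps).encode(payload)
```

The already-created `mine` client changes its output without any change at its call site, while the scoped clients only differ where the difference is spelled out.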

dmig commented 1 year ago

Do you have more than one client instance across the codebase?

This is a very normal situation in a microservice environment. This is one reason this issue exists.

tomchristie commented 1 year ago

This comment suggests an API that I wouldn't object to.

Once you've added that code you'd be able to use APIClient instead of httpx.Client everywhere throughout the project.

It's not exactly what some of y'all are requesting, but the critical sticking point here is this: I can't see myself doing anything other than veto'ing proposals that use a form of global state.

zanieb commented 1 year ago

I strongly agree that global state is not a good path forward for the library. I like the request_class and response_class approach — that would also help with some other issues like custom wrappers for response errors.

For those who want to configure the JSON library globally in your projects, it'd be trivial to subclass the httpx.Client as described or wrap client retrieval in a helper method.

DeadWisdom commented 1 year ago

Do you have more than one client instance across the codebase?

Well yes, I'm doing with httpx.AsyncClient() as client all the time.

It'd be trivial to subclass the httpx.Client as described or wrap client retrieval in a helper method.

That's probably what I'll do, just wrap the client. It's not immediately obvious that this is what you should do, though. Maybe make it a recipe in the docs? At least until there is a settled solution.

Overall, I'll say this is a classic case of pragmatism vs purity and I'm not sure a convenience function is where you want to spend cycles achieving purity. But that's not for me to say, and I appreciate your hard work and trust you'll make the best decision. Thank you.

illeatmyhat commented 11 months ago

+1. We have datetime.date objects in our JSON, and while there's nothing wrong with writing a JSON Encoder, it seems not very ideal to have to do client.get(..., data=json.dumps(..., cls=MyEncoder)) every single time.
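
One way to avoid repeating the encoder at every call site is to wrap it once in a helper that returns bytes. A minimal sketch with the standard library only; `encode_default` and `dumps_bytes` are hypothetical names, and a `default=` function is used in place of a JSONEncoder subclass.

```python
import datetime
import json


def encode_default(obj):
    # Called by json.dumps only for objects it cannot serialize itself.
    # datetime.datetime is a subclass of datetime.date, so both are covered.
    if isinstance(obj, datetime.date):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def dumps_bytes(obj):
    """Encode to UTF-8 bytes, ready to be sent as a raw request body."""
    return json.dumps(obj, default=encode_default).encode("utf-8")


body = dumps_bytes({"due": datetime.date(2024, 1, 31)})
```

The resulting bytes can then be passed as `content=body` together with a `Content-Type: application/json` header.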

chbndrhnns commented 7 months ago

My use case is sending pydantic models from my test to a webapp using the httpx client. I am currently doing a roundtrip conversion for each params or json argument to get rid of custom types.

dmig commented 7 months ago

@chbndrhnns well, your case seems to be simple: https://docs.pydantic.dev/latest/concepts/serialization/#modelmodel_dump_json -- just overload this (or model_dump depending on your needs)

chbndrhnns commented 7 months ago

your case seems to be simple:

Ok, let's take this as a simplified example for my use case:

import httpx
from pydantic import BaseModel

def test():
    class Address(BaseModel):
        zip: str
        street: str
        city: str

    payload = {
        "name": "me",
        "address": Address(zip="0000", street="this street", city="my city")
    }
    _ = httpx.post("http://127.0.0.1:8000/", json=payload)

It fails unless I call model_dump_json() on each value which is not a stdlib type

E       TypeError: Object of type Address is not JSON serializable

illeatmyhat commented 7 months ago

It sounds like you don't own this web app, but normally you should define the payload in pydantic as well and call model_dump() on the root. Easiest way to solve the problem IMO. Then if you want some mental gymnastics, you can override the httpx function to call model_dump() on pydantic models, but that may be a step too far for some maintainers.

If that sounds tedious to you, consider that strong typing is generally a compromise of being tedious in exchange for being correct. Pydantic 2 also introduced model_serializer and field_serializer, so you don't have to override JSONEncoder.
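
For a payload that mixes plain dicts with models, a `default=` hook that calls model_dump() is one option. This is a sketch only: the `Address` class below is a plain-Python stand-in with the same model_dump() method so that the snippet needs no third-party import, and `model_aware_default` is a hypothetical helper name.

```python
import json


class Address:
    """Stand-in for the pydantic model; only model_dump() matters here."""

    def __init__(self, zip, street, city):
        self.zip, self.street, self.city = zip, street, city

    def model_dump(self):
        return {"zip": self.zip, "street": self.street, "city": self.city}


def model_aware_default(obj):
    # Duck-typed check: pydantic v2 models expose model_dump().
    if hasattr(obj, "model_dump"):
        return obj.model_dump()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


payload = {
    "name": "me",
    "address": Address(zip="0000", street="this street", city="my city"),
}

# Encode once, then send the bytes as a raw body.
body = json.dumps(payload, default=model_aware_default).encode("utf-8")
```

The bytes would then go out via `httpx.post(url, content=body, headers={"Content-Type": "application/json"})` rather than through `json=`.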

DeoLeung commented 7 months ago

I've not been sufficiently happy with any of the API proposals so far, and I've essentially been veto'ing them.

Let me nudge something here that could be viable(?)...

client = httpx.Client(request_class=..., response_class=...)

I can explain why I (potentially) like that if needed. Perhaps the design sense will speak for itself.

Edit 8th Sept 2023:

That API would allow for this kind of customization...

import httpx
import orjson

class APIRequest(httpx.Request):
    def __init__(self, *args, **kwargs):
        if 'json' in kwargs:
            # Encode with orjson and hand the bytes over as raw content.
            content = orjson.dumps(kwargs.pop('json'))
            headers = dict(kwargs.get('headers') or {})
            headers['Content-Type'] = 'application/json'
            headers['Content-Length'] = str(len(content))  # header values must be strings
            kwargs['content'] = content
            kwargs['headers'] = headers
        super().__init__(*args, **kwargs)

class APIResponse(httpx.Response):
    def json(self):
        return orjson.loads(self.content)

# Defined last so that the classes it refers to already exist.
class APIClient(httpx.Client):
    request_class = APIRequest
    response_class = APIResponse

having the ability to customize the response class will be great, any schedule on its implementation? :)

seandstewart commented 5 months ago

One consideration that I haven't seen proposed - it's entirely reasonable for this library to check if the value of json is already encoded to bytes. In that case, you can skip calling json.dumps here: https://github.com/encode/httpx/blob/7354ed70ceb1a0f072af82e2cb784ef6b2512ed3/httpx/_content.py#L176-L181

so it could look something like:

def encode_json(json: Any) -> tuple[dict[str, str], ByteStream]:
    body = json if isinstance(json, bytes) else json_dumps(json).encode("utf-8")
    content_length = str(len(body))
    content_type = "application/json"
    headers = {"Content-Length": content_length, "Content-Type": content_type}
    return headers, ByteStream(body)

This would allow developers the ability to handle json encoding and decoding external to the library, so we could do something like this:

import httpx
import orjson

mydata = {...}
with httpx.Client() as client:
    encoded = orjson.dumps(mydata)
    response = client.post("https://fake.url/data/", json=encoded)
    result = orjson.loads(response.content)

I know for a fact aiohttp does something similar, so there is precedent here. (It also allows you to pass in a json encoder and decoder, but we've been down that road here.)

If this seems like a reasonable change, I'm happy to make the requisite PR.

gtors commented 5 months ago

Any news? Why not just borrow a solution from aiohttp?

aiohttp.ClientSession(json_serialize=..., ...)

gtors commented 5 months ago

🌟 Introducing HTTPJ! 🚀 It's like HTTPX, but with built-in support for flexible JSON serialization/deserialization!

pip install httpj orjson

import datetime
import pprint

import httpj
import orjson

resp = httpj.post(
    "https://postman-echo.com/post",
    json={"dt": datetime.datetime.utcnow()},
    json_serialize=lambda j: orjson.dumps(j, option=orjson.OPT_NAIVE_UTC),  # optional
    json_deserialize=orjson.loads,  # optional
)
pprint.pprint(resp.json(), indent=4)

p.s.: I'm tired of waiting for this feature for more than 4 years...

seandstewart commented 5 months ago

I added the following snippet to my client module to allow for passing in bytes as json. Not a huge fan of monkey-patching, but it gets the job done.

import httpx

def _patch_httpx():  # type: ignore
    """Monkey-patch httpx so that we can use our own json ser/des.

    https://github.com/encode/httpx/issues/717
    """
    from httpx._content import Any, ByteStream, json_dumps

    def encode_json(json: Any) -> tuple[dict[str, str], ByteStream]:
        # Pass pre-encoded bytes through untouched; encode anything else as usual.
        body = json if isinstance(json, bytes) else json_dumps(json).encode("utf-8")
        content_length = str(len(body))
        content_type = "application/json"
        headers = {"Content-Length": content_length, "Content-Type": content_type}
        return headers, ByteStream(body)

    # This makes the above function look like the original. The closure already
    # holds every name it needs, so this module's globals are left untouched.
    encode_json.__module__ = httpx._content.__name__
    httpx._content.encode_json = encode_json

_patch_httpx()

tomchristie commented 5 months ago

I'm tired of waiting for this feature for more than 4 years...

So, here's an API proposal.

Yep, I'll happily help someone get a pull request merged against that proposal.

q0w commented 4 months ago

Does it mean that httpx.Client should be now Generic[RequestType, ResponseType] ?

tomchristie commented 4 months ago

Does it mean that httpx.Client should be now Generic[RequestType, ResponseType] ?

Please no. 😬

paulofreitas commented 3 months ago

One consideration that I haven't seen proposed - it's entirely reasonable for this library to check if the value of json is already encoded to bytes. In that case, you can skip calling json.dumps here: [...]

Custom request & response classes would be great, but this trivial change would also be useful. 👍