Open victoraugustolls opened 4 years ago
You'd need to do that explicitly. I think it'd look like this to encode the request...
httpx.post(headers={'Content-Type': 'application/json'}, data=orjson.dumps(...))
...and like this, to decode the response:
orjson.loads(response.text)
Alternatively, for a more automated solution, you could probably get away with a sys.modules hack? 😅
Here's an example. It uses a wrapper module to add verification-only print statements, but you can skip it and just use sys.modules["json"] = orjson.
# spy_orjson.py
import orjson

def loads(text):
    print("It works! Loading...")
    return orjson.loads(text)

def dumps(text):
    print("It works! Dumping...")
    return orjson.dumps(text)
# main.py
import sys
import spy_orjson

# Must run before httpx is first imported, so that httpx's own
# "import json" resolves to the wrapper module.
sys.modules["json"] = spy_orjson

import httpx

request = httpx.Request("GET", "https://example.org")
content = b'{"message": "Hello, world"}'
response = httpx.Response(
    200, content=content, headers={"Content-Type": "application/json"}, request=request
)
print(response.json())
Output:
$ python main.py
It works! Loading...
That's great! Thanks!!
@florimondmanca such an ugly hack...
Why not implement this feature? Looking at the orjson and simdjson libraries, this may be used to improve performance a lot.
I'll try to implement this.
I'd be okay with us providing an easy way to patch this in, if it follows something similar to how requests allows for this... https://github.com/psf/requests/issues/1595
I'm looking at the code currently, don't see an easy way...
Probably I'll create a httpx.jsonlib with loads and dumps, which may be overridden later. Not the cleanest solution, but it will allow using, e.g., orjson for both loads and dumps (which returns bytes), or simdjson for loads and orjson for dumps.
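To make the idea of an overridable loads/dumps pair concrete, here is a rough, standard-library-only sketch. JsonLib, encode, decode, and compact_dumps are hypothetical names and not part of httpx. The point it illustrates is that a pluggable layer has to normalise the return type, because json.dumps returns str while orjson.dumps returns bytes.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Union


def _to_bytes(data: Union[str, bytes]) -> bytes:
    # json.dumps returns str, orjson.dumps returns bytes; normalise to bytes.
    return data.encode("utf-8") if isinstance(data, str) else data


@dataclass(frozen=True)
class JsonLib:
    """Hypothetical holder for a pluggable loads/dumps pair."""

    loads: Callable[[Union[str, bytes]], Any] = json.loads
    dumps: Callable[[Any], Union[str, bytes]] = json.dumps

    def encode(self, obj: Any) -> bytes:
        return _to_bytes(self.dumps(obj))

    def decode(self, data: Union[str, bytes]) -> Any:
        return self.loads(data)


def compact_dumps(obj: Any) -> bytes:
    # Stand-in for a bytes-returning encoder such as orjson.dumps.
    return json.dumps(obj, separators=(",", ":")).encode("utf-8")


# Defaults to the standard library; loads and dumps can be mixed freely.
default_lib = JsonLib()
mixed_lib = JsonLib(loads=json.loads, dumps=compact_dumps)
```

Keeping the pair in an explicit object, rather than in module-level state, means each caller decides which instance to use.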
There were two PRs closed because they were stale, so I'm going to just reopen this one for us to have a conclusion.
What about adding a new parameter to the client? Something like json_lib? Was this discarded already?
import httpx
import orjson
httpx.Client(json_lib=orjson)
Maybe it'd be best to be able to specify dumps/loads separately, both for user control and to avoid doing getattr to get the dumps/loads methods? Perhaps:
httpx.Client(json_encoder=orjson.dumps, json_decoder=orjson.loads)
It would be great to use the same names as Pydantic:
httpx.Client(json_loads=orjson_loads, json_dumps=orjson_dumps)
If somebody else comes across this: to be compatible with mypy and have correct typing, one has to use content instead of data as suggested originally:
httpx.post(headers={'Content-Type': 'application/json'}, content=orjson.dumps(...))
3 years later... there is even a pull request implementing this, https://github.com/encode/httpx/pull/1352, so why was it closed?
@xbeastx I think it's quite clearly articulated at https://github.com/encode/httpx/pull/1352#issuecomment-845817581 why that pull request went stale.
There is some additional helpful context at https://github.com/encode/httpx/pull/1730#issuecomment-874011751 and discussion at https://github.com/encode/httpx/discussions/1740
I've not been sufficiently happy with any of the API proposals so far, and I've essentially been vetoing them.
Let me nudge something here that could be viable(?)...
client = httpx.Client(request_class=..., response_class=...)
I can explain why I (potentially) like that if needed. Perhaps the design sense will speak for itself.
Edit 8th Sept 2023:
That API would allow for this kind of customization...
import httpx
import orjson

class APIRequest(httpx.Request):
    def __init__(self, *args, **kwargs):
        if 'json' in kwargs:
            # Encode the body with orjson instead of the stdlib json module.
            content = orjson.dumps(kwargs.pop('json'))
            headers = dict(kwargs.get('headers') or {})
            headers['Content-Type'] = 'application/json'
            headers['Content-Length'] = str(len(content))  # header values must be str
            kwargs['content'] = content
            kwargs['headers'] = headers
        super().__init__(*args, **kwargs)

class APIResponse(httpx.Response):
    def json(self):
        return orjson.loads(self.content)

# Defined last so that APIRequest and APIResponse already exist.
class APIClient(httpx.Client):
    request_class = APIRequest
    response_class = APIResponse
Perhaps what we need here is custom models: Headers, Request, Response, Cookies (in _models.py)
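# _models.py:
from dataclasses import dataclass

@dataclass
class ClientModels:
    # Each field holds a class (not an instance), hence type[...].
    headers: type[Headers] = Headers
    request: type[Request] = Request
    response: type[Response] = Response
    cookies: type[Cookies] = Cookies

DEFAULT_MODELS = ClientModels()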
Then we pass our custom models to top level client instance:
class OrjsonResponse(httpx.Response):
    def json(self, **kwargs):
        # orjson.loads() takes no keyword options, so kwargs are accepted
        # only to match the httpx.Response.json() signature.
        return orjson.loads(self.text)

models = httpx.ClientModels(response=OrjsonResponse)
with httpx.Client(models=models) as c:
    resp = c.get("https://example.org")
    fast_json = resp.json()
I'm hitting this issue as I type, and yikes, this is so complicated. 95% of use-cases would be solved if you could just do something like httpx.set_json_handlers(loads=orjson.loads, dumps=orjson.dumps). I'm not doing this on a per-Client basis. If I'm using orjson, I'm using orjson everywhere. Also, if I need to be fancy, I can wrap it up in another function.
95% of use-cases would be solved if you could just do something like
httpx.set_json_handlers(loads=orjson.loads, dumps=orjson.dumps)
I do see that. The issue with that approach is that you introduce subtly different JSON handling at a distance. Installing a new dependency in your project could end up altering the behaviour of an API client without that being visible anywhere obvious in the project codebase.
I'm not doing this on a per Client basis.
Do you have more than one client instance across the codebase?
This is a very normal situation in a microservice environment. It is one reason this issue exists.
This comment suggests an API that I wouldn't object to.
Once you've added that code you'd be able to use APIClient instead of httpx.Client everywhere throughout the project.
It's not exactly what some of y'all are requesting, but the critical sticking point here is this: I can't see myself doing anything other than veto'ing proposals that use a form of global state.
I strongly agree that global state is not a good path forward for the library. I like the request_class and response_class approach. That would also help with some other issues like custom wrappers for response errors.
For those who want to configure the JSON library globally in your projects, it'd be trivial to subclass the httpx.Client as described or wrap client retrieval in a helper method.
Do you have more than one client instance across the codebase?
Well yes, I'm doing with httpx.AsyncClient() as client all the time.
It'd be trivial to subclass the httpx.Client as described or wrap client retrieval in a helper method.
That's probably what I'll do, just wrap the client. It's not immediately obvious that this is what you should do, though. Maybe make it a recipe in the docs? At least until there is a settled solution.
Overall, I'll say this is a classic case of pragmatism vs purity and I'm not sure a convenience function is where you want to spend cycles achieving purity. But that's not for me to say, and I appreciate your hard work and trust you'll make the best decision. Thank you.
+1. We have datetime.date objects in our JSON, and while there's nothing wrong with writing a JSON encoder, it seems far from ideal to have to do client.post(..., data=json.dumps(..., cls=MyEncoder)) every single time.
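One way to avoid repeating the encoder at every call site is a small helper that builds the request keyword arguments once. The sketch below uses only the standard library; DateEncoder and json_body are hypothetical names, and ISO 8601 strings are an assumed wire format for dates.

```python
import datetime
import json
from typing import Any, Dict


class DateEncoder(json.JSONEncoder):
    """Serialise date and datetime values as ISO 8601 strings."""

    def default(self, o: Any) -> Any:
        # datetime.datetime is a subclass of datetime.date, so one check covers both.
        if isinstance(o, datetime.date):
            return o.isoformat()
        return super().default(o)  # raises TypeError for unsupported types


def json_body(payload: Any) -> Dict[str, Any]:
    """Build the keyword arguments for a request carrying a JSON body."""
    return {
        "content": json.dumps(payload, cls=DateEncoder).encode("utf-8"),
        "headers": {"Content-Type": "application/json"},
    }


kwargs = json_body({"due": datetime.date(2024, 1, 31)})
```

A call site would then read something like client.post(url, **json_body(payload)).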
My use case is sending pydantic models from my test to a webapp using the httpx client. I am currently doing a roundtrip conversion for each params or json argument to get rid of custom types.
@chbndrhnns well, your case seems to be simple: https://docs.pydantic.dev/latest/concepts/serialization/#modelmodel_dump_json -- just overload this (or model_dump, depending on your needs)
your case seems to be simple:
Ok, let's take this as a simplified example for my use case:
import httpx
from pydantic import BaseModel

def test():
    class Address(BaseModel):
        zip: str
        street: str
        city: str

    payload = {
        "name": "me",
        "address": Address(zip="0000", street="this street", city="my city"),
    }
    _ = httpx.post("http://127.0.0.1:8000/", json=payload)
It fails unless I call model_dump_json() on each value which is not a stdlib type:
E TypeError: Object of type Address is not JSON serializable
It sounds like you don't own this web app, but normally you should define the payload in pydantic as well and call model_dump() on the root. Easiest way to solve the problem IMO. Then if you want some mental gymnastics, you can override the httpx function to call model_dump() on pydantic models, but that may be a step too far for some maintainers.
If that sounds tedious to you, consider that strong typing is generally a compromise of being tedious in exchange for being correct. Pydantic 2 also introduced model_serializer and field_serializer, so you don't have to override JSONEncoder.
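For payloads that mix plain dicts with models, one option is a default= hook for json.dumps that unwraps anything exposing model_dump(). The sketch below is standard-library-only: Address is a stand-in class rather than a real Pydantic model, so the example runs without pydantic, and to_jsonable is a hypothetical name. With real Pydantic v2 models, model_dump(mode="json") may be the better call when fields hold dates or similar types.

```python
import json
from typing import Any


def to_jsonable(obj: Any) -> Any:
    """json.dumps default= hook: unwrap objects that expose model_dump()."""
    dump = getattr(obj, "model_dump", None)
    if callable(dump):
        return dump()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


class Address:
    """Stand-in for a Pydantic v2 model (assumption: only model_dump is needed)."""

    def __init__(self, zip: str, street: str, city: str) -> None:
        self.zip, self.street, self.city = zip, street, city

    def model_dump(self) -> dict:
        return {"zip": self.zip, "street": self.street, "city": self.city}


payload = {
    "name": "me",
    "address": Address(zip="0000", street="this street", city="my city"),
}
# The hook is only called for values json cannot serialise on its own.
body = json.dumps(payload, default=to_jsonable).encode("utf-8")
```

The resulting bytes can be sent with content=body plus a Content-Type: application/json header.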
having the ability to customize the response class will be great, any schedule on its implementation? :)
One consideration that I haven't seen proposed: it's entirely reasonable for this library to check if the value of json is already encoded to bytes. In that case, you can skip calling json.dumps here: https://github.com/encode/httpx/blob/7354ed70ceb1a0f072af82e2cb784ef6b2512ed3/httpx/_content.py#L176-L181
so it could look something like:
def encode_json(json: Any) -> tuple[dict[str, str], ByteStream]:
    body = json if isinstance(json, bytes) else json_dumps(json).encode("utf-8")
    content_length = str(len(body))
    content_type = "application/json"
    headers = {"Content-Length": content_length, "Content-Type": content_type}
    return headers, ByteStream(body)
This would allow developers the ability to handle json encoding and decoding external to the library, so we could do something like this:
import httpx
import orjson

mydata = {...}
# httpx has no ClientSession (that is aiohttp's name); the equivalent is httpx.Client.
with httpx.Client() as client:
    encoded = orjson.dumps(mydata)
    response = client.post("https://fake.url/data/", json=encoded)
    result = orjson.loads(response.content)
I know for a fact aiohttp does something similar, so there is precedent here. (It also allows you to pass in a json encoder and decoder, but we've been down that road here.)
If this seems like a reasonable change, I'm happy to make the requisite PR.
Any news? Why not just borrow a solution from aiohttp?
aiohttp.ClientSession(json_serialize=..., ...)
🌟 Introducing HTTPJ! 🚀 It's like HTTPX, but with built-in support for flexible JSON serialization/deserialization!
pip install httpj orjson
import datetime
import pprint
import httpj
import orjson
resp = httpj.post(
    "https://postman-echo.com/post",
    json={"dt": datetime.datetime.utcnow()},
    json_serialize=lambda j: orjson.dumps(j, option=orjson.OPT_NAIVE_UTC),  # optional
    json_deserialize=orjson.loads,  # optional
)
pprint.pprint(resp.json(), indent=4)
p.s.: I'm tired of waiting for this feature for more than 4 years...
I added the following snippet to my client module to allow for passing in bytes as json. Not a huge fan of monkey-patching, but it gets the job done.
import httpx

def _patch_httpx():  # type: ignore
    """Monkey-patch httpx so that we can use our own json ser/des.

    https://github.com/encode/httpx/issues/717
    """
    from httpx._content import Any, ByteStream, json_dumps

    def encode_json(json: Any) -> tuple[dict[str, str], ByteStream]:
        # Bytes are assumed to be already-encoded JSON and are passed through.
        body = json if isinstance(json, bytes) else json_dumps(json).encode("utf-8")
        content_length = str(len(body))
        content_type = "application/json"
        headers = {"Content-Length": content_length, "Content-Type": content_type}
        return headers, ByteStream(body)

    # This makes the above function look and act like the original.
    encode_json.__globals__.update(httpx._content.__dict__)
    encode_json.__module__ = httpx._content.__name__
    httpx._content.encode_json = encode_json

_patch_httpx()
I'm tired of waiting for this feature for more than 4 years...
So, here's an API proposal.
Yep, I'll happily help someone get a pull request merged against that proposal.
Does it mean that httpx.Client should be now Generic[RequestType, ResponseType]?
Please no. 😬
One consideration that I haven't seen proposed - it's entirely reasonable for this library to check if the value of json is already encoded to bytes. In that case, you can skip calling json.dumps here: [...]
Custom request & response classes would be great, but this trivial change would also be useful. 👍
Hi!
Is there a way to use an alternative JSON library to decode the request? Like orjson for example?
Thanks!