HENNGE / aiodynamo

Asynchronous, fast, pythonic DynamoDB Client
https://aiodynamo.readthedocs.io/
Other
69 stars 20 forks source link

Add orjson support #90

Closed Tinche closed 2 years ago

Tinche commented 2 years ago

Preliminary PR for optional orjson support. Had to change the http adapter contract slightly.

ojii commented 2 years ago

Thank you for your pull request. Just an idea, but should json loads/dumps just be a configurable option on client and you can bring your own json library?

Tinche commented 2 years ago

Thank you for your pull request. Just an idea, but should json loads/dumps just be a configurable option on client and you can bring your own json library?

That was my initial idea, and we could go that way. I feel like it's overcomplicating it though, the speed difference between the faster implementations is (in my experience) minimal, and this is easier for our users ("if you want a speed up, install orjson and be done with it"). Let me know your thoughts.

ojii commented 2 years ago

That was my initial idea, and we could go that way. I feel like it's overcomplicating it though, the speed difference between the faster implementations is (in my experience) minimal, and this is easier for our users ("if you want a speed up, install orjson and be done with it"). Let me know your thoughts.

The "just install that for better speed" is a very good selling point. As is less configuration options.

My concern is that "faster" JSON libraries pop up somewhat regularly. For example, according to https://pythonrepo.com/repo/TeskaLabs-cysimdjson-python-josn there's already a library (or rather, three libraries) that is much faster than orjson. So shouldn't we pick that? And what happens in a few months when evenfasterjson.py is released?

But then again, JSON parsing isn't exactly the bottleneck here as far as I know, so maybe the minor speedup from an optional orjson is good enough?

@dimaqq you probably have some opinions on this

Tinche commented 2 years ago

I actually tried comparing cysimdjson (they really need a better name) against ujson and orjson independently (I was interested in using it to emit logs) and in my simple benchmarks orjson came out faster. I mean, these are all valid questions and I went with the simplest solution that gets some results.

dimaqq commented 2 years ago

Performance test code: https://github.com/Tinche/aiodynamo/pull/1

ojii commented 2 years ago

I'm sorry, but I'm going to close this PR without merging. I don't think the benefits, minuscule speedup, outweigh the costs, an extra (optional) dependency which would need to be tracked & tested, a change in the HTTP interface and extra documentation.

However, there is already a way you can switch the JSON implementation if you so desire, by creating your own implementation of the HTTP interface which uses an alternative JSON library. This would only apply to API calls and not credentials fetching, though credential fetching should happen relatively rarely. If credentials fetching JSON decoding somehow is a bottleneck, the Credentials interface is also free to be implemented by custom classes which could use different JSON decoders.

While I am not merging this PR, thank you nevertheless for your contribution and I hope this doesn't discourage you from contributing in the future.