RDFLib / rdflib

RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
https://rdflib.readthedocs.org
BSD 3-Clause "New" or "Revised" License

Add optional orjson support for faster json reading and writing #2854

Closed ashleysommer closed 2 months ago

ashleysommer commented 2 months ago

This adds optional support for orjson. It can be enabled by installing rdflib with pip extras syntax, like rdflib[orjson], or with poetry extras syntax, like --extras orjson. Alternatively, it will be detected and used automatically if you simply install orjson>=3.9.14 in your Python environment.
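The detect-and-fall-back behaviour described above can be sketched roughly as follows. This is a minimal illustration of the general optional-dependency pattern, not the code in this PR: the helper names dumps_bytes and loads_any and the _HAS_ORJSON flag are hypothetical, and the sketch does not check the orjson version.

```python
import json
from typing import Any

# Try the optional fast path first; fall back to the standard library.
try:
    import orjson  # optional dependency, only used if it is installed

    _HAS_ORJSON = True
except ImportError:
    orjson = None
    _HAS_ORJSON = False


def dumps_bytes(obj: Any) -> bytes:
    """Serialize obj to UTF-8 encoded JSON bytes (hypothetical helper)."""
    if _HAS_ORJSON:
        # orjson.dumps returns bytes directly.
        return orjson.dumps(obj)
    # The stdlib json module returns str, so encode it explicitly.
    return json.dumps(obj, ensure_ascii=False).encode("utf-8")


def loads_any(data: Any) -> Any:
    """Parse JSON from str or bytes (hypothetical helper)."""
    if _HAS_ORJSON:
        return orjson.loads(data)
    return json.loads(data)
```

One consequence worth noting is that orjson produces bytes while the standard json module produces str, so callers need a single, consistent return type whichever backend is active.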

This PR touches a lot of files, because JSON is used in a surprising number of different places in rdflib.

There are also some tangential, non-JSON changes to stream handling in a number of other SPARQLResult serializers. While implementing the orjson support for the sparql-results-json serializer, I found some errors in the way all of the different Sparql-Results-Serializers treat TextIO and BinaryIO streams. This was causing 7 errors to be thrown by the rdflib serializer tests, but they were marked as ignored in the test suite.

These additional changes include much better typing for the Sparql-Results-Serializer subclasses, which exposed where the problems were (hooray for typed Python exposing actual errors). Fixes were made to all of the failing Sparql-Results-Serializer subclasses, and there are no skipped tests now. This also allowed the removal of a bunch of mypy type: ignore comments that were in place to silence the complaining type checker.
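To illustrate the kind of TextIO versus BinaryIO mismatch discussed here, below is a minimal sketch of a writer that handles both stream kinds. It is an assumption-laden example rather than rdflib's serializer code: the function name write_json is hypothetical, and checking against io.TextIOBase is just one common way to tell text streams from binary ones.

```python
import io
import json
from typing import IO, Any


def write_json(obj: Any, stream: IO[Any], encoding: str = "utf-8") -> None:
    """Write obj as JSON to either a text or a binary stream (hypothetical helper)."""
    text = json.dumps(obj)
    if isinstance(stream, io.TextIOBase):
        # Text streams accept str; writing bytes here would raise TypeError.
        stream.write(text)
    else:
        # Binary streams accept bytes; writing str here would raise TypeError.
        stream.write(text.encode(encoding))


text_stream = io.StringIO()
write_json({"x": 1}, text_stream)

binary_stream = io.BytesIO()
write_json({"x": 1}, binary_stream)
```

Annotating the stream parameter precisely is what lets a type checker flag a str being written to a binary stream, or bytes to a text stream, before it fails at runtime.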

I know it would be great to move all those additional changes to a different PR, but there are two reasons I didn't: 1) The addition of the orjson feature relies on those typing changes and BinaryIO stream fixes. 2) The sparql-results-serializer fixes for the sparql-results-json (SparqlResultsJson) subclass specifically are too entangled with the orjson feature to be extracted easily.

Fixes #2784

ashleysommer commented 2 months ago

Looks like I need to update this a bit to resolve some conflicts with the changes that were introduced with the "JSON-LD from HTML" feature that was recently merged. I'll get to that later today or tomorrow.

coveralls commented 2 months ago


coverage: 90.633% (-0.1%) from 90.748% when pulling 14e4f956d896590c79d3db5ff7ee4fdd1627879c on orjson_support into 563dfccc4bdfa5df566907988322d497af103e33 on main.

ashleysommer commented 2 months ago

Finally, all tests and lints are passing. @nicholascar I'm merging this now.