RDFLib / rdflib

RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
https://rdflib.readthedocs.org
BSD 3-Clause "New" or "Revised" License

Fix and extend implementation of `BytesIOWrapper` #2853

Closed by ashleysommer 2 months ago

ashleysommer commented 2 months ago

Fix BytesIOWrapper to make it more complete and more useful, compliant with Python's BufferedIOBase base class and BinaryIO typing, and compatible with Typeshed's _WrappedBuffer protocol.

This started out as a small fix for incorrect handling of an edge case in FileInputSource, but I found it wasn't so simple, because BytesIOWrapper only wrapped str instances, not TextIO streams. (I expected it would, because its counterpart TextIOWrapper does wrap BinaryIO streams.) While diving in to fix that too, I also found that many methods on BytesIOWrapper were either raising NotImplementedError() or simply missing. So BytesIOWrapper was not aligned with its superclass BufferedIOBase, and not compliant with the Python BinaryIO interface.
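For reference, the counterpart behaviour mentioned above (TextIOWrapper wrapping a binary stream) looks like this using only the standard library. The sample data is made up for illustration.

```python
import io

# A binary stream holding UTF-8 encoded text (illustrative data only).
binary_stream = io.BytesIO("résumé\n".encode("utf-8"))

# TextIOWrapper decodes bytes from the underlying binary stream on demand.
text_stream = io.TextIOWrapper(binary_stream, encoding="utf-8")

first_line = text_stream.readline()
```

BytesIOWrapper is meant to go the other way: text in, bytes out.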

I implemented all of the missing and unimplemented methods, and reworked it to support wrapping both string instances and TextIO files and streams.

After these changes, BytesIOWrapper can now also wrap a TextIO Unicode file, or a StringIO, as well as the original wrapped str.
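To illustrate the idea of accepting either a str or a text stream, here is a minimal read-only sketch. It is not the code in this PR: the class name `Utf8BytesReader` and its internals are hypothetical, and it implements only `readable()` and `read()`.

```python
import codecs
import io
from typing import TextIO, Union


class Utf8BytesReader(io.BufferedIOBase):
    """Hypothetical read-only bytes view of text; NOT rdflib's BytesIOWrapper."""

    def __init__(self, source: Union[str, TextIO], encoding: str = "utf-8") -> None:
        # A plain str is normalised to a StringIO so both inputs share one code path.
        self._text = io.StringIO(source) if isinstance(source, str) else source
        # An incremental encoder keeps chunked encoding consistent across calls.
        self._encoder = codecs.getincrementalencoder(encoding)()
        self._pending = b""  # encoded bytes not yet handed to the caller

    def readable(self) -> bool:
        return True

    def read(self, size=-1) -> bytes:
        if size is None or size < 0:
            # Read everything that is left in the text source.
            data = self._pending + self._encoder.encode(self._text.read())
            self._pending = b""
            return data
        # Pull characters until enough bytes are buffered or the source is exhausted.
        while len(self._pending) < size:
            chunk = self._text.read(size)  # up to `size` characters, not bytes
            if not chunk:
                break
            self._pending += self._encoder.encode(chunk)
        data, self._pending = self._pending[:size], self._pending[size:]
        return data


from_str = Utf8BytesReader("héllo")
head = from_str.read(2)   # may split a multi-byte character across reads
tail = from_str.read()
from_stream = Utf8BytesReader(io.StringIO("abc")).read()
```

Note that `read(size)` counts bytes while the underlying text source counts characters, which is why the sketch buffers leftover encoded bytes between calls.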

I implemented the name property and methods such as readable() and writable() on BytesIOWrapper because TextIOWrapper expects them. So you could now use BytesIOWrapper as a backing buffer for TextIOWrapper, but really, don't do that.
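A rough sketch of why those members matter: TextIOWrapper queries its buffer for things like `readable()`, `writable()`, `seekable()` and `name`. The class `StrBackedBuffer` below is a hypothetical stand-in, not BytesIOWrapper itself, and it simply delegates to an internal BytesIO.

```python
import io


class StrBackedBuffer(io.BufferedIOBase):
    """Hypothetical stand-in for a bytes view of a str; NOT rdflib's class."""

    def __init__(self, text: str, encoding: str = "utf-8") -> None:
        # Encode eagerly and delegate reads to a BytesIO (a simplification).
        self._inner = io.BytesIO(text.encode(encoding))

    @property
    def name(self) -> str:
        # TextIOWrapper exposes its buffer's name as its own .name
        return "<str-backed buffer>"

    def readable(self) -> bool:
        return True

    def writable(self) -> bool:
        return False

    def seekable(self) -> bool:
        return False

    def read(self, size=-1) -> bytes:
        return self._inner.read(size)

    def read1(self, size=-1) -> bytes:
        return self._inner.read1(size)


# Round trip str -> bytes -> str, purely to show the buffer is accepted.
decoded = io.TextIOWrapper(StrBackedBuffer("α β γ\n"), encoding="utf-8")
round_tripped = decoded.read()
name_seen = decoded.name
```

As noted above, this round trip is pointless in practice; it only demonstrates that a buffer exposing these members is accepted by TextIOWrapper.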

I've also added a lot more typing, and some tests for BytesIOWrapper that were missing, which will hopefully increase our coverage number.

coveralls commented 2 months ago


coverage: 90.748%. first build when pulling 14ad41ea3951769e4c65299dd961ced62e02d60e on bytesio_wrapper_uplift into 2053ecdeb3cfe4f4ae24e385feeea3e1dc442d10 on main.

ashleysommer commented 2 months ago

Fixed failing test introduced by my second commit.

There will be a follow-up task for down the road. The BytesIOWrapper class is now very long and should be moved out of rdflib's parser.py file. It is a helper for InputSource, but it now takes up more than half of parser.py.