Open Yhg1s opened 1 year ago
Yes, this is confusing! When a file consumer (a C++-based package in my case) works with the fileno() directly, and you want to add a wrapper like LZMAFile for transparent decompression, it will give you errors because the data is compressed. A fileno should only be provided if it gives access to the decompressed data (for example, by going through an OS pipe object).
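A minimal sketch of the read-side problem, assuming an xz file written to a temporary directory (the helper name `read_via_fileno` and the payload are made up for illustration). Reading from the descriptor returned by `fileno()` yields the compressed container bytes, not the decompressed data the wrapper's own `read()` returns:

```python
import lzma
import os
import tempfile

# Magic bytes that open every .xz container.
XZ_MAGIC = b"\xfd7zXZ\x00"


def read_via_fileno(payload=b"plain text payload\n"):
    """Return (bytes read through fileno(), bytes read through the wrapper)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.xz")

        # Write a small compressed file.
        with lzma.open(path, "wb") as f:
            f.write(payload)

        # A consumer that uses the descriptor directly bypasses the
        # decompressor and sees the raw xz stream.
        with lzma.open(path, "rb") as f:
            raw = os.read(f.fileno(), len(XZ_MAGIC))

        # Reading through the wrapper gives the decompressed payload.
        with lzma.open(path, "rb") as f:
            decoded = f.read()

    return raw, decoded


if __name__ == "__main__":
    raw, decoded = read_via_fileno()
    print("via fileno():", raw)
    print("via read():  ", decoded)
```

A consumer written against file descriptors has no way to tell that the bytes it receives still need decompressing.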
Duplicate of #68546
The various compressing/decompressing file wrappers (`bz2.BZ2File`, `gzip.GzipFile`, `lzma.LZMAFile`) currently have `fileno` methods that return the underlying file descriptor: https://github.com/python/cpython/blob/0a4c82ddd34a3578684b45b76f49cd289a08740b/Lib/bz2.py#L126-L129

I imagine this was done because it seemed useful, but I'm not sure what use it is. You can't safely use things like `select` since the compression/decompression might buffer, and passing it to things that use the file descriptor directly will produce garbage (when reading) or corrupt the file (when writing).

An example of how misleading this can be, courtesy of @ericfrederich:
Note the (empty) bz2 data after the data written by the subprocess.
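The write-side behaviour can be sketched roughly as follows. This is not the example referred to above, only a minimal reproduction under the assumption that the wrapper is handed to a child process as its stdout (the helper name `child_output_then_bz2` and the temporary file are made up for illustration). `subprocess` calls `fileno()` on the wrapper, so the child writes uncompressed bytes straight to the underlying file, and closing the `BZ2File` then appends an empty bz2 stream:

```python
import bz2
import os
import subprocess
import sys
import tempfile


def child_output_then_bz2():
    """Return the raw bytes of a .bz2 file a child process wrote 'into'."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "out.bz2")

        with bz2.open(path, "wb") as f:
            # subprocess uses f.fileno(), so the child's output bypasses
            # the compressor and lands in the file as plain text.
            subprocess.run(
                [sys.executable, "-c", "print('hello from child')"],
                stdout=f,
                check=True,
            )
        # Closing the BZ2File flushes the compressor, which never saw any
        # data, so only an empty bz2 stream is written after the text.

        with open(path, "rb") as raw:
            return raw.read()


if __name__ == "__main__":
    print(child_output_then_bz2())
```

The resulting file is neither plain text nor a valid bz2 file, since the uncompressed output precedes the bz2 header.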
Am I missing a situation where this is actually useful? If there isn't one, can we consider adding a warning for the confusing behaviour?