kislyuk / eight

Python 2 to the power of 3
https://eight.readthedocs.org/
Apache License 2.0

mode='t' support for gzip, bz2 & lzma #1

Open maxgrenderjones opened 9 years ago

maxgrenderjones commented 9 years ago

The compressed-file modules (gzip, bz2, lzma) all added a 't' mode for opening files in text mode at the same time 't' was added as a mode for plain vanilla open. Is it possible to monkey-patch them so that they support it too?
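For reference, here is a minimal sketch of the text-mode behaviour being asked for, as it works natively on Python 3.3 and later. The `roundtrip_text` helper and the file names are made up for illustration and are not part of eight.

```python
import bz2
import gzip
import lzma
import os
import tempfile


def roundtrip_text(opener, path, text, encoding="utf-8"):
    """Write text with mode 'wt', then read it back with mode 'rt'.

    Hypothetical helper for illustration only. `opener` is expected to be
    gzip.open, bz2.open or lzma.open as shipped with Python 3.3+.
    """
    with opener(path, "wt", encoding=encoding) as f:
        f.write(text)
    with opener(path, "rt", encoding=encoding) as f:
        return f.read()


# Round-trip the same text through each compressed format.
sample = "first line\nsecond line\n"
tmpdir = tempfile.mkdtemp()
results = {}
for suffix, opener in [("gz", gzip.open), ("bz2", bz2.open), ("xz", lzma.open)]:
    path = os.path.join(tmpdir, "demo.txt." + suffix)
    results[suffix] = roundtrip_text(opener, path, sample)
```

On Python 2.7 the same calls fail, because `gzip.open` there does not accept an `encoding` argument and neither `bz2.open` nor the `lzma` module exists.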

P.S. Sadly, the answer isn't as simple as wrapping them in a TextIOWrapper, as they only partially support the file protocol:

>>> import gzip, io, sys
>>> sys.version
'2.7.9 (default, Apr  7 2015, 07:58:25) \n[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]'
>>> with gzip.open('foo.gz', 'r') as gz:
...     with io.TextIOWrapper(gz, encoding='utf8') as f:
...         for line in f:
...             print(line)
... 
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
io.UnsupportedOperation: read1

(Same issue on Python 3.2; works fine on 3.3.)

One workaround that works for me (found at https://bugs.python.org/issue12591) is to alias read1 to read, like so:

>>> with gzip.open('passwd.gz', 'r') as gz:
...     gz.read1 = gz.read  # alias the unsupported read1 to read
...     with io.TextIOWrapper(gz, encoding='utf8') as f:
...         for line in f:
...             print(line)
...

maxgrenderjones commented 9 years ago

Having spent a while trying to fix this, I found a series of challenges:

One option would be to use external packages that work around this issue:

kislyuk commented 9 years ago

Thanks for looking into it! My philosophy is actually to only support 2.7 and 3.3+. I'll play around with a possible set of wrappers today, but the challenges you found are good to know.
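As a rough illustration of what such a wrapper might look like for gzip, here is a minimal sketch that combines the read1 workaround from the first comment with mode handling. The name `gzip_open_compat` is hypothetical, and this is not eight's actual implementation. It assumes, as reported in this thread, that the read1 alias is needed before Python 3.3 and is enough to make TextIOWrapper work on 2.7.

```python
import gzip
import io
import os
import sys
import tempfile


def gzip_open_compat(filename, mode="rb", encoding=None, errors=None, newline=None):
    """Open a gzip file, accepting a 't' in `mode` on Python 2.7 and 3.3+.

    Hypothetical helper: not part of eight or the standard library.
    """
    if "t" not in mode:
        # Binary modes are passed straight through.
        return gzip.open(filename, mode)
    # Open the underlying compressed stream in binary mode.
    gz = gzip.open(filename, mode.replace("t", "") + "b")
    if sys.version_info < (3, 3):
        # Assumption taken from the thread: before 3.3, GzipFile.read1
        # raises UnsupportedOperation, so alias it to read.
        gz.read1 = gz.read
    # Layer decoding/encoding on top of the binary stream.
    return io.TextIOWrapper(gz, encoding=encoding, errors=errors, newline=newline)


# Example usage with a throwaway file.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt.gz")
with gzip_open_compat(demo_path, "wt", encoding="utf-8") as f:
    f.write(u"first line\nsecond line\n")
with gzip_open_compat(demo_path, "rt", encoding="utf-8") as f:
    lines = [line.rstrip("\n") for line in f]
```

A version check is used rather than `hasattr(gz, "read1")` because the traceback above shows that read1 exists on 2.7 but raises when called. Extending the same idea to bz2 and lzma would need separate handling, since those modules differ more between 2.7 and 3.3+.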