Open jaraco opened 9 years ago
I had a piece of code which I distilled to this:
import pickle
with open('out.pickle', 'w') as out:
    pickle.dump(b'data', out)
Running that code raises this error:
TypeError: must be str, not bytes
The error is raised at the dump call with no additional context. Based on the error, my reaction is to think that pickle doesn't support bytes objects in pickles.
On further examination, the problem is not that bytes cannot be pickled, but that the 'dump' call requires the file to be opened in binary mode ('wb'). The error message is misleading for three reasons: it essentially says "expecting a text string", it is unclear that the error is raised during the write to the stream, and the JSON library expects an output stream to be opened in text mode.
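To make the distinction concrete, here is a minimal sketch that uses in-memory streams as stand-ins for real files (an assumption for illustration: `io.BytesIO` plays the role of a file opened with 'wb', `io.StringIO` the role of one opened with 'w'). The exact wording of the TypeError varies by stream type and Python version, so the sketch only captures it.

```python
import io
import pickle

# Binary stream, standing in for open('out.pickle', 'wb'): the dump succeeds.
binary_stream = io.BytesIO()
pickle.dump(b'data', binary_stream)

# The bytes object round-trips without trouble.
binary_stream.seek(0)
restored = pickle.load(binary_stream)

# Text stream, standing in for open('out.pickle', 'w'): the dump fails
# when pickle hands bytes to the stream's write() method.
text_stream = io.StringIO()
failure = None
try:
    pickle.dump(b'data', text_stream)
except TypeError as exc:
    failure = exc
```

So the bytes payload is fine; only the mode of the target stream decides whether the call succeeds.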
At least two other people think the behavior could be clearer.
Would it be possible and reasonable to trap a TypeError at the call to .write and replace or augment the message with something like "file must be opened in binary mode"?
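As a rough illustration of that idea, and not a proposal for how pickle itself should implement it, the trap can be sketched in user code. The name `dump_with_hint` is hypothetical, and checking for `io.TextIOBase` is an assumption about how one might recognize a text-mode file.

```python
import io
import pickle


def dump_with_hint(obj, file, **kwargs):
    """Hypothetical wrapper: pickle.dump with a clearer error for text-mode files."""
    try:
        pickle.dump(obj, file, **kwargs)
    except TypeError as exc:
        # Only augment the message when the target is a text stream;
        # any other TypeError propagates unchanged.
        if isinstance(file, io.TextIOBase):
            raise TypeError(
                "file must be opened in binary mode ('wb'), not text mode"
            ) from exc
        raise
```

Chaining with `from exc` keeps the original exception visible in the traceback, so the augmented message adds context without hiding where the failure happened.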
On second thought, perhaps the culprit isn't pickle here, but the stream writer. Perhaps the .write method should provide a clearer message indicating the context in which it's expecting str and not bytes.
I think it is worth improving the error message in the write() method of binary files.
>>> sys.stdout.write(b'')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: write() argument must be str, not bytes
>>> sys.stdout.buffer.write('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: a bytes-like object is required, not 'str'
>>> sys.stdout.buffer.raw.write('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: a bytes-like object is required, not 'str'
But this is a large issue. Other file-like objects (GzipFile, ZipExtFile, etc.) should be updated too.
As I said in bpo-38226, the error message you get when you try to pickle.load() a file opened in "r" mode instead of "rb" mode,
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
is also confusing and can be improved.
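For the load side, one possible user-level workaround is to check the stream type before unpickling, so the failure names the real cause instead of surfacing as a decoding error. This is only a sketch: `load_with_hint` is a hypothetical name, and the `io.TextIOBase` check is an assumption about how to detect a file opened in "r" mode.

```python
import io
import pickle


def load_with_hint(file, **kwargs):
    """Hypothetical wrapper: refuse text-mode files before unpickling."""
    if isinstance(file, io.TextIOBase):
        # Fail early with an explicit message rather than letting the
        # text layer try to decode pickle bytes.
        raise TypeError("file must be opened in binary mode ('rb'), not text mode")
    return pickle.load(file, **kwargs)


# Usage with an in-memory binary stream standing in for open(..., 'rb').
payload = io.BytesIO(pickle.dumps(b'data'))
loaded = load_with_hint(payload)
```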
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at = None
created_at =
labels = ['type-feature', '3.8', 'expert-IO']
title = 'Misleading TypeError when pickling bytes to a file opened as text'
updated_at =
user = 'https://github.com/jaraco'
```
bugs.python.org fields:
```python
activity =
actor = 'stein-k'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['IO']
creation =
creator = 'jaraco'
dependencies = []
files = []
hgrepos = []
issue_num = 24159
keywords = []
message_count = 3.0
messages = ['242858', '288157', '355816']
nosy_count = 7.0
nosy_names = ['jaraco', 'pitrou', 'alexandre.vassalotti', 'joncle', 'serhiy.storchaka', 'stein-k', 'boris']
pr_nums = []
priority = 'low'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue24159'
versions = ['Python 3.8']
```