python / cpython

The Python programming language
https://www.python.org

Misleading TypeError when pickling bytes to a file opened as text #68347

Open jaraco opened 9 years ago

jaraco commented 9 years ago
BPO 24159
Nosy @jaraco, @pitrou, @avassalotti, @serhiy-storchaka, @stein-k, @verhovsky

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

```python
assignee = None
closed_at = None
created_at =
labels = ['type-feature', '3.8', 'expert-IO']
title = 'Misleading TypeError when pickling bytes to a file opened as text'
updated_at =
user = 'https://github.com/jaraco'
```

bugs.python.org fields:

```python
activity =
actor = 'stein-k'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['IO']
creation =
creator = 'jaraco'
dependencies = []
files = []
hgrepos = []
issue_num = 24159
keywords = []
message_count = 3.0
messages = ['242858', '288157', '355816']
nosy_count = 7.0
nosy_names = ['jaraco', 'pitrou', 'alexandre.vassalotti', 'joncle', 'serhiy.storchaka', 'stein-k', 'boris']
pr_nums = []
priority = 'low'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue24159'
versions = ['Python 3.8']
```

jaraco commented 9 years ago

I had a piece of code which I distilled to this:

import pickle
with open('out.pickle', 'w') as out:
    pickle.dump(b'data', out)

Running that code raises this error:

TypeError: must be str, not bytes

The error is raised at the dump call with no additional context. Based on the error, my first reaction is to think that pickle doesn't support bytes objects.

On further examination, the problem isn't that bytes cannot be pickled; it's that the 'dump' call requires the file to be opened in binary mode ('wb'). The error message is misleading for three reasons: it essentially says "expecting a text string", it's unclear that the error is raised during the write to the stream, and the JSON library expects its output stream to be opened in text mode.
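For contrast, here is a minimal sketch of the two conventions side by side. The temporary paths and the JSON payload are illustrative assumptions, not part of the original report.

```python
import json
import os
import pickle
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    pickle_path = os.path.join(tmp, "out.pickle")
    json_path = os.path.join(tmp, "out.json")

    # pickle writes bytes, so the file must be opened in binary mode.
    with open(pickle_path, "wb") as out:
        pickle.dump(b"data", out)

    # json writes str, so the file must be opened in text mode.
    with open(json_path, "w", encoding="utf-8") as out:
        json.dump({"key": "data"}, out)

    # Read both files back using the matching modes.
    with open(pickle_path, "rb") as src:
        restored = pickle.load(src)
    with open(json_path, encoding="utf-8") as src:
        restored_json = json.load(src)
```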

At least two other people think the behavior could be clearer.

Would it be possible and reasonable to trap a TypeError at the call to .write and replace or augment the message with something like "file must be opened in binary mode"?
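A rough sketch of what that could look like from user code, assuming a hypothetical wrapper named `dump_checked` (this is not an existing pickle API, and a real fix would presumably live inside pickle itself):

```python
import io
import pickle


def dump_checked(obj, file, **kwargs):
    """Hypothetical wrapper around pickle.dump, not part of the stdlib."""
    try:
        pickle.dump(obj, file, **kwargs)
    except TypeError as exc:
        # Only reword the error when the target is a text stream;
        # in every other case the original TypeError propagates unchanged.
        if isinstance(file, io.TextIOBase):
            raise TypeError(
                "file must be opened in binary mode, e.g. open(path, 'wb')"
            ) from exc
        raise
```

Chaining with `from exc` keeps the original "must be str, not bytes" error visible in the traceback, so no information is lost.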

On second thought, perhaps the culprit isn't pickle here, but the stream writer. Perhaps the .write method should provide a clearer message indicating the context in which it's expecting str and not bytes.

serhiy-storchaka commented 7 years ago

I think it is worth improving the error message in the write() method of binary files.

>>> sys.stdout.write(b'')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: write() argument must be str, not bytes
>>> sys.stdout.buffer.write('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: a bytes-like object is required, not 'str'
>>> sys.stdout.buffer.raw.write('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: a bytes-like object is required, not 'str'

But this is a large issue. Other file-like objects (GzipFile, ZipExtFile, etc.) should be updated too.
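To get a feel for how much the wording varies, one could survey a few file-like objects with a small script like the one below. The helper name `write_error` and the choice of streams are assumptions made for illustration; the exact messages depend on the Python version, so none are asserted here.

```python
import gzip
import io


def write_error(stream, payload):
    """Return the TypeError raised by stream.write(payload), or None."""
    try:
        stream.write(payload)
    except TypeError as exc:
        return exc
    return None


# Each pair is (stream, payload of the wrong type for that stream).
with gzip.GzipFile(fileobj=io.BytesIO(), mode="wb") as gz:
    cases = [
        (io.StringIO(), b""),
        (io.BytesIO(), ""),
        (gz, ""),
    ]
    errors = [write_error(stream, payload) for stream, payload in cases]

# Print each stream type next to the message it produced.
for (stream, _), exc in zip(cases, errors):
    print(f"{type(stream).__name__}: {exc}")
```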

d2c7be43-4ab7-496a-b6b2-ef2e8178767a commented 4 years ago

As I said in bpo-38226, the error message you get when you try to pickle.load() a file opened in "r" mode instead of "rb" mode,

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

is also confusing and can be improved.
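One possible shape for a clearer failure on the load side is an up-front check, sketched here as a hypothetical `load_checked` helper (again, not an existing pickle API):

```python
import io
import pickle


def load_checked(file):
    """Hypothetical wrapper around pickle.load, not part of the stdlib."""
    # Reject text streams before reading anything, so the caller never
    # sees a decoding error that hides the real cause.
    if isinstance(file, io.TextIOBase):
        raise TypeError(
            "file must be opened in binary mode, e.g. open(path, 'rb')"
        )
    return pickle.load(file)
```

Checking before the read, rather than catching the exception afterwards, avoids having to guess whether a UnicodeDecodeError came from the mode mismatch or from something else.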