bslatkin / effectivepython

Effective Python: Second Edition — Source Code and Errata for the Book
https://effectivepython.com

Item 74: Index assignment into bytes should use integer value instead of bytes value #81

Open bslatkin opened 4 years ago

bslatkin commented 4 years ago

Got this one via an email. Libor writes:

I have a small nitpick regarding bytes indexing: in the code example preceding the paragraph starting "The bytearray type...", you should put my_bytes[0] = 0x79, since individual items are integers. This behaviour changed between py2 and py3.

Python 3:

   >>> b"hello"[0]
   104

Python 2:

   >>> b"hello"[0]
   'h'

bslatkin commented 4 years ago

It's kind of an odd edge case: bytes instances are immutable, so assigning any value to an index is invalid and raises the same exception regardless of the value's type (something similar happens in Python 2):

>>> my_bytes = b'hello'
>>> my_bytes[0] = b'\x79'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'bytes' object does not support item assignment
>>> my_bytes[0] = object()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'bytes' object does not support item assignment

But I think the point here is that you would expect to be able to assign indexes into a sequence using the same values that you'd get if you read them. So for bytes it should be integers:

>>> list(b'hello')
[104, 101, 108, 108, 111]
>>> b'hello'[0]
104
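As a minimal sketch of that read side (the variable names here are only illustrative): indexing and iterating a bytes object yield integers, while slicing yields bytes, and the integers read out are enough to rebuild the original value.

```python
data = b'hello'

first_item = data[0]      # indexing yields an int
first_slice = data[0:1]   # slicing yields a bytes object of length 1
items = list(data)        # iterating yields ints

# The integers read out can be fed back into the bytes constructor.
rebuilt = bytes(items)

print(first_item, first_slice, items, rebuilt)
```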

What confuses matters is that the bytearray constructor accepts a bytes literal as well as a list of integers (just like the bytes constructor), and the repr() of a bytearray prints its contents like a bytes literal:

>>> x = bytearray(b'foo')
>>> list(x)
[102, 111, 111]
>>> bytearray(list(x))
bytearray(b'foo')

That makes it seem like you could also use a single-byte bytes literal when assigning to an index. But you can't.

>>> x = bytearray(b'foo')
>>> x[0] = b'\x79'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'bytes' object cannot be interpreted as an integer

So I agree that it should be 0x79 instead of b'\x79'.
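For reference, here is a small sketch of the assignments that do work on a bytearray (names are illustrative only): a single index takes an integer, and a bytes value is accepted only through slice assignment.

```python
my_array = bytearray(b'hello')

# Index assignment takes an integer in range(256).
my_array[0] = 0x79
after_index = bytes(my_array)

# A bytes value is rejected for a single index.
rejected = False
try:
    my_array[0] = b'\x79'
except TypeError:
    rejected = True

# Slice assignment, by contrast, accepts a bytes value.
my_array[0:1] = b'j'
after_slice = bytes(my_array)

print(after_index, rejected, after_slice)
```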

bibajz commented 4 years ago

Hi Brett,

I am glad you got the gist of my mail - the part you wrote

But I think the point here is that you would expect to be able to assign indexes into a sequence using the same values that you'd get if you read them.

is exactly what I meant. :)

Cheers!