MagicStack / immutables

A high-performance immutable mapping type for Python.
Other
1.14k stars 57 forks source link

Iteration bug #84

Closed molaxx closed 2 years ago

molaxx commented 2 years ago

Hi, I've stumbled upon a bug where I can't access all the entries in the immutables.Map via iteration. The following bash script should reproduce it. I tested it on: Python 3.9.5 (default, Nov 23 2021, 15:27:38) [GCC 9.3.0]

Python 3.9.9 (main, Nov 21 2021, 03:23:42) [Clang 13.0.0 (clang-1300.0.29.3)]

PYTHONHASHSEED=0 python <<EOF
import itertools
import immutables
import random
seed=b'b\xe6\xe2\x82\xe5\xc1e|'
r = random.Random(seed)
a = immutables.Map(
    zip(
        (r.randrange(0, 10000000000) for i in range(820000)),
        itertools.repeat(None, 820000),
    )
)

len1 = len(a)
len2 = len(tuple(a))
if len1 != len2:
    print(f"BADDDD seed:{seed} len(a)={len1} len(tuple(a))={len2}")

EOF

I'll be happy to get help debugging this.

Thanks you Eli

molaxx commented 2 years ago

From debugging this, it seems that the assumption that the assumed invariant: iter->i_level < 7 is wrong. while there could be at most 7 levels for a 32bit hash (65bit + 12bit), but collision nodes may create an indirection when iter->i_level == 6, thus create a new level.

I would expect an SegFault and not just premature iteration stop, but it seems that the assertion in the code (_map.c:2318) together with compiler optimization avoids the seg fault, but sets iter->i_level to 0. which stops the iteration.

I created a pull request fixing this: https://github.com/MagicStack/immutables/pull/85

1st1 commented 2 years ago

Very good catch and thanks for filing this. I'll take a look at the fix.