numba / numba

NumPy aware dynamic Python compiler using LLVM
https://numba.pydata.org/
BSD 2-Clause "Simplified" License

Segfault in jitted class when deleting and appending list elements #5687

Open aldenwalker opened 4 years ago

aldenwalker commented 4 years ago

The following code causes a segfault for me (on a plain AWS Ubuntu EC2 instance):

import numba
@numba.experimental.jitclass([('L', numba.types.List(numba.types.Set(numba.types.int64))),
                              ('D', numba.types.DictType(numba.types.int64, numba.types.int64))])
class ListnDict(object):
    def __init__(self):
        self.L       = [set([0])]
        del self.L[0]
        self.L.append(set([0]))
        self.D       = {0:0};
x = ListnDict()
x.D

Here is my Python session:

Python 3.6.9 (default, Apr 18 2020, 01:56:04) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numba
>>> @numba.experimental.jitclass([('L', numba.types.List(numba.types.Set(numba.types.int64))),
...                               ('D', numba.types.DictType(numba.types.int64, numba.types.int64))])
... class ListnDict(object):
...     def __init__(self):
...         self.L       = [set([0])]
...         del self.L[0]
...         self.L.append(set([0]))
...         self.D       = {0:0};
... 
>>> x = ListnDict()
>>> x.L
[{0}]
>>> x.D
Segmentation fault (core dumped)

I did not try all possible combinations of field types, but when I removed the field D or changed it to an int64, the segfault went away. The segfault also does not occur if I do the del in the Python session rather than in a jitted method:

>>> import numba
>>> @numba.experimental.jitclass([('L', numba.types.List(numba.types.Set(numba.types.int64))),
...                               ('D', numba.types.DictType(numba.types.int64, numba.types.int64))])
... class ListnDict(object):
...     def __init__(self):
...         self.L       = [set([0])]
...         self.D       = {0:0};
... 
>>> x = ListnDict()
>>> x.L
[{0}]
>>> x.D
DictType[int64,int64]({0: 0})
>>> del x.L[0]
>>> x.D
DictType[int64,int64]({0: 0})
>>> x.L.append([set([0])])
>>> x.L
[{0}]
>>> x.D
DictType[int64,int64]({0: 0})

I am not sure whether I should be passing numba.types.ListType to @jitclass instead of numba.types.List. When I do that, compilation fails on the line self.L = [set([0])]:

... Cannot cast list(set(int64)) to ListType[set(int64)] ...

I think the way to fix this compilation error is to create an empty typed list, like:

self.L = numba.typed.List.empty_list(numba.types.Set(numba.types.int64))

But this fails with the compilation error:

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
Unknown attribute 'Set' of type Module(<module 'numba.core.types' from '/usr/local/lib/python3.6/dist-packages/numba/core/types/__init__.py'>)

It is entirely plausible that I am handling the types incorrectly, but I would expect some error other than a segfault.
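Because a segfault kills the interpreter outright, it can help to run the reproducer in a child process and inspect how it exited. The following is a minimal sketch using only the standard library; `run_isolated` and `REPRO` are hypothetical names introduced here, not part of Numba or of the report, and the signal-decoding logic assumes a POSIX system.

```python
import signal
import subprocess
import sys

# The reproducer from the report, kept as a string so it only runs in the child.
REPRO = '''
import numba

@numba.experimental.jitclass([
    ('L', numba.types.List(numba.types.Set(numba.types.int64))),
    ('D', numba.types.DictType(numba.types.int64, numba.types.int64)),
])
class ListnDict(object):
    def __init__(self):
        self.L = [set([0])]
        del self.L[0]
        self.L.append(set([0]))
        self.D = {0: 0}

x = ListnDict()
print(x.D)
'''


def run_isolated(source, timeout=300):
    """Run `source` in a fresh interpreter and describe how it exited.

    Hypothetical helper: returns (description, captured stderr).
    """
    proc = subprocess.run(
        # -X faulthandler asks the child to dump a Python traceback on fatal signals.
        [sys.executable, "-X", "faulthandler", "-c", source],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code is the number of the signal
        # that terminated the child process.
        name = signal.Signals(-proc.returncode).name
        return "killed by " + name, proc.stderr
    return "exit code {}".format(proc.returncode), proc.stderr
```

Calling `run_isolated(REPRO)` on an affected installation would be expected to report a fatal signal instead of taking down the calling session, with whatever traceback faulthandler managed to write available in the second return value. That outcome is an assumption based on the report above, not something verified here.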

stuartarchibald commented 4 years ago

Thanks for the report. The types you are using correctly describe what you are trying to do. I think there is a reference-counting bug here:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7dfd017 in nrt_atomic_sub ()
(gdb) bt
#0  0x00007ffff7dfd017 in nrt_atomic_sub ()
#1  0x00007fffefcaeb6a in NRT_MemInfo_release ()

Even just this segfaults for me:

import numba

@numba.experimental.jitclass([('L', numba.types.List(numba.types.Set(numba.types.int64))),])
class ListnDict(object):
    def __init__(self):
        self.L       = [set([0])]
        del self.L[0] # <-- suspect refcount goes to 0
        self.L.append(set([0])) # <-- suspect invalid access to dead reference

x = ListnDict()
x.L

This example uses Numba's typed.List container, which fares better:

import numba

lty = numba.types.Set(numba.types.int64)

@numba.experimental.jitclass([('L', numba.types.ListType(lty)),
                              ('D', numba.types.DictType(numba.types.int64, numba.types.int64))])
class ListnDict(object):
    def __init__(self):
        self.L       = numba.typed.List.empty_list(lty)
        self.L.append(set([0]))
        del self.L[0]
        self.L.append(set([0]))
        self.D = {0: 0}

x = ListnDict()
x.D
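For reference, the same sequence of operations without the jitclass decorator runs in the plain interpreter as sketched below. It only illustrates the state a jitted version would be expected to reach; `PlainListnDict` is a hypothetical name used for this baseline and does not exercise Numba at all.

```python
class PlainListnDict(object):
    """Un-jitted stand-in (hypothetical name) mirroring the reported __init__."""

    def __init__(self):
        self.L = [set([0])]
        del self.L[0]            # the list is now empty
        self.L.append(set([0]))  # the list holds one fresh set again
        self.D = {0: 0}


x = PlainListnDict()
print(x.L)  # [{0}]
print(x.D)  # {0: 0}
```

Any difference between this baseline and the jitted class (a crash, or different contents of L or D) points at the compiled code path rather than at the logic of the method.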