python / cpython

The Python programming language
https://www.python.org
Other
63.47k stars 30.39k forks source link

Segmentation fault with tuple + gc.collect() #51715

Closed 78a6bf11-a01d-46d5-ba37-0e5218983f8f closed 14 years ago

78a6bf11-a01d-46d5-ba37-0e5218983f8f commented 14 years ago
BPO 7466
Nosy @pitrou, @skrah, @florentx
Files
  • issue7466.diff: Patch, apply to trunk and py3k
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = 'https://github.com/pitrou' closed_at = created_at = labels = ['interpreter-core', 'type-crash'] title = 'Segmentation fault with tuple + gc.collect()' updated_at = user = 'https://bugs.python.org/LambertDW' ``` bugs.python.org fields: ```python activity = actor = 'pitrou' assignee = 'pitrou' closed = True closed_date = closer = 'pitrou' components = ['Interpreter Core'] creation = creator = 'LambertDW' dependencies = [] files = ['15521'] hgrepos = [] issue_num = 7466 keywords = ['patch'] message_count = 11.0 messages = ['96190', '96191', '96197', '96199', '96207', '96208', '96209', '96210', '96262', '96263', '96302'] nosy_count = 4.0 nosy_names = ['pitrou', 'LambertDW', 'skrah', 'flox'] pr_nums = [] priority = 'high' resolution = 'fixed' stage = 'resolved' status = 'closed' superseder = None type = 'crash' url = 'https://bugs.python.org/issue7466' versions = ['Python 3.1', 'Python 2.7', 'Python 3.2'] ```

    78a6bf11-a01d-46d5-ba37-0e5218983f8f commented 14 years ago

    ''' This brute [possibly a] solution to

    http://projecteuler.net/index.php?section=problems&id=159
    
    causes segmentation fault.
        $ p3 # an AMD 64 bit build.
        Python 3.1.1 (r311:74480, Oct  2 2009, 12:29:57) 
        [GCC 4.3.3] on linux2
        Type "help", "copyright", "credits" or "license" for more 
    information.
        >>>
    The block between "load_primes" and "print(result)" can be written
    as a single statement using various comprehensions.  sum(max(...))
    The program does not seg-fault this way, but the result was wrong.
    I unrolled the code to fix my algorithm, disclosing the segmentation 

    fault. '''

    import functools
    import operator
    import itertools
    import array
    
    PrimeQ = Primes = 'use load_primes(n) function'
    
    def load_primes(n):
        global PrimeQ,Primes
        PrimeQ = sieve(1+n)
        Primes = array.array('L',(i for (i,Q,) in enumerate(PrimeQ) if Q))
    
    def sieve(n):
        a = array.array('b',(True,))*n
        a[0] = a[1] = False
        for (i,j) in enumerate(a):
            if j:
                for k in range(i**2,n,i):
                    a[k] = False
        return a
    
    def PrimeRange(a):
        '''
            see "load_primes"
        '''
        n = 1+int(a**(1/2))
        for p in Primes:
            if n < p:
                raise StopIteration
            yield p
    
    def PrimeFactor(a):
        '''
            see "load_primes"
            >>> load_primes(30)
            >>> print([PrimeFactor(x)for x in (6,7,)])
            [[2, 3], [7]]
        '''
        if (a < len(PrimeQ)) and PrimeQ[a]:
            return [a]
        for p in PrimeRange(a):
            (q,r,) = divmod(a,p)
            if not r:
                return [p]+PrimeFactor(q)
        return [a]
    def product(a):
        return functools.reduce(operator.mul,a,1)
    
    def digital_root(n):
        while 9 < n:
            n = sum(map(int,str(n)))
        return n
    
    def partition(L, chain=itertools.chain):
        '''
            python recipe by Ray Hettinger
        '''
        s = L
        n = len(s)
        first, middle, last = [0], range(1, n), [n]
        return [[L[a:b] for (a,b) in zip(chain(first, div), chain(div, 
    last))]
                for i in range(n) for div in itertools.combinations(middle, 
    i)]
    
    load_primes(1000)
    
    s = 0
    for n in range(2,10**6):
        factorizations = [
            [product(p)for p in group]for group in 
    partition(PrimeFactor(n))]
        mx = 0
        for factorization in factorizations:
            digital_roots = tuple(map(digital_root,factorization))
            sdr = sum(digital_roots)
            mx = max(mx,sdr)
        s += mx
    
    print('result!',s)
    78a6bf11-a01d-46d5-ba37-0e5218983f8f commented 14 years ago

    Further isolation, following change removes segmentation fault:

            digital_roots = tuple(map(digital_root,factorization))

    becomes

            digital_roots = [digital_root(factor) for factor in 
    factorization]
    5531d0d8-2a9c-46ba-8b8b-ef76132a492c commented 14 years ago

    Segfault confirmed on 64 bit Ubuntu, Python 3.2a0:

    Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7f5074dea6e0 (LWP 11665)] 0x000000000042111b in _PyTuple_Resize (pv=0x7fff7ce03b10, newsize=25) at Objects/tupleobject.c:853 853 _PyObject_GC_UNTRACK(v);

    eda57068-96ad-4b33-8431-9c528f59a6a6 commented 14 years ago

    Yes, confirmed too. Both 3.1 and 3.2 are broken.

    Minimal sequence to trigger the bug:

    """ ~ $ ./python Python 3.2a0 (py3k:76743M, Dec 10 2009, 12:19:41) [GCC 4.3.2] on linux2 Type "help", "copyright", "credits" or "license" for more information.

    """ def m(i): if i == 10: return next(i for i in '0')

    MAX_LOOP = 1<<10
    t = range(11)
    for i in range(MAX_LOOP):
      d = map(m, t)
      tup = tuple(d)

    ## python: Objects/tupleobject.c:853: _PyTuple_Resize: ## Assertion `g->gc.gc_refs != (-2)' failed.

    eda57068-96ad-4b33-8431-9c528f59a6a6 commented 14 years ago

    Fix is coming soon...

    ## Sample code:

    import gc
    tuple(gc.collect() for i in range(11))

    ## python: Objects/tupleobject.c:853: _PyTuple_Resize: ## Assertion `g->gc.gc_refs != (-2)' failed.

    And Trunk is affected

    pitrou commented 14 years ago

    An even simpler way to reproduce:

    >>> import gc
    >>> def g(it):
    ...   for x in it:
    ...     gc.collect()
    ...     yield x
    ... 
    >>> t = tuple(g(range(100)))
    Erreur de segmentation

    This is probably caused by the GC optimizations I introduced in trunk and 3.1.

    pitrou commented 14 years ago

    Ok, you've been faster than me, flox :-)

    eda57068-96ad-4b33-8431-9c528f59a6a6 commented 14 years ago

    Well... here comes the patch.

    pitrou commented 14 years ago

    IMO it would be better to call _PyObject_GC_IS_TRACKED() explicitly rather than rely on a subtle difference between the two APIs as in the current patch.

    eda57068-96ad-4b33-8431-9c528f59a6a6 commented 14 years ago

    Ok. It is here: http://bugs.python.org/file15520/issue7466.diff

    pitrou commented 14 years ago

    Committed, thanks!