python / cpython

The Python programming language
https://www.python.org
Other
62.14k stars 29.86k forks source link

del _limbo[self] KeyError #72095

Open 1e3445ee-1357-417d-a204-0863cf864f91 opened 8 years ago

1e3445ee-1357-417d-a204-0863cf864f91 commented 8 years ago
BPO 27908
Nosy @tim-one, @dimaqq, @rooterkyberian
Files
  • issue27908.patch
  • issue27908_2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['3.7', 'type-bug', 'library'] title = 'del _limbo[self] KeyError' updated_at = user = 'https://github.com/dimaqq' ``` bugs.python.org fields: ```python activity = actor = 'rooter' assignee = 'none' closed = False closed_date = None closer = None components = ['Library (Lib)'] creation = creator = 'Dima.Tisnek' dependencies = [] files = ['44643', '44677'] hgrepos = [] issue_num = 27908 keywords = ['patch'] message_count = 6.0 messages = ['274005', '276325', '276331', '276333', '276337', '276605'] nosy_count = 3.0 nosy_names = ['tim.peters', 'Dima.Tisnek', 'rooter'] pr_nums = [] priority = 'normal' resolution = None stage = None status = 'open' superseder = None type = 'behavior' url = 'https://bugs.python.org/issue27908' versions = ['Python 2.7', 'Python 3.5', 'Python 3.6', 'Python 3.7'] ```

    1e3445ee-1357-417d-a204-0863cf864f91 commented 8 years ago

    To reproduce:

    import threading
    import time
    
    class U(threading.Thread):
        def run(self):
            time.sleep(1)
            if not x.ident:
                x.start()
    
    x = U()
    for u in [U() for i in range(10000)]: u.start()
    
    time.sleep(10)

    Chance to reproduce ~20% in my setup. This script has a data race (check then act on x.ident). I expected it to occasionally hit RuntimeError: threads can only be started once

    Instead, I get:

    Unhandled exception in thread started by <bound method Thread._bootstrap of <U(Thread-1, started 139798116361984)>>
    Traceback (most recent call last):
      File "/usr/lib64/python3.5/threading.py", line 882, in _bootstrap
        self._bootstrap_inner()
      File "/usr/lib64/python3.5/threading.py", line 906, in _bootstrap_inner
        del _limbo[self]
    KeyError: <U(Thread-1, started 139798116361984)>
    51920402-d08c-4ba9-b868-f0bbff438904 commented 7 years ago

    Successfully reproduced on 2.7.12 and 3.5.2 . Currently there seems to be no protection against starting the same thread twice at the same time. What was checked was only if start operation already finished once.

    Attached patch makes it so limbo, our starting threads' waiting room, is checked first.

    1e3445ee-1357-417d-a204-0863cf864f91 commented 7 years ago

    @rooter, if you go this way, you should also self._started.set() with lock held, together with removing thread from _limbo

    51920402-d08c-4ba9-b868-f0bbff438904 commented 7 years ago

    @Dima.Tisnek, only reason for having both of these conditions together is so I won't have to repeat the same error message in the code at little price of the performance in this edge case (trying to start the same thread multiple times).

    Unless I'm missing something there should be no way how it would make this lock required for setting self._started.

    1e3445ee-1357-417d-a204-0863cf864f91 commented 7 years ago

    Your logic is accurate; _started is in fact separate from _limbo.

    As such taking a lock for "test-then-set" only would suffice.

    Now when you bring the other primitive under this lock in one place, it would look cleaner if it was also brought in the other.

    There's one more issue with proposed change:

    Before the change, if "already started" exception is ever raised for a given Thread object, then it's guaranteed that that object was started successfully.

    With the change, it is possible that exception is raised, and thread fails to start, leaving the object in initial state.

    If it were up to me, I would buy this limitation as price of safety. Someone may disagree.

    51920402-d08c-4ba9-b868-f0bbff438904 commented 7 years ago

    To address @Dima.Tisnek concern I have changed exception message in case thread start process is merely in progress.

    I kept self._started check under a lock so we can avoid more extreme race condition of one thread checking self._started right before another sets it and exits the limbo.

    As for testing self._started under a lock, but setting it without one. I'm avoiding it only because of performance reasons. The read is super cheap, while notification of .set() is more complex, so if aesthetics are only reasons for doing it there then I would advise against holding that lock while executing it.

    Of course I could also do a self in _active check under a lock, but that is slightly more costly, than self._started check and not any more useful.

    I may be prematurely optimizing things, but people tend to like they threads to startup ASAP.