Hi there, I've come across an infinite loop inside ipykernel which I can reliably reproduce when using multiple concurrent.futures.ThreadPoolExecutors inside a notebook with ipykernel >= 6.29.0. This results in the execution of the relevant cell hanging forever.
This looks to have been introduced in https://github.com/ipython/ipykernel/pull/1186, which added the handling around self._thread_to_parent. I believe this will still be present on main.
Thread identifiers may be recycled when a thread exits and another thread is created.
Imagine a thread A with ident 1 having a child thread B with ident 2. Once those threads exit, there's nothing preventing a thread C with ident 2 having a child thread D with ident 1.
As the stdout and stderr streams are long-lived, you can therefore end up with a cycle in self._thread_to_parent, which causes this infinite loop. I've proved that this is the case with some hacked logging in the relevant places:
while identity in self._thread_to_parent:
+ import datetime
+ with open("debug.log", "a") as f:
+ f.write(f"{datetime.datetime.now()}: {identity} is in self._thread_to_parent. Value is {self._thread_to_parent[identity]}\n")
identity = self._thread_to_parent[identity]
and
else:
+ import datetime
+ is_stdout = stream is stdout
+ is_stderr = stream is stderr
+ with open("debug.log", "a") as f:
+ f.write(f"{datetime.datetime.now()}: Writing {self.ident}: {parent} into stream._thread_to_parent. Stream is {stream}. {is_stdout=} {is_stderr=} stream._thread_to_parent is {stream._thread_to_parent}\n")
stream._thread_to_parent[self.ident] = parent
This gives something like the below, where we end up with an infinite loop due to the cycle of [140278168012352, 140277658416704]:
The first link (stderr) ----------> 2024-10-02 13:34:26.043559: Writing 140277658416704: 140278168012352 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63be0>. is_stdout=True is_stderr=False stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
2024-10-02 13:34:26.043658: Writing 140278142834240: 140278168012352 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63a30>. is_stdout=False is_stderr=True stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
2024-10-02 13:34:26.043746: Writing 140278151226944: 140278168012352 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63be0>. is_stdout=True is_stderr=False stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
2024-10-02 13:34:26.043984: 140278159619648 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.044021: Writing 140277650024000: 140278168012352 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63be0>. is_stdout=True is_stderr=False stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
2024-10-02 13:34:26.044236: Writing 140278151226944: 140278168012352 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63a30>. is_stdout=False is_stderr=True stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
The first link (stderr) ----------> 2024-10-02 13:34:26.044457: Writing 140277658416704: 140278168012352 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63a30>. is_stdout=False is_stderr=True stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
2024-10-02 13:34:26.044587: Writing 140277650024000: 140278168012352 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63a30>. is_stdout=False is_stderr=True stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
2024-10-02 13:34:26.044749: 140278142834240 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.045033: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.045135: 140278142834240 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.045221: 140278151226944 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.045318: 140278159619648 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.045531: 140277650024000 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.045678: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.045846: 140277650024000 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.045929: 140278151226944 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.147930: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.148022: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.148076: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.148108: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.155890: 140278151226944 is in self._thread_to_parent. Value is 140278168012352
Completing the cycle (stdout) ----> 2024-10-02 13:34:26.155942: Writing 140278168012352: 140277658416704 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63be0>. is_stdout=True is_stderr=False stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
Completing the cycle (stderr) ----> 2024-10-02 13:34:26.156027: Writing 140278168012352: 140277658416704 into stream._thread_to_parent. Stream is <ipykernel.iostream.OutStream object at 0x7f9527f63a30>. is_stdout=False is_stderr=True stream._thread_to_parent is {140278159619648: 140278168012352, 140278151226944: 140278168012352, 140278142834240: 140278168012352, 140277658416704: 140278168012352, 140277650024000: 140278168012352}
2024-10-02 13:34:26.156131: 140278168012352 is in self._thread_to_parent. Value is 140277658416704
2024-10-02 13:34:26.156171: 140278168012352 is in self._thread_to_parent. Value is 140277658416704
2024-10-02 13:34:26.156212: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.156244: 140278168012352 is in self._thread_to_parent. Value is 140277658416704
2024-10-02 13:34:26.156281: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.156308: 140278168012352 is in self._thread_to_parent. Value is 140277658416704
2024-10-02 13:34:26.156334: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.156358: 140278168012352 is in self._thread_to_parent. Value is 140277658416704
2024-10-02 13:34:26.156392: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.156459: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.156501: 140278168012352 is in self._thread_to_parent. Value is 140277658416704
2024-10-02 13:34:26.156525: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
2024-10-02 13:34:26.156547: 140278168012352 is in self._thread_to_parent. Value is 140277658416704
2024-10-02 13:34:26.156570: 140277658416704 is in self._thread_to_parent. Value is 140278168012352
...infinite loop
Reproduction
Apologies, it's probably not the most minimal reproduction, but this cell will reliably reproduce this infinite loop. It might take a couple of minutes to do so, but you'll see the output stop when it happens (and if you use the above added debugging, will see it clearly start infinite-looping)
import time
import threading
from concurrent import futures
import datetime
def func(*args, **kwargs):
# Need this first layer of indirection to get into the `else` of https://github.com/ipython/ipykernel/blob/0fa543936d3f9ee69f6b9f8bd5a316692fde9aaa/ipykernel/ipkernel.py#L742
print(f"Launching with {threading.get_ident()}")
with futures.ThreadPoolExecutor(5) as executor:
print(list(executor.map(innermost_func, [1, 2, 3, 4, 5])))
def innermost_func(*args, **kwargs):
print(f"Hello from thread {threading.get_ident()}")
time.sleep(0.1)
count = 0
while True:
count += 1
print(f"{datetime.datetime.now()} Iteration number {count}")
threads = 1
print(f"Launching with {threading.get_ident()}")
with futures.ThreadPoolExecutor(threads) as executor:
print(list(executor.map(func, [1])))
Hi there, I've come across an infinite loop inside ipykernel which I can reliably reproduce when using multiple
concurrent.futures.ThreadPoolExecutor
s inside a notebook with ipykernel >= 6.29.0. This results in the execution of the relevant cell hanging forever.This looks to have been introduced in https://github.com/ipython/ipykernel/pull/1186, which added the handling around
self._thread_to_parent
. I believe this will still be present onmain
.The infinite loop occurs here: https://github.com/ipython/ipykernel/blob/0fa543936d3f9ee69f6b9f8bd5a316692fde9aaa/ipykernel/iostream.py#L538-L539
when there's a cycle in
self._thread_to_parent
.Origin of cycle
ipykernel
usesthreading.get_ident()
to grab a (temporarily!) unique identifier for athreading.Thread
thread when it runs, and stash it in aself._thread_to_parent
dict on stdout/stderr: https://github.com/ipython/ipykernel/blob/0fa543936d3f9ee69f6b9f8bd5a316692fde9aaa/ipykernel/ipkernel.py#L747The
threading.get_ident()
function statesImagine a thread
A
with ident1
having a child threadB
with ident2
. Once those threads exit, there's nothing preventing a threadC
with ident2
having a child threadD
with ident1
.As the stdout and stderr streams are long-lived, you can therefore end up with a cycle in
self._thread_to_parent
, which causes this infinite loop. I've proved that this is the case with some hacked logging in the relevant places:and
This gives something like the below, where we end up with an infinite loop due to the cycle of [140278168012352, 140277658416704]:
Reproduction
Apologies, it's probably not the most minimal reproduction, but this cell will reliably reproduce this infinite loop. It might take a couple of minutes to do so, but you'll see the output stop when it happens (and if you use the above added debugging, will see it clearly start infinite-looping)
My environment