quarto-dev / quarto-cli

Open-source scientific and technical publishing system built on Pandoc.
https://quarto.org

quarto render failing to close jupyter kernel in `pystan2` code? #3466

Open betanalpha opened 1 year ago

betanalpha commented 1 year ago

Bug description

When trying to render a quarto notebook with Python code from the command line I'm seeing behavior consistent with a persistent jupyter kernel. I would expect that each call to quarto render would run in a new, clean kernel.

For example consider the file test.qmd with the text

````
---
title: "Test"
author: "Test"
jupyter: python3
---

Test test test.

```{python}
import multiprocessing
multiprocessing.set_start_method("fork")
```
````

If I run

quarto render test.qmd --to html

from the command line once then everything renders fine, but I do see some `Python` threads that persist even after the command line prompt has returned.

If I run the same command again then quarto fails with the error

    Executing 'test.ipynb'
      Cell 1/1...ERROR:

    An error occurred while executing the following cell:

    import multiprocessing
    multiprocessing.set_start_method("fork")

    RuntimeError                              Traceback (most recent call last)
    Input In [3], in <cell line: 2>()
          1 import multiprocessing
    ----> 2 multiprocessing.set_start_method("fork")

    File /usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/context.py:243, in DefaultContext.set_start_method(self, method, force)
        241 def set_start_method(self, method, force=False):
        242     if self._actual_context is not None and not force:
    --> 243         raise RuntimeError('context has already been set')
        244     if method is None and force:
        245         self._actual_context = None

    RuntimeError: context has already been set
    RuntimeError: context has already been set

Force quitting those dangling Python threads does allow me to render again without problem.  

If I include a cell like

    print(locals())
    a = 2
    print(locals())

then I also see the `locals()` dictionary growing from call to call, even holding variables from previous renders that are no longer in the notebook, consistent with the kernel persisting from call to call.
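The `RuntimeError` in the traceback is what `multiprocessing.set_start_method` raises when a start method has already been chosen in the same interpreter, which is what happens when the cell runs again in a kernel that was kept alive. A minimal sketch of a cell that tolerates a reused kernel follows. The helper name `configure_start_method` is made up for illustration, and `"fork"` is assumed to be available (it is not on Windows).

```python
import multiprocessing


def configure_start_method(method: str = "fork") -> str:
    """Try to set the start method; keep the existing one if already set."""
    try:
        multiprocessing.set_start_method(method)
    except RuntimeError:
        # Raised as 'context has already been set' when this interpreter
        # (for example a kernel reused across renders) chose a method earlier.
        pass
    # Report whichever start method is actually in effect.
    return multiprocessing.get_start_method()


if __name__ == "__main__":
    first = configure_start_method("fork")
    second = configure_start_method("fork")  # would raise without the guard
    print(first, second)
```

This only hides the symptom: state left behind by an earlier render still lives in the reused kernel. The `force=True` argument visible in the traceback's signature is another way to get past the check.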

Operating System: MacOS 10.15.7 (19H2026)
Python Version:  Python 3.9.6 (default, Jun 29 2021, 06:20:32) [Clang 12.0.0 (clang-1200.0.32.29)] on darwin
Quarto Version: 1.1.251

Any help would be appreciated, and apologies if I'm missing something obvious.  Many thanks for a great project.

### Checklist

- [X] Please include a minimal, fully reproducible example in a single .qmd file? Please provide the whole file rather than the snippet you believe is causing the issue.
- [X] Please [format your issue](https://quarto.org/bug-reports.html#formatting-make-githubs-markdown-work-for-us) so it is easier for us to read the bug report.
- [X] Please document the RStudio IDE version you're running (if applicable), by providing the value displayed in the "About RStudio" main menu dialog?
- [X] Please document the operating system you're running. If on Linux, please provide the specific distribution.

dragonstyle commented 1 year ago

By default for Jupyter, we run a daemon for the kernel (to address startup performance). More about this, including how you can manage the daemon, is here:

https://quarto.org/docs/computations/python.html#kernel-daemon

HTH!

betanalpha commented 1 year ago

Trying to turn the daemon off or to give it a timeout, either from the command line or within the notebook itself, didn't work; instead 18 Python threads spawned and the rendering froze at the `Starting python3 kernel` step. On the other hand, `--execute-daemon-restart` did the trick and allowed rendering of multiple `--to` targets at once.

Thanks for the prompt and informative response!

cscheid commented 1 year ago

> On the other hand `--execute-daemon-restart` did the trick and allowed rendering of multiple `--to` targets at once.

Just to confirm, you're seeing that adding

execute:
  daemon: false

didn't work for you? If that's the case, this is a bug we should fix...

betanalpha commented 1 year ago

Note that I updated to quarto 1.2.269 just in case I might have been hitting an old issue.

If I reduce everything to simple examples, for instance

````
---
title: "Test"
author: "Test"
jupyter: python3
execute:
  daemon: false
---

Hello.

```{python}
print("Hello")
```
````

or

````
---
title: "Test"
author: "Test"
jupyter: python3
execute:
  daemon: false
---

Hello.

```{python}
import multiprocessing
multiprocessing.set_start_method("fork")
```
````

then the daemons appear to behave correctly. Removing the `execute` configuration from the YAML header but adding `--no-execute-daemon` to the command line call also seems to work correctly.
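For scripted comparisons of the daemon settings discussed in this thread, a small helper can assemble the command line. This is only a sketch: the function name `quarto_render_argv` and its `daemon` parameter are hypothetical, while the flags `--to`, `--no-execute-daemon`, and `--execute-daemon-restart` are the ones quoted above.

```python
from typing import Dict, List, Sequence

# Map a (hypothetical) mode name to the quarto flag mentioned in this thread.
_DAEMON_FLAGS: Dict[str, List[str]] = {
    "default": [],                           # reuse the kernel daemon
    "off": ["--no-execute-daemon"],          # no daemon for this render
    "restart": ["--execute-daemon-restart"], # restart the daemon first
}


def quarto_render_argv(path: str, formats: Sequence[str],
                       daemon: str = "default") -> List[str]:
    """Build the argument list for a `quarto render` call."""
    if daemon not in _DAEMON_FLAGS:
        raise ValueError(f"unknown daemon mode: {daemon!r}")
    return ["quarto", "render", path, "--to", ",".join(formats)] + _DAEMON_FLAGS[daemon]


if __name__ == "__main__":
    for mode in ("default", "off", "restart"):
        print(" ".join(quarto_render_argv("test.qmd", ["html"], daemon=mode)))
```

The resulting list can be passed to `subprocess.run` as is; the helper itself never invokes quarto.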

Unfortunately the more complicated notebook of my interest, https://github.com/betanalpha/mcmc_diagnostics/blob/main/pystan2/mcmc_diagnostics_pystan2.qmd, exhibits the failure mode I mentioned above.  When executing `quarto render mcmc_diagnostics_pystan2.qmd --to html,pdf --no-execute-daemon` the output freezes at 

Starting python3 kernel...


while about 18 Python threads are spun up and sit idle.  I tried using the same YAML header with the bodies of the simpler test notebooks and everything worked fine; I haven't been able to figure out useful elements of the body to isolate given that the execution fails before any code blocks are evaluated.

The command `quarto render mcmc_diagnostics_pystan2.qmd --to html,pdf --execute-daemon-restart` does work, which is sufficient for my immediate needs, and I'm happy to assume the awkward behavior is caused by problems with my code. I am happy, however, to try anything that might help isolate it if there is interest, especially given the nontrivial dependencies in the notebook.

Thanks again, and apologies for any confusion.

cscheid commented 1 year ago

Thank you for the added context! I'm probably only going to be able to get to this issue in a few weeks, but I'll follow up with you when I do (it might have to do with details of how pystan spawns subprocesses.)

cscheid commented 1 year ago

@betanalpha we recently made a (small) change to how we spawn subprocesses in the jupyter engine, and there's a small chance it fixes this issue. Would you mind trying again on https://github.com/quarto-dev/quarto-cli/releases/tag/v1.3.191 or later?

betanalpha commented 1 year ago

Apologies for the delay!

I downloaded v1.3.191 and gave it a try. The results are a bit different, but now it seems like neither approach works.

Running `quarto render mcmc_diagnostics_pystan2.qmd --to html,pdf` without `execute: daemon: false` in the YAML renders the HTML but then fails with `RuntimeError: context has already been set` when trying to render the PDF, because the daemon in which the multiprocessing environment has already been configured is reused.

Previously, running `quarto render mcmc_diagnostics_pystan2.qmd --to html,pdf --no-execute-daemon` without `execute: daemon: false` worked as expected, but running `quarto render mcmc_diagnostics_pystan2.qmd --to html,pdf` with `execute: daemon: false` froze after a `Starting python3 kernel...` message while many Python processes spawned in the background.

Now running `quarto render mcmc_diagnostics_pystan2.qmd --to html,pdf --no-execute-daemon` without `execute: daemon: false` and running `quarto render mcmc_diagnostics_pystan2.qmd --to html,pdf` with `execute: daemon: false` give consistent results, but unfortunately not the expected ones. Both freeze after the `Starting python3 kernel...` message, and this time no Python processes seem to spawn in either case.
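When comparing runs like these, it can help to turn a freeze into an explicit result instead of waiting at the terminal. The sketch below wraps a command in a time limit using the standard library's `subprocess.run`; the helper name `run_with_timeout` and the limits used are illustrative assumptions, not part of quarto.

```python
import subprocess
import sys
from typing import Optional, Sequence


def run_with_timeout(argv: Sequence[str], timeout_s: float) -> Optional[int]:
    """Run argv; return its exit code, or None if it exceeded timeout_s."""
    try:
        completed = subprocess.run(list(argv), timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before raising; processes
        # that the child spawned itself (such as kernels) may still linger.
        return None
    return completed.returncode


if __name__ == "__main__":
    # Stand-in for a frozen render: a child that sleeps past the limit.
    sleeper = [sys.executable, "-c", "import time; time.sleep(5)"]
    result = run_with_timeout(sleeper, timeout_s=0.5)
    print("timed out" if result is None else f"exit code {result}")
```

For the case described here, the argument list would be `["quarto", "render", "mcmc_diagnostics_pystan2.qmd", "--to", "html,pdf", "--no-execute-daemon"]` with a limit long enough for a normal render.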

Happy to try on a more recent tag if that might help.

Thanks!

cscheid commented 1 year ago

I'm sorry this still isn't working. Do you mind trying something a bit different? I want to test whether there's something strange happening with nbclient, the Python library we use to execute jupyter notebooks. Two steps:

  1. `quarto convert mcmc_diagnostics_pystan2.qmd`. This should produce `mcmc_diagnostics_pystan2.ipynb`.
  2. `./nbclient_test.py mcmc_diagnostics_pystan2.ipynb out.ipynb`, where `nbclient_test.py` is the following Python script:
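#!/usr/bin/env python3
# Execute a notebook cell by cell with nbclient, outside of quarto.
# Usage: ./nbclient_test.py input.ipynb output.ipynb

import atexit
import sys

import nbformat
from nbclient import NotebookClient

notebook_filename = sys.argv[1]
nb = nbformat.read(notebook_filename, as_version=4)
client = NotebookClient(nb, timeout=600, kernel_name='python3')

# Start a fresh kernel and make sure it is shut down when the script exits.
client.create_kernel_manager()
client.start_new_kernel()
client.start_new_kernel_client()
atexit.register(client._cleanup_kernel)

# Run each code cell in order, storing the executed cell back in the notebook.
for index, cell in enumerate(client.nb.cells):
    if cell.cell_type == 'code':
        client.nb.cells[index] = client.execute_cell(cell=cell, cell_index=index)

# Write the executed notebook to the output path.
nbformat.write(client.nb, sys.argv[2])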

I'm hoping to isolate the behavior of nbclient from the rest of quarto.

Thanks for the continued patience!

cscheid commented 1 year ago

Progress -

I managed to reproduce part of this issue on a different machine. I also need to run `quarto render --no-execute-daemon` to avoid the `context has already been set` error you're seeing.

But the wrinkle here is that I see (eventually) a Python crash rather than a hang. Here's the stack trace for the threads, as reported by macOS:

Process:               Python [9488]
Path:                  /Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.8/Resources/Python.app/Contents/MacOS/Python
Identifier:            Python
Version:               3.8.9 (3.8.9)
Build Info:            python3-103000000000000~1538
Code Type:             X86-64 (Native)
Parent Process:        Python [9483]
Responsible:           iTerm2 [517]
User ID:               501

Date/Time:             2023-03-02 11:17:25.090 -0700
OS Version:            macOS 11.7.1 (20G918)
Report Version:        12
Bridge OS Version:     7.0 (20P411)
Anonymous UUID:        00BE91D2-23D1-4936-8626-46A46AEB60E7

Sleep/Wake UUID:       6B76A35A-DB26-4177-A99B-2E2EA9279917

Time Awake Since Boot: 230000 seconds
Time Since Wake:       140000 seconds

System Integrity Protection: enabled

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_ACCESS (SIGBUS)
Exception Codes:       KERN_PROTECTION_FAILURE at 0x0000000125d29c30
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Bus error: 10
Termination Reason:    Namespace SIGNAL, Code 0xa
Terminating Process:   exc handler [9488]

VM Regions Near 0x125d29c30:
    __LINKEDIT                  125ce1000-125ce4000    [   12K] rw-/rwx SM=NUL  /Users/*/*.so
--> __TEXT                      125ce4000-125d34000    [  320K] r-x/rwx SM=COW  /Users/*/*.so
    __DATA                      125d34000-125d38000    [   16K] rw-/rwx SM=COW  /Users/*/*.so

Application Specific Information:
/var/folders/nc/2119qy0d3cgb18yy6cf4jtc80000gn/T/tmp1xa6f7pr/stanfit4anon_model_93cbb3ef9603e6c8a2b4af9368d57faa_5237658175436678736.cpython-38-darwin.so

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   stanfit4anon_model_93cbb3ef9603e6c8a2b4af9368d57faa_5237658175436678736.cpython-38-darwin.so    0x0000000127a37bd7 long double boost::math::detail::bessel_j0<long double>(long double) + 1239
1   stanfit4anon_model_93cbb3ef9603e6c8a2b4af9368d57faa_5237658175436678736.cpython-38-darwin.so    0x0000000127a38d03 long double boost::math::detail::bessel_y0<long double, boost::math::policies::policy<boost::math::policies::promote_float<false>, boost::math::policies::promote_double<false>, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy> >(long double, boost::math::policies::policy<boost::math::policies::promote_float<false>, boost::math::policies::promote_double<false>, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy> const&) + 739
2   stanfit4anon_model_93cbb3ef9603e6c8a2b4af9368d57faa_5237658175436678736.cpython-38-darwin.so    0x0000000127a3d9e3 __cxx_global_var_init.35 + 35
3   dyld                              0x0000000114252b47 ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 535
4   dyld                              0x0000000114252f52 ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40
5   dyld                              0x000000011424dae6 ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 492
6   dyld                              0x000000011424b89f ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 191
7   dyld                              0x000000011424b940 ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 82
8   dyld                              0x000000011423ba12 dyld::runInitializers(ImageLoader*) + 82
9   dyld                              0x000000011424711a dlopen_internal + 616
10  libdyld.dylib                     0x00007fff20969c94 dlopen_internal(char const*, int, void*) + 185
11  libdyld.dylib                     0x00007fff2095807e dlopen + 28
12  com.apple.python3                 0x000000010ad04fbd _PyImport_FindSharedFuncptr + 317
13  com.apple.python3                 0x000000010acd002a _PyImport_LoadDynamicModuleWithSpec + 570
14  com.apple.python3                 0x000000010accf8f7 _imp_create_dynamic + 343
15  com.apple.python3                 0x000000010ac01552 cfunction_vectorcall_FASTCALL + 178
16  com.apple.python3                 0x000000010abc0d64 PyVectorcall_Call + 100
17  com.apple.python3                 0x000000010aca0a43 _PyEval_EvalFrameDefault + 30419
18  com.apple.python3                 0x000000010aca48eb _PyEval_EvalCodeWithName + 3163
19  com.apple.python3                 0x000000010abc150b _PyFunction_Vectorcall + 235
20  com.apple.python3                 0x000000010aca38d4 call_function + 356
21  com.apple.python3                 0x000000010aca0542 _PyEval_EvalFrameDefault + 29138
22  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
23  com.apple.python3                 0x000000010aca38d4 call_function + 356
24  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
25  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
26  com.apple.python3                 0x000000010aca38d4 call_function + 356
27  com.apple.python3                 0x000000010aca05eb _PyEval_EvalFrameDefault + 29307
28  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
29  com.apple.python3                 0x000000010aca38d4 call_function + 356
30  com.apple.python3                 0x000000010aca05eb _PyEval_EvalFrameDefault + 29307
31  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
32  com.apple.python3                 0x000000010aca38d4 call_function + 356
33  com.apple.python3                 0x000000010aca05eb _PyEval_EvalFrameDefault + 29307
34  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
35  com.apple.python3                 0x000000010abc2d5a object_vacall + 346
36  com.apple.python3                 0x000000010abc2f83 _PyObject_CallMethodIdObjArgs + 227
37  com.apple.python3                 0x000000010acce783 PyImport_ImportModuleLevelObject + 1795
38  com.apple.python3                 0x000000010ac95749 builtin___import__ + 137
39  com.apple.python3                 0x000000010abc0fcb cfunction_call_varargs + 123
40  com.apple.python3                 0x000000010abc0a49 _PyObject_MakeTpCall + 377
41  com.apple.python3                 0x000000010aca3a00 call_function + 656
42  com.apple.python3                 0x000000010aca05eb _PyEval_EvalFrameDefault + 29307
43  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
44  com.apple.python3                 0x000000010aca38d4 call_function + 356
45  com.apple.python3                 0x000000010aca05eb _PyEval_EvalFrameDefault + 29307
46  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
47  com.apple.python3                 0x000000010abc3e1f method_vectorcall + 463
48  com.apple.python3                 0x000000010abc2d5a object_vacall + 346
49  com.apple.python3                 0x000000010abc3098 PyObject_CallFunctionObjArgs + 152
50  _pickle.cpython-38-darwin.so      0x000000010b69949b load + 3691
51  _pickle.cpython-38-darwin.so      0x000000010b6903cf _pickle_load + 671
52  com.apple.python3                 0x000000010ac01623 cfunction_vectorcall_FASTCALL_KEYWORDS + 131
53  com.apple.python3                 0x000000010aca38d4 call_function + 356
54  com.apple.python3                 0x000000010aca0542 _PyEval_EvalFrameDefault + 29138
55  com.apple.python3                 0x000000010aca48eb _PyEval_EvalCodeWithName + 3163
56  com.apple.python3                 0x000000010abc150b _PyFunction_Vectorcall + 235
57  com.apple.python3                 0x000000010aca38d4 call_function + 356
58  com.apple.python3                 0x000000010aca05eb _PyEval_EvalFrameDefault + 29307
59  com.apple.python3                 0x000000010aca48eb _PyEval_EvalCodeWithName + 3163
60  com.apple.python3                 0x000000010ac992b3 PyEval_EvalCode + 51
61  com.apple.python3                 0x000000010ac9667b builtin_exec + 619
62  com.apple.python3                 0x000000010ac01552 cfunction_vectorcall_FASTCALL + 178
63  com.apple.python3                 0x000000010aca38d4 call_function + 356
64  com.apple.python3                 0x000000010aca05eb _PyEval_EvalFrameDefault + 29307
65  com.apple.python3                 0x000000010abd1ea4 gen_send_ex + 244
66  com.apple.python3                 0x000000010ac9be88 _PyEval_EvalFrameDefault + 11032
67  com.apple.python3                 0x000000010abd1ea4 gen_send_ex + 244
68  com.apple.python3                 0x000000010ac9be88 _PyEval_EvalFrameDefault + 11032
69  com.apple.python3                 0x000000010abd1ea4 gen_send_ex + 244
70  com.apple.python3                 0x000000010abca795 method_vectorcall_O + 245
71  com.apple.python3                 0x000000010aca38d4 call_function + 356
72  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
73  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
74  com.apple.python3                 0x000000010aca38d4 call_function + 356
75  com.apple.python3                 0x000000010aca05eb _PyEval_EvalFrameDefault + 29307
76  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
77  com.apple.python3                 0x000000010aca38d4 call_function + 356
78  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
79  com.apple.python3                 0x000000010aca48eb _PyEval_EvalCodeWithName + 3163
80  com.apple.python3                 0x000000010abc150b _PyFunction_Vectorcall + 235
81  com.apple.python3                 0x000000010abc3e1f method_vectorcall + 463
82  com.apple.python3                 0x000000010abc0d64 PyVectorcall_Call + 100
83  com.apple.python3                 0x000000010aca0864 _PyEval_EvalFrameDefault + 29940
84  com.apple.python3                 0x000000010aca48eb _PyEval_EvalCodeWithName + 3163
85  com.apple.python3                 0x000000010abc150b _PyFunction_Vectorcall + 235
86  com.apple.python3                 0x000000010abc3cfc method_vectorcall + 172
87  com.apple.python3                 0x000000010aca38d4 call_function + 356
88  com.apple.python3                 0x000000010aca06a2 _PyEval_EvalFrameDefault + 29490
89  com.apple.python3                 0x000000010abd1ea4 gen_send_ex + 244
90  com.apple.python3                 0x000000010ac9be88 _PyEval_EvalFrameDefault + 11032
91  com.apple.python3                 0x000000010abd1ea4 gen_send_ex + 244
92  com.apple.python3                 0x000000010ac9be88 _PyEval_EvalFrameDefault + 11032
93  com.apple.python3                 0x000000010abd1ea4 gen_send_ex + 244
94  com.apple.python3                 0x000000010ac9be88 _PyEval_EvalFrameDefault + 11032
95  com.apple.python3                 0x000000010abd1ea4 gen_send_ex + 244
96  com.apple.python3                 0x000000010ac9be88 _PyEval_EvalFrameDefault + 11032
97  com.apple.python3                 0x000000010abd1ea4 gen_send_ex + 244
98  _asyncio.cpython-38-darwin.so     0x000000010b980275 task_step + 677
99  _asyncio.cpython-38-darwin.so     0x000000010b9810d2 TaskWakeupMethWrapper_call + 434
100 com.apple.python3                 0x000000010abc0a49 _PyObject_MakeTpCall + 377
101 com.apple.python3                 0x000000010acbd703 context_run + 243
102 com.apple.python3                 0x000000010ac01623 cfunction_vectorcall_FASTCALL_KEYWORDS + 131
103 com.apple.python3                 0x000000010abc0d64 PyVectorcall_Call + 100
104 com.apple.python3                 0x000000010aca0a43 _PyEval_EvalFrameDefault + 30419
105 com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
106 com.apple.python3                 0x000000010aca38d4 call_function + 356
107 com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
108 com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
109 com.apple.python3                 0x000000010aca38d4 call_function + 356
110 com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
111 com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
112 com.apple.python3                 0x000000010aca38d4 call_function + 356
113 com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
114 com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
115 com.apple.python3                 0x000000010aca38d4 call_function + 356
116 com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
117 com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
118 com.apple.python3                 0x000000010aca38d4 call_function + 356
119 com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
120 com.apple.python3                 0x000000010aca48eb _PyEval_EvalCodeWithName + 3163
121 com.apple.python3                 0x000000010abc150b _PyFunction_Vectorcall + 235
122 com.apple.python3                 0x000000010abc3cfc method_vectorcall + 172
123 com.apple.python3                 0x000000010aca38d4 call_function + 356
124 com.apple.python3                 0x000000010aca0542 _PyEval_EvalFrameDefault + 29138
125 com.apple.python3                 0x000000010aca48eb _PyEval_EvalCodeWithName + 3163
126 com.apple.python3                 0x000000010ac992b3 PyEval_EvalCode + 51
127 com.apple.python3                 0x000000010ac9667b builtin_exec + 619
128 com.apple.python3                 0x000000010ac01552 cfunction_vectorcall_FASTCALL + 178
129 com.apple.python3                 0x000000010aca38d4 call_function + 356
130 com.apple.python3                 0x000000010aca05eb _PyEval_EvalFrameDefault + 29307
131 com.apple.python3                 0x000000010aca48eb _PyEval_EvalCodeWithName + 3163
132 com.apple.python3                 0x000000010abc150b _PyFunction_Vectorcall + 235
133 com.apple.python3                 0x000000010aca38d4 call_function + 356
134 com.apple.python3                 0x000000010aca05eb _PyEval_EvalFrameDefault + 29307
135 com.apple.python3                 0x000000010aca48eb _PyEval_EvalCodeWithName + 3163
136 com.apple.python3                 0x000000010abc150b _PyFunction_Vectorcall + 235
137 com.apple.python3                 0x000000010abc0d64 PyVectorcall_Call + 100
138 com.apple.python3                 0x000000010ad09100 pymain_run_module + 192
139 com.apple.python3                 0x000000010ad088da Py_RunMain + 1466
140 com.apple.python3                 0x000000010ad08fcf pymain_main + 335
141 com.apple.python3                 0x000000010ad0902b Py_BytesMain + 43
142 libdyld.dylib                     0x00007fff20967f3d start + 1

Thread 1:: ZMQbg/Reaper
0   libsystem_kernel.dylib            0x00007fff2091bc3a kevent + 10
1   libzmq.5.dylib                    0x000000010b3958f6 zmq::kqueue_t::loop() + 278
2   libzmq.5.dylib                    0x000000010b3c3a59 zmq::worker_poller_base_t::worker_routine(void*) + 25
3   libzmq.5.dylib                    0x000000010b40a54c thread_routine(void*) + 300
4   libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
5   libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 2:: ZMQbg/IO/0
0   libsystem_kernel.dylib            0x00007fff2091bc3a kevent + 10
1   libzmq.5.dylib                    0x000000010b3958f6 zmq::kqueue_t::loop() + 278
2   libzmq.5.dylib                    0x000000010b3c3a59 zmq::worker_poller_base_t::worker_routine(void*) + 25
3   libzmq.5.dylib                    0x000000010b40a54c thread_routine(void*) + 300
4   libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
5   libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 3:
0   libsystem_kernel.dylib            0x00007fff2091bc3a kevent + 10
1   select.cpython-38-darwin.so       0x000000010b2933a6 select_kqueue_control + 918
2   com.apple.python3                 0x000000010abca30c method_vectorcall_FASTCALL + 252
3   com.apple.python3                 0x000000010aca38d4 call_function + 356
4   com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
5   com.apple.python3                 0x000000010aca48eb _PyEval_EvalCodeWithName + 3163
6   com.apple.python3                 0x000000010abc150b _PyFunction_Vectorcall + 235
7   com.apple.python3                 0x000000010aca38d4 call_function + 356
8   com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
9   com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
10  com.apple.python3                 0x000000010aca38d4 call_function + 356
11  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
12  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
13  com.apple.python3                 0x000000010aca38d4 call_function + 356
14  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
15  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
16  com.apple.python3                 0x000000010aca38d4 call_function + 356
17  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
18  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
19  com.apple.python3                 0x000000010abc3d93 method_vectorcall + 323
20  com.apple.python3                 0x000000010abc0d64 PyVectorcall_Call + 100
21  com.apple.python3                 0x000000010aca0864 _PyEval_EvalFrameDefault + 29940
22  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
23  com.apple.python3                 0x000000010aca38d4 call_function + 356
24  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
25  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
26  com.apple.python3                 0x000000010aca38d4 call_function + 356
27  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
28  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
29  com.apple.python3                 0x000000010abc3d93 method_vectorcall + 323
30  com.apple.python3                 0x000000010abc0d64 PyVectorcall_Call + 100
31  com.apple.python3                 0x000000010ad45d5b t_bootstrap + 75
32  com.apple.python3                 0x000000010acf6b59 pythread_wrapper + 25
33  libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
34  libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 4:
0   libsystem_kernel.dylib            0x00007fff2091d9ba poll + 10
1   libzmq.5.dylib                    0x000000010b430a63 zmq_poll + 835
2   libzmq.5.dylib                    0x000000010b3c5e3f zmq::proxy(zmq::socket_base_t*, zmq::socket_base_t*, zmq::socket_base_t*, zmq::socket_base_t*) + 383
3   libzmq.5.dylib                    0x000000010b431ca6 zmq_proxy + 86
4   _device.cpython-38-darwin.so      0x000000010b346644 __pyx_pw_3zmq_7backend_6cython_7_device_3proxy + 356
5   _device.cpython-38-darwin.so      0x000000010b346066 __Pyx_PyObject_Call + 86
6   _device.cpython-38-darwin.so      0x000000010b345424 __pyx_pw_3zmq_7backend_6cython_7_device_1device + 420
7   com.apple.python3                 0x000000010abc0a49 _PyObject_MakeTpCall + 377
8   com.apple.python3                 0x000000010aca3a00 call_function + 656
9   com.apple.python3                 0x000000010aca0542 _PyEval_EvalFrameDefault + 29138
10  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
11  com.apple.python3                 0x000000010aca38d4 call_function + 356
12  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
13  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
14  com.apple.python3                 0x000000010aca38d4 call_function + 356
15  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
16  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
17  com.apple.python3                 0x000000010abc3d93 method_vectorcall + 323
18  com.apple.python3                 0x000000010abc0d64 PyVectorcall_Call + 100
19  com.apple.python3                 0x000000010ad45d5b t_bootstrap + 75
20  com.apple.python3                 0x000000010acf6b59 pythread_wrapper + 25
21  libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
22  libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 5:: ZMQbg/Reaper
0   libsystem_kernel.dylib            0x00007fff2091bc3a kevent + 10
1   libzmq.5.dylib                    0x000000010b3958f6 zmq::kqueue_t::loop() + 278
2   libzmq.5.dylib                    0x000000010b3c3a59 zmq::worker_poller_base_t::worker_routine(void*) + 25
3   libzmq.5.dylib                    0x000000010b40a54c thread_routine(void*) + 300
4   libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
5   libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 6:: ZMQbg/IO/0
0   libsystem_kernel.dylib            0x00007fff2091bc3a kevent + 10
1   libzmq.5.dylib                    0x000000010b3958f6 zmq::kqueue_t::loop() + 278
2   libzmq.5.dylib                    0x000000010b3c3a59 zmq::worker_poller_base_t::worker_routine(void*) + 25
3   libzmq.5.dylib                    0x000000010b40a54c thread_routine(void*) + 300
4   libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
5   libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 7:
0   libsystem_kernel.dylib            0x00007fff20917cbe read + 10
1   com.apple.python3                 0x000000010ad046a2 _Py_read + 82
2   com.apple.python3                 0x000000010ad15a25 os_read + 309
3   com.apple.python3                 0x000000010ac01552 cfunction_vectorcall_FASTCALL + 178
4   com.apple.python3                 0x000000010aca38d4 call_function + 356
5   com.apple.python3                 0x000000010aca0542 _PyEval_EvalFrameDefault + 29138
6   com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
7   com.apple.python3                 0x000000010abc3d93 method_vectorcall + 323
8   com.apple.python3                 0x000000010abc0d64 PyVectorcall_Call + 100
9   com.apple.python3                 0x000000010aca0864 _PyEval_EvalFrameDefault + 29940
10  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
11  com.apple.python3                 0x000000010aca38d4 call_function + 356
12  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
13  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
14  com.apple.python3                 0x000000010aca38d4 call_function + 356
15  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
16  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
17  com.apple.python3                 0x000000010abc3d93 method_vectorcall + 323
18  com.apple.python3                 0x000000010abc0d64 PyVectorcall_Call + 100
19  com.apple.python3                 0x000000010ad45d5b t_bootstrap + 75
20  com.apple.python3                 0x000000010acf6b59 pythread_wrapper + 25
21  libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
22  libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 8:
0   libsystem_kernel.dylib            0x00007fff20917cbe read + 10
1   com.apple.python3                 0x000000010ad046a2 _Py_read + 82
2   com.apple.python3                 0x000000010ad15a25 os_read + 309
3   com.apple.python3                 0x000000010ac01552 cfunction_vectorcall_FASTCALL + 178
4   com.apple.python3                 0x000000010aca38d4 call_function + 356
5   com.apple.python3                 0x000000010aca0542 _PyEval_EvalFrameDefault + 29138
6   com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
7   com.apple.python3                 0x000000010abc3d93 method_vectorcall + 323
8   com.apple.python3                 0x000000010abc0d64 PyVectorcall_Call + 100
9   com.apple.python3                 0x000000010aca0864 _PyEval_EvalFrameDefault + 29940
10  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
11  com.apple.python3                 0x000000010aca38d4 call_function + 356
12  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
13  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
14  com.apple.python3                 0x000000010aca38d4 call_function + 356
15  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
16  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
17  com.apple.python3                 0x000000010abc3d93 method_vectorcall + 323
18  com.apple.python3                 0x000000010abc0d64 PyVectorcall_Call + 100
19  com.apple.python3                 0x000000010ad45d5b t_bootstrap + 75
20  com.apple.python3                 0x000000010acf6b59 pythread_wrapper + 25
21  libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
22  libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 9:
0   libsystem_kernel.dylib            0x00007fff2091bc3a kevent + 10
1   select.cpython-38-darwin.so       0x000000010b29343d select_kqueue_control + 1069
2   com.apple.python3                 0x000000010abca30c method_vectorcall_FASTCALL + 252
3   com.apple.python3                 0x000000010aca38d4 call_function + 356
4   com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
5   com.apple.python3                 0x000000010aca48eb _PyEval_EvalCodeWithName + 3163
6   com.apple.python3                 0x000000010abc150b _PyFunction_Vectorcall + 235
7   com.apple.python3                 0x000000010aca38d4 call_function + 356
8   com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
9   com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
10  com.apple.python3                 0x000000010aca38d4 call_function + 356
11  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
12  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
13  com.apple.python3                 0x000000010aca38d4 call_function + 356
14  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
15  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
16  com.apple.python3                 0x000000010aca38d4 call_function + 356
17  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
18  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
19  com.apple.python3                 0x000000010aca38d4 call_function + 356
20  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
21  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
22  com.apple.python3                 0x000000010aca38d4 call_function + 356
23  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
24  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
25  com.apple.python3                 0x000000010abc3d93 method_vectorcall + 323
26  com.apple.python3                 0x000000010abc0d64 PyVectorcall_Call + 100
27  com.apple.python3                 0x000000010ad45d5b t_bootstrap + 75
28  com.apple.python3                 0x000000010acf6b59 pythread_wrapper + 25
29  libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
30  libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 10:
0   libsystem_pthread.dylib           0x00007fff20948420 start_wqthread + 0

Thread 11:
0   libsystem_kernel.dylib            0x00007fff2091f646 __select + 10
1   com.apple.python3                 0x000000010ad4362c time_sleep + 124
2   com.apple.python3                 0x000000010ac01877 cfunction_vectorcall_O + 215
3   com.apple.python3                 0x000000010aca38d4 call_function + 356
4   com.apple.python3                 0x000000010aca0542 _PyEval_EvalFrameDefault + 29138
5   com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
6   com.apple.python3                 0x000000010aca38d4 call_function + 356
7   com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
8   com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
9   com.apple.python3                 0x000000010aca38d4 call_function + 356
10  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
11  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
12  com.apple.python3                 0x000000010abc3d93 method_vectorcall + 323
13  com.apple.python3                 0x000000010abc0d64 PyVectorcall_Call + 100
14  com.apple.python3                 0x000000010ad45d5b t_bootstrap + 75
15  com.apple.python3                 0x000000010acf6b59 pythread_wrapper + 25
16  libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
17  libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 12:
0   libsystem_kernel.dylib            0x00007fff2091d9ba poll + 10
1   libzmq.5.dylib                    0x000000010b3e08f7 zmq::signaler_t::wait(int) const + 103
2   libzmq.5.dylib                    0x000000010b3995dd zmq::mailbox_t::recv(zmq::command_t*, int) + 109
3   libzmq.5.dylib                    0x000000010b3e41aa zmq::socket_base_t::process_commands(int, bool) + 218
4   libzmq.5.dylib                    0x000000010b3e8eed zmq::socket_base_t::recv(zmq::msg_t*, int) + 941
5   libzmq.5.dylib                    0x000000010b42fea3 s_recvmsg(zmq::socket_base_t*, zmq_msg_t*, int) + 35
6   libzmq.5.dylib                    0x000000010b42fb27 zmq_msg_recv + 71
7   socket.cpython-38-darwin.so       0x000000010b5bcb76 __pyx_f_3zmq_7backend_6cython_6socket_6Socket_recv + 342
8   socket.cpython-38-darwin.so       0x000000010b5c14a1 __pyx_pw_3zmq_7backend_6cython_6socket_6Socket_27recv + 305
9   _device.cpython-38-darwin.so      0x000000010b34463b __Pyx_CyFunction_CallAsMethod + 91
10  com.apple.python3                 0x000000010abc0a49 _PyObject_MakeTpCall + 377
11  com.apple.python3                 0x000000010abc3d36 method_vectorcall + 230
12  com.apple.python3                 0x000000010aca38d4 call_function + 356
13  com.apple.python3                 0x000000010aca0542 _PyEval_EvalFrameDefault + 29138
14  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
15  com.apple.python3                 0x000000010aca38d4 call_function + 356
16  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
17  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
18  com.apple.python3                 0x000000010aca38d4 call_function + 356
19  com.apple.python3                 0x000000010aca051e _PyEval_EvalFrameDefault + 29102
20  com.apple.python3                 0x000000010abc139b function_code_fastcall + 171
21  com.apple.python3                 0x000000010abc3d93 method_vectorcall + 323
22  com.apple.python3                 0x000000010abc0d64 PyVectorcall_Call + 100
23  com.apple.python3                 0x000000010ad45d5b t_bootstrap + 75
24  com.apple.python3                 0x000000010acf6b59 pythread_wrapper + 25
25  libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
26  libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 13:: ZMQbg/Reaper
0   libsystem_kernel.dylib            0x00007fff2091bc3a kevent + 10
1   libzmq.5.dylib                    0x000000010b3958f6 zmq::kqueue_t::loop() + 278
2   libzmq.5.dylib                    0x000000010b3c3a59 zmq::worker_poller_base_t::worker_routine(void*) + 25
3   libzmq.5.dylib                    0x000000010b40a54c thread_routine(void*) + 300
4   libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
5   libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 14:: ZMQbg/IO/0
0   libsystem_kernel.dylib            0x00007fff2091bc3a kevent + 10
1   libzmq.5.dylib                    0x000000010b3958f6 zmq::kqueue_t::loop() + 278
2   libzmq.5.dylib                    0x000000010b3c3a59 zmq::worker_poller_base_t::worker_routine(void*) + 25
3   libzmq.5.dylib                    0x000000010b40a54c thread_routine(void*) + 300
4   libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
5   libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 15:
0   libsystem_kernel.dylib            0x00007fff20919cce __psynch_cvwait + 10
1   libsystem_pthread.dylib           0x00007fff2094ce49 _pthread_cond_wait + 1298
2   libopenblas.0.dylib               0x00000001172ce70f blas_thread_server + 207
3   libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
4   libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 16:
0   libsystem_kernel.dylib            0x00007fff20919cce __psynch_cvwait + 10
1   libsystem_pthread.dylib           0x00007fff2094ce49 _pthread_cond_wait + 1298
2   libopenblas.0.dylib               0x00000001172ce70f blas_thread_server + 207
3   libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
4   libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 17:
0   libsystem_kernel.dylib            0x00007fff20919cce __psynch_cvwait + 10
1   libsystem_pthread.dylib           0x00007fff2094ce49 _pthread_cond_wait + 1298
2   libopenblas.0.dylib               0x00000001172ce70f blas_thread_server + 207
3   libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
4   libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 18:
0   libsystem_kernel.dylib            0x00007fff20919cce __psynch_cvwait + 10
1   libsystem_pthread.dylib           0x00007fff2094ce49 _pthread_cond_wait + 1298
2   libopenblas.0.dylib               0x00000001172ce70f blas_thread_server + 207
3   libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
4   libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Thread 19:
0   libsystem_kernel.dylib            0x00007fff20919cce __psynch_cvwait + 10
1   libsystem_pthread.dylib           0x00007fff2094ce49 _pthread_cond_wait + 1298
2   libopenblas.0.dylib               0x00000001172ce70f blas_thread_server + 207
3   libsystem_pthread.dylib           0x00007fff2094c8fc _pthread_start + 224
4   libsystem_pthread.dylib           0x00007fff20948443 thread_start + 15

Is this at all helpful in your case?

betanalpha commented 1 year ago

Running nbclient_test.py on the ipynb generated by quarto does run successfully. I even ran it twice in quick succession to mimic rendering to multiple outputs and potentially catch a stale daemon and it seemed to work just fine.

To avoid any confusion, let me repeat that when I render with `--to html,pdf` the first pass, which generates the HTML, works fine. It's the second pass, which generates the PDF and presumably picks up the existing daemon and its stale environment, that causes problems.

Was the crash immediate or did it hang for a bit before crashing? I was waiting only for about a minute before killing the processes myself so it might have hung eventually. Also I set multiprocessing.set_start_method("fork") which may change the failure mode for threading problems?

Happy to run any and all tests/experiments that might be useful! Quarto has been a lovely experience overall and I'm happy to help however I can.

dragonstyle commented 10 months ago

Just leaving a note that I spent a little time trying to dig further into this issue. I can verify that basic daemon operations appear functional and unchanged in Quarto. I can further verify that simple examples of multiprocessing like so:

---
title: Hello
format: html
---

## Hello

```{python}
import multiprocessing
from multiprocessing.pool import ThreadPool as Pool

def worker(p):
    """worker function"""
    print('Worker\n')
    return

pool = Pool(4)
for result in pool.map(worker, range(5)):
    pass    # or print diagnostics
```

work fine in all daemon modes.

When setting the start method, I did need to use either `no-execute-daemon` or `execute-daemon-restart` to avoid attempting to reinitialize the context, but either worked for this simple case:

---
title: Hello
format: html
---

## Hello

```{python}
import multiprocessing
multiprocessing.set_start_method("fork")
print("Number of cpu : ", multiprocessing.cpu_count())
```
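For anyone who wants a cell like this to survive being re-run in a persistent kernel, one possible workaround is to set the start method only when none has been chosen yet. This is a minimal sketch, not something Quarto does itself: `ensure_start_method` is a hypothetical helper name, and it assumes a POSIX platform where the `"fork"` start method is available.

```python
import multiprocessing


def ensure_start_method(method="fork"):
    """Set the multiprocessing start method only if none is set yet.

    Returns the start method that is in effect afterwards.
    (Hypothetical helper for illustration; not part of Quarto or Jupyter.)
    """
    # allow_none=True reports None instead of silently fixing the default.
    current = multiprocessing.get_start_method(allow_none=True)
    if current is None:
        multiprocessing.set_start_method(method)
        current = method
    return current


# Safe to call repeatedly, e.g. when a persistent kernel re-runs the cell.
first = ensure_start_method("fork")
second = ensure_start_method("fork")
print(first, second)
```

The trade-off is that a second call with a different method is ignored without complaint, so a stale kernel can still carry an older setting; passing `force=True` to `multiprocessing.set_start_method` is the alternative if overriding is preferred.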


No net new information here, unfortunately, but wanted to leave a marker noting where things stand.

betanalpha commented 7 months ago

Apologies for the extremely long delay.

Firstly I want to make it clear that I think that the current daemon functionality is reasonable, especially with the updated quarto documentation.

Moreover, everything works fine for most Python applications that I've been able to try, including most PyStan2 applications. From what I can understand from my experiments, the exceptions may have to do with stdout or stderr getting overloaded. In particular, the code that hangs for me involves a Stan program that emits a lot more messages than is typical, and I know that passing the stdout and stderr buffers from Cython to a jupyter kernel can sometimes be wonky. Unfortunately this hypothesis has been difficult to verify because I don't have code where I can configure the messages directly.
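One way the output-overload hypothesis might be probed without Stan is a cell that emits a configurable volume of messages. The sketch below is only an assumption about what such a probe could look like: `flood_output` and its parameters are hypothetical names. With `use_fd=True` it writes straight to file descriptors 1 and 2, which is roughly how output from compiled code arrives, rather than going through `sys.stdout` and `sys.stderr`.

```python
import os
import sys


def _write_all(fd, data):
    """Write all of data to a file descriptor, retrying on partial writes."""
    while data:
        data = data[os.write(fd, data):]


def flood_output(n_lines=10_000, line_len=80, use_fd=False):
    """Emit n_lines lines on both stdout and stderr.

    Returns the number of bytes sent to each stream.
    (Hypothetical stress helper; names and defaults are illustrative.)
    """
    line = "x" * line_len + "\n"
    data = line.encode()
    sent = 0
    for _ in range(n_lines):
        if use_fd:
            # Bypass Python's stream objects, as C or C++ code would.
            _write_all(1, data)
            _write_all(2, data)
        else:
            sys.stdout.write(line)
            sys.stderr.write(line)
        sent += len(data)
    sys.stdout.flush()
    sys.stderr.flush()
    return sent
```

Calling `flood_output()` in a notebook cell and increasing `n_lines` between renders, with and without `use_fd=True`, could show whether sheer output volume is enough to reproduce the hang.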

That said this seems to have gone beyond the scope of the original issue and I'm happy to close, or have the issue closed. Let me know if that would be preferable.

Thanks again for all of the help!

cscheid commented 7 months ago

Let's leave this open and edit the title so that it's more likely to be relevant for other folks (I imagine you're not the only one running pystan2 in quarto!)

betanalpha commented 7 months ago

Let me know if there's anything I can do to assist in further investigations. Thanks!