Parsl / parsl

Parsl - a Python parallel scripting library
http://parsl-project.org
Apache License 2.0

How to debug lost worker (perhaps due to forking on macOS)? #2485

Open raymondEhlers opened 1 year ago

raymondEhlers commented 1 year ago

Is your feature request related to a problem? Please describe.

I've used parsl successfully on a number of clusters with Slurm (thanks!) using HTEX + SlurmProvider. I recently tried to use some of this code to run some smaller tasks on my M1 MacBook (using HTEX + LocalProvider). When I run a particular set of tasks that work with HTEX + SlurmProvider, the worker(s) on my Mac are always lost (i.e. WorkerLost: Task failure due to loss of worker 0 on host ...). Since it's a python app, I can't retrieve the app log. The worker logs just say that the worker died while running a task. (This is somewhere between a question, a feature request, and a bug report.)

Describe the solution you'd like

Some additional documentation on techniques to debug lost workers and/or python apps when you need the logs would be extremely helpful. (I did get a bit of info from the submit script stderr, which suggests a fork issue; see below.) My usual strategy for debugging a python app is to comment out the python_app wrapper and run it immediately on my local node. However, in this case the app works correctly when run that way, so this doesn't tell me anything. Additional suggestions on techniques would be great!

Describe alternatives you've considered

I've looked through everything that I know of in the runinfo. The worker logs just say that the worker received the task, and then that the worker died. There's some info in the stderr, but without further context on what the app was doing at that point, it's hard to determine more.
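One generic technique that may help here, independent of parsl, is to run the task body in a fresh interpreter with Python's `faulthandler` enabled, so that a fatal signal such as SIGABRT at least produces a Python-level traceback on stderr. The following is a minimal sketch under that assumption; `run_isolated` and `describe` are hypothetical helper names, and `os.abort()` only stands in for the real task.

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 60.0) -> tuple[int, str]:
    """Run `code` in a fresh interpreter with faulthandler enabled.

    Returns (returncode, stderr). On POSIX, a negative return code -N
    means the child process was terminated by signal N.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


def describe(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable string."""
    if returncode < 0:
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    # Placeholder task: os.abort() raises SIGABRT, the same signal seen in this report.
    rc, err = run_isolated("import os; os.abort()")
    print(describe(rc))
    print(err)  # faulthandler writes the Python traceback of the crashing thread here
```

Replacing the placeholder string with the import that triggers the crash would show which Python line was executing when the process aborted, although it does not reproduce the fork/thread state of an HTEX worker.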

Additional context

Everything appears fine in the runinfo logs until the worker dies except in the submit script stderr, which has this info:

objc[84360]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[84360]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.

This seems to suggest some issue with forking, so I grabbed the parsl master and tried switching the start_method to spawn and thread. This error disappeared from the stderr, but both methods still suffered from the same issue of the worker dying.

I managed to attach lldb to the process_worker_pool.py workers, which indicates a signal SIGABRT. The end of the trace is below. It may be overly focused on my case, but I include it here for completeness.

```
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x00000001afe3b4d4 libsystem_kernel.dylib`__abort_with_payload + 8
    frame #1: 0x00000001afe3decc libsystem_kernel.dylib`abort_with_payload_wrapper_internal + 104
    frame #2: 0x00000001afe3de64 libsystem_kernel.dylib`abort_with_reason + 32
    frame #3: 0x00000001afcf8b40 libobjc.A.dylib`_objc_fatalv(unsigned long long, unsigned long long, char const*, char*) + 128
    frame #4: 0x00000001afcf8ac0 libobjc.A.dylib`_objc_fatal(char const*, ...) + 44
    frame #5: 0x00000001afcec614 libobjc.A.dylib`performForkChildInitialize(objc_class*, objc_class*) + 400
    frame #6: 0x00000001afcd37c0 libobjc.A.dylib`initializeNonMetaClass + 496
    frame #7: 0x00000001afcd3250 libobjc.A.dylib`initializeAndMaybeRelock(objc_class*, objc_object*, mutex_tt&, bool) + 184
    frame #8: 0x00000001afcd2fe8 libobjc.A.dylib`lookUpImpOrForward + 1052
    frame #9: 0x00000001afcd28e4 libobjc.A.dylib`_objc_msgSend_uncached + 68
    frame #10: 0x00000001b0d9a628 Foundation`NSClassFromString + 64
    frame #11: 0x00000001b2a3fe28 AppKit`+[NSColor(NSUIKitSupport) load] + 44
    frame #12: 0x00000001afcd5398 libobjc.A.dylib`load_images + 828
    frame #13: 0x000000010533bba0 dyld`dyld4::RuntimeState::notifyObjCInit(dyld4::Loader const*) + 164
    frame #14: 0x000000010534202c dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array&) const + 204
    frame #15: 0x0000000105342014 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array&) const + 180
    frame #16: 0x0000000105342014 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array&) const + 180
    frame #17: 0x0000000105342014 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array&) const + 180
    frame #18: 0x0000000105342104 dyld`dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const + 124
    frame #19: 0x0000000105351c64 dyld`dyld4::APIs::dlopen_from(char const*, int, void*) + 520
    frame #20: 0x0000000104e42bfc python3.10`_PyImport_FindSharedFuncptr + 296
    frame #21: 0x0000000104dfdd50 python3.10`_imp_create_dynamic + 1108
    frame #22: 0x0000000104d35a78 python3.10`cfunction_vectorcall_FASTCALL + 200
    frame #23: 0x0000000104dcd100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #24: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #25: 0x0000000104dd110c python3.10`call_function + 524
    frame #26: 0x0000000104dccd88 python3.10`_PyEval_EvalFrameDefault + 26412
    frame #27: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
```

Unfortunately, I haven't been able to find an easy reproducer (my code is far too involved for a reproducer), which is why I'm not filing this as a bug report (plus, although I suspect forking with parsl to be the issue, I cannot rule other things out definitively). I've found that simple tasks work fine, but tasks which involve loading C++ code fail (mainly through lazily loaded ROOT, which in the past has always been enough to avoid issues with loading in this manner). edit: using ThreadPoolExecutor works, so that suggests the problem is related to something that HTEX is doing? (My workload is CPU limited, so this isn't really a viable workaround as far as I can tell.)

I understand that you're unlikely to be able to solve this directly, but any suggestions on how to further understand this issue or work around it would be greatly appreciated! Thanks in advance!

Versions:

parsl: I've tried with 6d1f9160d487b3265f6e9d65ebb357837a437c30, as well as 56491bc2d7909191348ebdaf9330d3f2d06845b3 (current master)
macOS: 12.5.1 Monterey
python: 3.10.7 via conda

benclifford commented 1 year ago

performForkChildInitialize in the stack trace makes me suspicious that this is something to do with parsl's not very clean/safe use of fork and threads.

Does the stack trace from SIGABRT look the same when using spawn?

I'd expect abort_with_reason (also in the stack trace) to give some human readable reason somewhere, because the signature looks like this:

void abort_with_reason(uint32_t reason_namespace, uint64_t reason_code, const char *reason_string, uint64_t reason_flags) __attribute__((noreturn));

That message should also get stored/logged somewhere by CRSetCrashLogMessage, according to Apple's source code for that bit of objc.

But... I don't know where that reason gets logged/stored in a way that is accessible to you, because I've never worked with this bit of objc before. From your report, it doesn't look like it goes to the stderr of the worker process.

Can you paste in the full complete output of the stack trace? (preferably using spawn because that's the cleanest for parsl, I think)
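One place that may be worth checking, as an assumption rather than something confirmed in this thread, is the per-user crash report directory that macOS normally writes to (`~/Library/Logs/DiagnosticReports`). The sketch below lists the most recent report files whose names contain a given fragment; `recent_crash_reports` is a hypothetical helper, and whether the objc abort reason actually ends up in those files for this crash is untested.

```python
from pathlib import Path


def recent_crash_reports(
    directory: Path = Path.home() / "Library" / "Logs" / "DiagnosticReports",
    name_fragment: str = "python",
    limit: int = 5,
) -> list[Path]:
    """Return up to `limit` files whose name contains `name_fragment`, newest first."""
    if not directory.is_dir():
        # Directory is absent (for example on a non-macOS machine): nothing to report.
        return []
    matches = [
        p for p in directory.iterdir()
        if p.is_file() and name_fragment.lower() in p.name.lower()
    ]
    # Sort by modification time so the report from the latest crash comes first.
    matches.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return matches[:limit]


if __name__ == "__main__":
    for report in recent_crash_reports():
        print(report)
```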

raymondEhlers commented 1 year ago

Thanks for your response! I'm at a conference this week, and I'm having some trouble immediately catching the segfault. I will have to come back to this, but I will collect the info as soon as possible.

In the meantime, I found the whole stack trace in my terminal scrollback (I think it was from using fork, but I'm not 100% sure at this point. I think fork and spawn had the same trace, but I'm also not 100% sure and will need to check):

```
Process 17194 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
    frame #0: 0x00000001afe3b4d4 libsystem_kernel.dylib`__abort_with_payload + 8
libsystem_kernel.dylib`__abort_with_payload:
-> 0x1afe3b4d4 <+8>: b.lo 0x1afe3b4f4 ; <+40>
   0x1afe3b4d8 <+12>: pacibsp
   0x1afe3b4dc <+16>: stp x29, x30, [sp, #-0x10]!
   0x1afe3b4e0 <+20>: mov x29, sp
Target 0: (python3.10) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x00000001afe3b4d4 libsystem_kernel.dylib`__abort_with_payload + 8
    frame #1: 0x00000001afe3decc libsystem_kernel.dylib`abort_with_payload_wrapper_internal + 104
    frame #2: 0x00000001afe3de64 libsystem_kernel.dylib`abort_with_reason + 32
    frame #3: 0x00000001afcf8b40 libobjc.A.dylib`_objc_fatalv(unsigned long long, unsigned long long, char const*, char*) + 128
    frame #4: 0x00000001afcf8ac0 libobjc.A.dylib`_objc_fatal(char const*, ...) + 44
    frame #5: 0x00000001afcec614 libobjc.A.dylib`performForkChildInitialize(objc_class*, objc_class*) + 400
    frame #6: 0x00000001afcd37c0 libobjc.A.dylib`initializeNonMetaClass + 496
    frame #7: 0x00000001afcd3250 libobjc.A.dylib`initializeAndMaybeRelock(objc_class*, objc_object*, mutex_tt&, bool) + 184
    frame #8: 0x00000001afcd2fe8 libobjc.A.dylib`lookUpImpOrForward + 1052
    frame #9: 0x00000001afcd28e4 libobjc.A.dylib`_objc_msgSend_uncached + 68
    frame #10: 0x00000001b0d9a628 Foundation`NSClassFromString + 64
    frame #11: 0x00000001b2a3fe28 AppKit`+[NSColor(NSUIKitSupport) load] + 44
    frame #12: 0x00000001afcd5398 libobjc.A.dylib`load_images + 828
    frame #13: 0x000000010533bba0 dyld`dyld4::RuntimeState::notifyObjCInit(dyld4::Loader const*) + 164
    frame #14: 0x000000010534202c dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array&) const + 204
    frame #15: 0x0000000105342014 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array&) const + 180
    frame #16: 0x0000000105342014 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array&) const + 180
    frame #17: 0x0000000105342014 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array&) const + 180
    frame #18: 0x0000000105342104 dyld`dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const + 124
    frame #19: 0x0000000105351c64 dyld`dyld4::APIs::dlopen_from(char const*, int, void*) + 520
    frame #20: 0x0000000104e42bfc python3.10`_PyImport_FindSharedFuncptr + 296
    frame #21: 0x0000000104dfdd50 python3.10`_imp_create_dynamic + 1108
    frame #22: 0x0000000104d35a78 python3.10`cfunction_vectorcall_FASTCALL + 200
    frame #23: 0x0000000104dcd100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #24: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #25: 0x0000000104dd110c python3.10`call_function + 524
    frame #26: 0x0000000104dccd88 python3.10`_PyEval_EvalFrameDefault + 26412
    frame #27: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #28: 0x0000000104dd110c python3.10`call_function + 524
    frame #29: 0x0000000104dccd60 python3.10`_PyEval_EvalFrameDefault + 26372
    frame #30: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #31: 0x0000000104dd110c python3.10`call_function + 524
    frame #32: 0x0000000104dccdf8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #33: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #34: 0x0000000104dd110c python3.10`call_function + 524
    frame #35: 0x0000000104dccdf8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #36: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #37: 0x0000000104dd110c python3.10`call_function + 524
    frame #38: 0x0000000104dccdf8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #39: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #40: 0x0000000104dd110c python3.10`call_function + 524
    frame #41: 0x0000000104dccdf8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #42: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #43: 0x0000000104dd110c python3.10`call_function + 524
    frame #44: 0x0000000104dccd88 python3.10`_PyEval_EvalFrameDefault + 26412
    frame #45: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #46: 0x0000000104dd110c python3.10`call_function + 524
    frame #47: 0x0000000104dccd88 python3.10`_PyEval_EvalFrameDefault + 26412
    frame #48: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #49: 0x0000000104dc5518 python3.10`PyEval_EvalCode + 120
    frame #50: 0x0000000104dc1190 python3.10`builtin_exec + 836
    frame #51: 0x0000000104d35a78 python3.10`cfunction_vectorcall_FASTCALL + 200
    frame #52: 0x0000000104dcd100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #53: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #54: 0x0000000104dd110c python3.10`call_function + 524
    frame #55: 0x0000000104dccd88 python3.10`_PyEval_EvalFrameDefault + 26412
    frame #56: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #57: 0x0000000104dd110c python3.10`call_function + 524
    frame #58: 0x0000000104dccd60 python3.10`_PyEval_EvalFrameDefault + 26372
    frame #59: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #60: 0x0000000104dd110c python3.10`call_function + 524
    frame #61: 0x0000000104dccdf8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #62: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #63: 0x0000000104dd110c python3.10`call_function + 524
    frame #64: 0x0000000104dccdf8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #65: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #66: 0x0000000104ce42f4 python3.10`object_vacall + 272
    frame #67: 0x0000000104ce4514 python3.10`_PyObject_CallMethodIdObjArgs + 128
    frame #68: 0x0000000104df9ddc python3.10`PyImport_ImportModuleLevelObject + 1284
    frame #69: 0x0000000104dcb464 python3.10`_PyEval_EvalFrameDefault + 19976
    frame #70: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #71: 0x0000000104dc5518 python3.10`PyEval_EvalCode + 120
    frame #72: 0x0000000104dc1190 python3.10`builtin_exec + 836
    frame #73: 0x0000000104d35a78 python3.10`cfunction_vectorcall_FASTCALL + 200
    frame #74: 0x0000000104dcd100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #75: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #76: 0x0000000104dd110c python3.10`call_function + 524
    frame #77: 0x0000000104dccd88 python3.10`_PyEval_EvalFrameDefault + 26412
    frame #78: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #79: 0x0000000104dd110c python3.10`call_function + 524
    frame #80: 0x0000000104dccd60 python3.10`_PyEval_EvalFrameDefault + 26372
    frame #81: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #82: 0x0000000104dd110c python3.10`call_function + 524
    frame #83: 0x0000000104dccdf8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #84: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #85: 0x0000000104dd110c python3.10`call_function + 524
    frame #86: 0x0000000104dccdf8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #87: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #88: 0x0000000104ce42f4 python3.10`object_vacall + 272
    frame #89: 0x0000000104ce4514 python3.10`_PyObject_CallMethodIdObjArgs + 128
    frame #90: 0x0000000104df9ddc python3.10`PyImport_ImportModuleLevelObject + 1284
    frame #91: 0x0000000104dcb464 python3.10`_PyEval_EvalFrameDefault + 19976
    frame #92: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #93: 0x0000000104dd110c python3.10`call_function + 524
    frame #94: 0x0000000104dcce68 python3.10`_PyEval_EvalFrameDefault + 26636
    frame #95: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #96: 0x0000000104ce2c30 python3.10`PyVectorcall_Call + 156
    frame #97: 0x0000000104dcd100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #98: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #99: 0x0000000104ce2c30 python3.10`PyVectorcall_Call + 156
    frame #100: 0x0000000104dcd100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #101: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #102: 0x0000000104ce2c30 python3.10`PyVectorcall_Call + 156
    frame #103: 0x0000000104dcd100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #104: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #105: 0x0000000104e21078 python3.10`run_mod + 216
    frame #106: 0x0000000104e23e0c python3.10`PyRun_StringFlags + 128
    frame #107: 0x0000000104dc1154 python3.10`builtin_exec + 776
    frame #108: 0x0000000104d35a78 python3.10`cfunction_vectorcall_FASTCALL + 200
    frame #109: 0x0000000104dd110c python3.10`call_function + 524
    frame #110: 0x0000000104dccdf8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #111: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #112: 0x0000000104dd110c python3.10`call_function + 524
    frame #113: 0x0000000104dccdf8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #114: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #115: 0x0000000104dcd100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #116: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #117: 0x0000000104dcd100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #118: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #119: 0x0000000104dd110c python3.10`call_function + 524
    frame #120: 0x0000000104dccd60 python3.10`_PyEval_EvalFrameDefault + 26372
    frame #121: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #122: 0x0000000104ce5aa4 python3.10`method_vectorcall + 164
    frame #123: 0x0000000104dd110c python3.10`call_function + 524
    frame #124: 0x0000000104dcce68 python3.10`_PyEval_EvalFrameDefault + 26636
    frame #125: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #126: 0x0000000104dd110c python3.10`call_function + 524
    frame #127: 0x0000000104dccd60 python3.10`_PyEval_EvalFrameDefault + 26372
    frame #128: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #129: 0x0000000104ce26a0 python3.10`_PyObject_FastCallDictTstate + 320
    frame #130: 0x0000000104ce3298 python3.10`_PyObject_Call_Prepend + 164
    frame #131: 0x0000000104d595ec python3.10`slot_tp_init + 116
    frame #132: 0x0000000104d51f20 python3.10`type_call + 340
    frame #133: 0x0000000104ce23ec python3.10`_PyObject_MakeTpCall + 612
    frame #134: 0x0000000104dd11a4 python3.10`call_function + 676
    frame #135: 0x0000000104dccdf8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #136: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #137: 0x0000000104dd110c python3.10`call_function + 524
    frame #138: 0x0000000104dccd88 python3.10`_PyEval_EvalFrameDefault + 26412
    frame #139: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #140: 0x0000000104dd110c python3.10`call_function + 524
    frame #141: 0x0000000104dccd60 python3.10`_PyEval_EvalFrameDefault + 26372
    frame #142: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #143: 0x0000000104dd110c python3.10`call_function + 524
    frame #144: 0x0000000104dccd60 python3.10`_PyEval_EvalFrameDefault + 26372
    frame #145: 0x0000000104dc5d50 python3.10`_PyEval_Vector + 2056
    frame #146: 0x0000000104e21078 python3.10`run_mod + 216
    frame #147: 0x0000000104e20b18 python3.10`_PyRun_SimpleFileObject + 1264
    frame #148: 0x0000000104e1faf0 python3.10`_PyRun_AnyFileObject + 240
    frame #149: 0x0000000104e44b74 python3.10`Py_RunMain + 2340
    frame #150: 0x0000000104e45d30 python3.10`pymain_main + 1272
    frame #151: 0x0000000104c8c184 python3.10`main + 56
    frame #152: 0x000000010533108c dyld`start + 520
```

One stray question: for debugging, I believe I only have one core assigned, but I see three process_worker_pool.py processes. Is that expected? (i.e. 3 processes for 1 task/core. edit: maybe it's 1 process/core + 2 overhead?) It makes attaching the debugger a bit more involved, so it would be nice if I could attach to just one. Thanks!
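To decide which of several same-named processes to attach to, it can help to print each one's PID next to its parent PID, since child processes point back at the process that started them. The sketch below is a generic stdlib approach using the POSIX `ps` command, not a parsl feature; `find_processes` is a hypothetical helper name, and which of the listed processes actually runs the task is not something this thread confirms.

```python
import subprocess


def find_processes(fragment: str) -> list[tuple[int, int, str]]:
    """Return (pid, ppid, command line) for processes whose command line contains `fragment`."""
    out = subprocess.run(
        # A separate -o per column with an empty header ("pid=") suppresses the header row.
        ["ps", "-A", "-o", "pid=", "-o", "ppid=", "-o", "args="],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    found = []
    for line in out.splitlines():
        parts = line.split(None, 2)  # pid, ppid, then the rest as the command line
        if len(parts) == 3 and fragment in parts[2]:
            found.append((int(parts[0]), int(parts[1]), parts[2]))
    return found


if __name__ == "__main__":
    for pid, ppid, cmd in find_processes("process_worker_pool.py"):
        print(f"pid={pid} ppid={ppid} {cmd}")
```

A process whose `ppid` matches the `pid` of another listed process is a child of it, which narrows down the candidates before running something like `lldb -p <pid>`.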

edit: I suppose this is a separate bug report, but for the parsl master, I've seen the logs blow up when it can't record the io resources. It grows to multiple GBs in a few minutes:

```
ERROR:root:Exception getting the resource usage. Not sending usage to Hub
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/substructure_c_24_06/lib/python3.10/site-packages/psutil/_common.py", line 443, in wrapper
    ret = self._cache[fun]
AttributeError: 'Process' object has no attribute '_cache'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/substructure_c_24_06/lib/python3.10/site-packages/psutil/_common.py", line 443, in wrapper
    ret = self._cache[fun]
AttributeError: 'Process' object has no attribute '_cache'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/substructure_c_24_06/lib/python3.10/site-packages/psutil/_psosx.py", line 346, in wrapper
    return fun(self, *args, **kwargs)
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/substructure_c_24_06/lib/python3.10/site-packages/psutil/_common.py", line 446, in wrapper
    return fun(self)
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/substructure_c_24_06/lib/python3.10/site-packages/psutil/_psosx.py", line 381, in _get_pidtaskinfo
    ret = cext.proc_pidtaskinfo_oneshot(self.pid)
ProcessLookupError: [Errno 3] No such process (originated from proc_pidinfo())

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/substructure_c_24_06/lib/python3.10/site-packages/parsl/monitoring/remote.py", line 255, in monitor
    d = accumulate_and_prepare()
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/substructure_c_24_06/lib/python3.10/site-packages/parsl/monitoring/remote.py", line 208, in accumulate_and_prepare
    d['psutil_process_memory_virtual'] = pm.memory_info().vms
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/substructure_c_24_06/lib/python3.10/site-packages/psutil/_common.py", line 446, in wrapper
    return fun(self)
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/substructure_c_24_06/lib/python3.10/site-packages/psutil/__init__.py", line 1058, in memory_info
    return self._proc.memory_info()
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/substructure_c_24_06/lib/python3.10/site-packages/psutil/_psosx.py", line 346, in wrapper
    return fun(self, *args, **kwargs)
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/substructure_c_24_06/lib/python3.10/site-packages/psutil/_psosx.py", line 446, in memory_info
    rawtuple = self._get_pidtaskinfo()
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/substructure_c_24_06/lib/python3.10/site-packages/psutil/_psosx.py", line 349, in wrapper
    raise ZombieProcess(self.pid, self._name, self._ppid)
psutil.ZombieProcess: PID still exists but it's a zombie (pid=34637, ppid=34632, name='python3.10')
(the identical entry, with the same pid=34637, repeats back to back)
.....truncated
```

raymondEhlers commented 1 year ago

I tried again and was able to catch the segfault again. The trace when using "spawn" is below:

```
Process 88731 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
    frame #0: 0x00000001afe3b4d4 libsystem_kernel.dylib`__abort_with_payload + 8
libsystem_kernel.dylib`__abort_with_payload:
->  0x1afe3b4d4 <+8>: b.lo 0x1afe3b4f4 ; <+40>
    0x1afe3b4d8 <+12>: pacibsp
    0x1afe3b4dc <+16>: stp x29, x30, [sp, #-0x10]!
    0x1afe3b4e0 <+20>: mov x29, sp
Target 0: (python3.10) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x00000001afe3b4d4 libsystem_kernel.dylib`__abort_with_payload + 8
    frame #1: 0x00000001afe3decc libsystem_kernel.dylib`abort_with_payload_wrapper_internal + 104
    frame #2: 0x00000001afe3de64 libsystem_kernel.dylib`abort_with_reason + 32
    frame #3: 0x00000001afcf8b40 libobjc.A.dylib`_objc_fatalv(unsigned long long, unsigned long long, char const*, char*) + 128
    frame #4: 0x00000001afcf8ac0 libobjc.A.dylib`_objc_fatal(char const*, ...) + 44
    frame #5: 0x00000001afcec614 libobjc.A.dylib`performForkChildInitialize(objc_class*, objc_class*) + 400
    frame #6: 0x00000001afcd37c0 libobjc.A.dylib`initializeNonMetaClass + 496
    frame #7: 0x00000001afcd3250 libobjc.A.dylib`initializeAndMaybeRelock(objc_class*, objc_object*, mutex_tt&, bool) + 184
    frame #8: 0x00000001afcd2fe8 libobjc.A.dylib`lookUpImpOrForward + 1052
    frame #9: 0x00000001afcd28e4 libobjc.A.dylib`_objc_msgSend_uncached + 68
    frame #10: 0x00000001b0d9c828 Foundation`_NSResolveSymlinksInPathUsingCache + 124
    frame #11: 0x00000001b0d9c634 Foundation`-[NSString(NSPathUtilities) _stringByResolvingSymlinksInPathUsingCache:] + 152
    frame #12: 0x00000001b0d9b188 Foundation`-[NSBundle initWithPath:] + 184
    frame #13: 0x00000001b0d9afb8 Foundation`+[NSBundle mainBundle] + 148
    frame #14: 0x00000001b2a3f978 AppKit`+[NSApplication load] + 84
    frame #15: 0x00000001afcd5398 libobjc.A.dylib`load_images + 828
    frame #16: 0x00000001046bfba0 dyld`dyld4::RuntimeState::notifyObjCInit(dyld4::Loader const*) + 164
    frame #17: 0x00000001046c602c dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array&) const + 204
    frame #18: 0x00000001046c6014 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array&) const + 180
    frame #19: 0x00000001046c6014 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array&) const + 180
    frame #20: 0x00000001046c6014 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array&) const + 180
    frame #21: 0x00000001046c6104 dyld`dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const + 124
    frame #22: 0x00000001046d5c64 dyld`dyld4::APIs::dlopen_from(char const*, int, void*) + 520
    frame #23: 0x0000000104416bfc python3.10`_PyImport_FindSharedFuncptr + 296
    frame #24: 0x00000001043d1d50 python3.10`_imp_create_dynamic + 1108
    frame #25: 0x0000000104309a78 python3.10`cfunction_vectorcall_FASTCALL + 200
    frame #26: 0x00000001043a1100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #27: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #28: 0x00000001043a510c python3.10`call_function + 524
    frame #29: 0x00000001043a0d88 python3.10`_PyEval_EvalFrameDefault + 26412
    frame #30: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #31: 0x00000001043a510c python3.10`call_function + 524
    frame #32: 0x00000001043a0d60 python3.10`_PyEval_EvalFrameDefault + 26372
    frame #33: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #34: 0x00000001043a510c python3.10`call_function + 524
    frame #35: 0x00000001043a0df8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #36: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #37: 0x00000001043a510c python3.10`call_function + 524
    frame #38: 0x00000001043a0df8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #39: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #40: 0x00000001043a510c python3.10`call_function + 524
    frame #41: 0x00000001043a0df8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #42: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #43: 0x00000001043a510c python3.10`call_function + 524
    frame #44: 0x00000001043a0df8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #45: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #46: 0x00000001043a510c python3.10`call_function + 524
    frame #47: 0x00000001043a0d88 python3.10`_PyEval_EvalFrameDefault + 26412
    frame #48: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #49: 0x00000001043a510c python3.10`call_function + 524
    frame #50: 0x00000001043a0d88 python3.10`_PyEval_EvalFrameDefault + 26412
    frame #51: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #52: 0x0000000104399518 python3.10`PyEval_EvalCode + 120
    frame #53: 0x0000000104395190 python3.10`builtin_exec + 836
    frame #54: 0x0000000104309a78 python3.10`cfunction_vectorcall_FASTCALL + 200
    frame #55: 0x00000001043a1100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #56: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #57: 0x00000001043a510c python3.10`call_function + 524
    frame #58: 0x00000001043a0d88 python3.10`_PyEval_EvalFrameDefault + 26412
    frame #59: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #60: 0x00000001043a510c python3.10`call_function + 524
    frame #61: 0x00000001043a0d60 python3.10`_PyEval_EvalFrameDefault + 26372
    frame #62: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #63: 0x00000001043a510c python3.10`call_function + 524
    frame #64: 0x00000001043a0df8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #65: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #66: 0x00000001043a510c python3.10`call_function + 524
    frame #67: 0x00000001043a0df8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #68: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #69: 0x00000001042b82f4 python3.10`object_vacall + 272
    frame #70: 0x00000001042b8514 python3.10`_PyObject_CallMethodIdObjArgs + 128
    frame #71: 0x00000001043cdddc python3.10`PyImport_ImportModuleLevelObject + 1284
    frame #72: 0x000000010439f464 python3.10`_PyEval_EvalFrameDefault + 19976
    frame #73: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #74: 0x0000000104399518 python3.10`PyEval_EvalCode + 120
    frame #75: 0x0000000104395190 python3.10`builtin_exec + 836
    frame #76: 0x0000000104309a78 python3.10`cfunction_vectorcall_FASTCALL + 200
    frame #77: 0x00000001043a1100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #78: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #79: 0x00000001043a510c python3.10`call_function + 524
    frame #80: 0x00000001043a0d88 python3.10`_PyEval_EvalFrameDefault + 26412
    frame #81: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #82: 0x00000001043a510c python3.10`call_function + 524
    frame #83: 0x00000001043a0d60 python3.10`_PyEval_EvalFrameDefault + 26372
    frame #84: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #85: 0x00000001043a510c python3.10`call_function + 524
    frame #86: 0x00000001043a0df8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #87: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #88: 0x00000001043a510c python3.10`call_function + 524
    frame #89: 0x00000001043a0df8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #90: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #91: 0x00000001042b82f4 python3.10`object_vacall + 272
    frame #92: 0x00000001042b8514 python3.10`_PyObject_CallMethodIdObjArgs + 128
    frame #93: 0x00000001043cdddc python3.10`PyImport_ImportModuleLevelObject + 1284
    frame #94: 0x000000010439f464 python3.10`_PyEval_EvalFrameDefault + 19976
    frame #95: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #96: 0x00000001043a510c python3.10`call_function + 524
    frame #97: 0x00000001043a0e68 python3.10`_PyEval_EvalFrameDefault + 26636
    frame #98: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #99: 0x00000001042b6c30 python3.10`PyVectorcall_Call + 156
    frame #100: 0x00000001043a1100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #101: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #102: 0x00000001042b6c30 python3.10`PyVectorcall_Call + 156
    frame #103: 0x00000001043a1100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #104: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #105: 0x00000001042b6c30 python3.10`PyVectorcall_Call + 156
    frame #106: 0x00000001043a1100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #107: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #108: 0x00000001043f5078 python3.10`run_mod + 216
    frame #109: 0x00000001043f7e0c python3.10`PyRun_StringFlags + 128
    frame #110: 0x0000000104395154 python3.10`builtin_exec + 776
    frame #111: 0x0000000104309a78 python3.10`cfunction_vectorcall_FASTCALL + 200
    frame #112: 0x00000001043a510c python3.10`call_function + 524
    frame #113: 0x00000001043a0df8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #114: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #115: 0x00000001043a510c python3.10`call_function + 524
    frame #116: 0x00000001043a0df8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #117: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #118: 0x00000001043a1100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #119: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #120: 0x00000001043a1100 python3.10`_PyEval_EvalFrameDefault + 27300
    frame #121: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #122: 0x00000001043a510c python3.10`call_function + 524
    frame #123: 0x00000001043a0d60 python3.10`_PyEval_EvalFrameDefault + 26372
    frame #124: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #125: 0x00000001042b9aa4 python3.10`method_vectorcall + 164
    frame #126: 0x00000001043a510c python3.10`call_function + 524
    frame #127: 0x00000001043a0e68 python3.10`_PyEval_EvalFrameDefault + 26636
    frame #128: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #129: 0x00000001043a510c python3.10`call_function + 524
    frame #130: 0x00000001043a0d60 python3.10`_PyEval_EvalFrameDefault + 26372
    frame #131: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #132: 0x00000001042b66a0 python3.10`_PyObject_FastCallDictTstate + 320
    frame #133: 0x00000001042b7298 python3.10`_PyObject_Call_Prepend + 164
    frame #134: 0x000000010432d5ec python3.10`slot_tp_init + 116
    frame #135: 0x0000000104325f20 python3.10`type_call + 340
    frame #136: 0x00000001042b63ec python3.10`_PyObject_MakeTpCall + 612
    frame #137: 0x00000001043a51a4 python3.10`call_function + 676
    frame #138: 0x00000001043a0df8 python3.10`_PyEval_EvalFrameDefault + 26524
    frame #139: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #140: 0x00000001043a510c python3.10`call_function + 524
    frame #141: 0x00000001043a0d88 python3.10`_PyEval_EvalFrameDefault + 26412
    frame #142: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #143: 0x00000001043a510c python3.10`call_function + 524
    frame #144: 0x00000001043a0d60 python3.10`_PyEval_EvalFrameDefault + 26372
    frame #145: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #146: 0x00000001043a510c python3.10`call_function + 524
    frame #147: 0x00000001043a0d60 python3.10`_PyEval_EvalFrameDefault + 26372
    frame #148: 0x0000000104399d50 python3.10`_PyEval_Vector + 2056
    frame #149: 0x00000001043f5078 python3.10`run_mod + 216
    frame #150: 0x00000001043f4b18 python3.10`_PyRun_SimpleFileObject + 1264
    frame #151: 0x00000001043f3af0 python3.10`_PyRun_AnyFileObject + 240
    frame #152: 0x0000000104418b74 python3.10`Py_RunMain + 2340
    frame #153: 0x0000000104419d30 python3.10`pymain_main + 1272
    frame #154: 0x0000000104260184 python3.10`main + 56
    frame #155: 0x00000001046b508c dyld`start + 520
```

As far as I can tell, this seems to be identical to what I posted before with fork. I found more in Console, which has the following information:

System Integrity Protection: enabled

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_CRASH (SIGABRT)
Exception Codes:       0x0000000000000000, 0x0000000000000000
Exception Note:        EXC_CORPSE_NOTIFY

Termination Reason:    Namespace OBJC, Code 1 

Application Specific Information:
*** multi-threaded process forked ***
crashed on child side of fork pre-exec

Kernel Triage:
VM - pmap_enter failed with resource shortage
VM - pmap_enter failed with resource shortage
VM - pmap_enter failed with resource shortage
VM - pmap_enter failed with resource shortage
VM - pmap_enter failed with resource shortage

<And then the stack trace that I copied above is here>
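
Since the report says "multi-threaded process forked" even though "spawn" was requested, it may be worth confirming which start method is actually in effect and how many Python threads are alive in the process that launches the workers. The following is only a minimal sketch using the standard library. `describe_process_state` is a hypothetical helper name, and where it should be called inside a Parsl run is an assumption, not something taken from Parsl's code. Note that `threading` only sees Python-level threads, so native threads started by C libraries will not show up here.

```python
import multiprocessing
import os
import threading


def describe_process_state() -> dict:
    """Collect facts relevant to fork safety for the current process."""
    return {
        "pid": os.getpid(),
        # allow_none=True reports None instead of fixing a default start method
        "start_method": multiprocessing.get_start_method(allow_none=True),
        # Python-level threads only; native threads are not counted
        "thread_count": threading.active_count(),
        "thread_names": [t.name for t in threading.enumerate()],
    }


if __name__ == "__main__":
    print(describe_process_state())
```

If the reported start method is not "spawn", or the thread count is above one at the point where workers are created, that would be consistent with the crash report above, though it would not prove the cause.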

Unfortunately, I wasn't able to find this definition by searching around, but I guess that, based on what you found above, this should be enough to find it? This may be it: https://github.com/showxu/objc4/blob/b73f5d4700db192ffdc91b5ead36f3ddf8bfe174/objc4/runtime/objc-internal.h#L54 If so, it doesn't seem especially helpful :-(
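
The native trace shows the abort happening inside a `dlopen` triggered by a Python import (frames #22 to #24), but not which Python module was being imported. One possible way to find out is the standard library's `faulthandler`, which can write the Python-level traceback when a fatal signal such as SIGABRT arrives. This is a sketch under assumptions: `enable_crash_traceback` is a hypothetical helper, the log directory is an arbitrary choice, and it is not certain that the handler gets a chance to run when the Objective-C runtime aborts this way.

```python
import faulthandler
import os
import tempfile


def enable_crash_traceback(log_dir: str):
    """Dump Python tracebacks for fatal signals to a per-process log file."""
    path = os.path.join(log_dir, f"faulthandler-{os.getpid()}.log")
    # The file must stay open: faulthandler keeps using its file descriptor.
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


if __name__ == "__main__":
    handle = enable_crash_traceback(tempfile.gettempdir())
    print("faulthandler enabled:", faulthandler.is_enabled(), "->", handle.name)
```

Calling this as the first statement of the python app body (before the imports that the app performs) would, if the handler fires, leave a file naming the Python frames that led to the failing import.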