Open JBloodless opened 7 months ago
The key is this part of the stack trace:
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/export/__init__.py", line 174, in export
return _export(
This says that the native crash is happening in pytorch. We can:
For 2, replace aot.export with an equivalent call to torch.export.export. It should crash in the same way, since that call is the first thing our export does. The crash can then be filed with PyTorch and debugged further there.
I'd recommend trying the latest torch nightly when checking the repro. Many times these things get fixed.
The weird thing is that if I comment out the last line with .compile, the code runs fine with both aot.export and torch.export.export. The error happens only when binary = export_output.compile(save_to=None) is present. Moreover, if I try to debug the code without binary = export_output.compile(save_to=None), the debugger crashes (although a normal run is fine), and the debug log is the same as with compile.
Here's the debug log with torch.export.export (it's the same with aot.export, since aot.export crashes on the torch.export.export call, just as you said):
Traceback (most recent call last):
File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/pydevd.py", line 2195, in <module>
main()
File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/pydevd.py", line 2177, in main
globals = debugger.run(setup['file'], None, None, is_module)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/pydevd.py", line 1489, in run
return self._exec(is_module, entry_point_fn, module_name, file, globals, locals)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/pydevd.py", line 1496, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/Users/i.beskrovnyy/tts/NISQA-s/repros/repro_pool.py", line 55, in <module>
export_output = torch.export.export(model, example_input)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/export/__init__.py", line 174, in export
return _export(
^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/export/_trace.py", line 836, in wrapper
raise e
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/export/_trace.py", line 819, in wrapper
ep = fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/export/exported_program.py", line 85, in wrapper
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/export/_trace.py", line 1072, in _export
gm_torch_level = _export_to_torch_ir(
^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/export/_trace.py", line 430, in _export_to_torch_ir
gm_torch_level, _ = torch._dynamo.export(
^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 1237, in inner
result_traced = opt_f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 410, in _fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/tts/NISQA-s/repros/repro_pool.py", line 39, in forward
def forward(self, x, n_wins):
File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydevd_bundle/pydevd_frame.py", line 164, in trace_return
send_signature_return_trace(main_debugger, frame, filename, arg)
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 976, in catch_errors
return callback(frame, cache_entry, hooks, frame_state, skip=1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 411, in _convert_frame_assert
return _compile(
^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_utils_internal.py", line 70, in wrapper_function
return function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/contextlib.py", line 81, in inner
return func(*args, **kwds)
^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 698, in _compile
guarded_code = compile_inner(code, one_graph, hooks, transform)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/utils.py", line 265, in time_wrapper
r = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 553, in compile_inner
out_code = transform_code_object(code, transform)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/bytecode_transformation.py", line 1113, in transform_code_object
transformations(instructions, code_options)
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 173, in _fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 515, in transform
tracer.run()
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2201, in run
super().run()
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 857, in run
while self.step():
^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 767, in step
self.dispatch_table[inst.opcode](self, inst)
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 491, in wrapper
return inner_fn(self, inst)
^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1828, in CALL
self.call_function(fn, args, kwargs)
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 707, in call_function
self.push(fn.call_function(self, args, kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 339, in call_function
return super().call_function(tx, args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 293, in call_function
return super().call_function(tx, args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 713, in inline_user_function_return
return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2361, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2477, in inline_call_
tracer.run()
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 857, in run
while self.step():
^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 767, in step
self.dispatch_table[inst.opcode](self, inst)
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 491, in wrapper
return inner_fn(self, inst)
^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 1828, in CALL
self.call_function(fn, args, kwargs)
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 707, in call_function
self.push(fn.call_function(self, args, kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 339, in call_function
return super().call_function(tx, args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 293, in call_function
return super().call_function(tx, args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 713, in inline_user_function_return
return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2361, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2477, in inline_call_
tracer.run()
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 857, in run
while self.step():
^^^^^^^^^^^
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 767, in step
self.dispatch_table[inst.opcode](self, inst)
File "/Users/i.beskrovnyy/anaconda3/envs/shark_re/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 469, in inner
raise exc.UserError(
torch._dynamo.exc.UserError: Dynamic control flow is not supported at the moment. Please use functorch.experimental.control_flow.cond to explicitly capture the control flow. For more information about this error, see: https://pytorch.org/docs/main/generated/exportdb/index.html#cond-operands
from user code:
File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydevd_bundle/pydevd_signature.py", line 198, in send_signature_return_trace
signature = dbg.signature_factory.create_signature(frame, filename, with_args=False)
File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydevd_bundle/pydevd_signature.py", line 97, in create_signature
_, modulename, funcname = self.file_module_function_of(frame)
File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydevd_bundle/pydevd_signature.py", line 110, in file_module_function_of
if filename:
Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
P.S. Torch version 2.4.0.dev20240411
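The last line of the log asks for TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1. A small sketch of one way to set them: launch the repro in a fresh interpreter with no IDE debugger attached, which may matter here because the "from user code" frames in the log point into PyCharm's pydevd helpers. The function names are hypothetical; only the two environment variables come from the error message.

```python
import os
import subprocess
import sys


def dynamo_verbose_env(base=None):
    """Return a copy of the environment with the two variables from the error message set."""
    env = dict(os.environ if base is None else base)
    env["TORCH_LOGS"] = "+dynamo"
    env["TORCHDYNAMO_VERBOSE"] = "1"
    return env


def run_with_dynamo_logs(script_path):
    """Run the script in a plain child interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, script_path],
        env=dynamo_verbose_env(),
        capture_output=True,
        text=True,
    )
```

Calling run_with_dynamo_logs with the path to the repro script returns a CompletedProcess whose stderr should contain the verbose Dynamo log.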
Native crashes can be a bit tricky because what actually crashed doesn't always correlate with the Python trace. Dynamo has gotten better in its failure cases, but I still see crashes and bugs that come from Dynamo trying to report a high-level error message. That leads you down a rabbit hole, because you end up debugging the error-message code rather than the root cause.
For native crashes, the gold standard is a gdb backtrace with torch and iree binaries that have debug symbols. But even just a bt on normal release binaries is often enough to route the issue to the right high level component.
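Short of a gdb backtrace, Python's standard faulthandler module is a cheaper first step: it is not a native backtrace, but it prints the Python stack at the moment a fatal signal such as SIGSEGV arrives. A sketch, assuming a POSIX system; the child script fakes the crash by sending SIGSEGV to itself, standing in for a real native fault.

```python
import subprocess
import sys

# Child program: enable faulthandler, then simulate a native crash.
CHILD_SOURCE = """
import faulthandler, os, signal

faulthandler.enable()  # dump the Python stack to stderr on fatal signals

def crash():
    # Stand-in for a native crash: deliver SIGSEGV to this process.
    os.kill(os.getpid(), signal.SIGSEGV)

crash()
"""


def run_child(source):
    """Run the source in a fresh interpreter and return (returncode, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr


returncode, stderr = run_child(CHILD_SOURCE)
```

For an existing script the same effect is available without code changes by running python -X faulthandler on it. The dump shows which Python call was active when the process died, which is often enough to decide where to point gdb next.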
Hi. I'm trying to export a custom torch layer, but I'm getting a SIGSEGV on normal runs and uninformative output in the debugger. Here's the repro:
Here's the debugger log:
I'm using the latest version of turbine. Which part of my code is causing this? Maybe I can just replace some operation, but I'm completely lost as to which one.