pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

decorate_context results in uninformative backtraces #118747

Open ezyang opened 7 months ago

ezyang commented 7 months ago

🐛 Describe the bug

When I use decorate_context to convert a context manager into a decorator, the only frame that shows up in stack traces is the generic decorate_context. This sucks, because context managers can behave very differently, and I would like to know which context manager was actually running.

Example uninformative stack:

[rank7]:[2024-01-25 11:27:03,298] [3/1] torch._dynamo.symbolic_convert: [INFO]     optimizer.step()
[rank7]:[2024-01-25 11:27:03,298] [3/1] torch._dynamo.symbolic_convert: [INFO]   File "/packages/training_platform/worker-inplace#link-tree/torch/optim/optimizer.py", line 391, in wrapper
[rank7]:[2024-01-25 11:27:03,298] [3/1] torch._dynamo.symbolic_convert: [INFO]     out = func(*args, **kwargs)
[rank7]:[2024-01-25 11:27:03,298] [3/1] torch._dynamo.symbolic_convert: [INFO]   File "/packages/training_platform/worker-inplace#link-tree/torch/utils/_contextlib.py", line 115, in decorate_context
[rank7]:[2024-01-25 11:27:03,298] [3/1] torch._dynamo.symbolic_convert: [INFO]     return func(*args, **kwargs)
[rank7]:[2024-01-25 11:27:03,298] [3/1] torch._dynamo.symbolic_convert: [INFO]   File "<torch_package_0>.caffe2/torch/fb/optim/shampoo_v1/distributed_shampoo.py", line 1334, in step
[rank7]:[2024-01-25 11:27:03,298] [3/1] torch._dynamo.symbolic_convert: [INFO]     self._split_foreach_weightdecay_kernel(
[rank7]:[2024-01-25 11:27:03,298] [3/1] torch._dynamo.symbolic_convert: [INFO]   File "/packages/training_platform/worker-inplace#link-tree/torch/utils/_contextlib.py", line 115, in decorate_context
[rank7]:[2024-01-25 11:27:03,298] [3/1] torch._dynamo.symbolic_convert: [INFO]     return func(*args, **kwargs)
[rank7]:[2024-01-25 11:27:03,298] [3/1] torch._dynamo.symbolic_convert: [INFO]   File "/packages/training_platform/worker-inplace#link-tree/torch/_dynamo/eval_frame.py", line 417, in _fn
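
The effect in the trace above can be reproduced in miniature without PyTorch. The sketch below is a simplified stand-in, not the actual code in torch/utils/_contextlib.py: `as_decorator`, `NoGradLike`, `AutocastLike`, and `failing_step` are hypothetical names made up for illustration. It assumes the wrapper is a single shared inner function named `decorate_context`, which is what the frame name in the log suggests.

```python
import functools
import traceback


class NoGradLike:
    """Hypothetical context manager standing in for something like no_grad."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # do not swallow exceptions


class AutocastLike(NoGradLike):
    """A second, unrelated hypothetical context manager."""


def as_decorator(ctx_factory, func):
    """Hypothetical helper: wrap func so it always runs inside ctx_factory()."""

    @functools.wraps(func)  # copies __name__ etc., but not the code object
    def decorate_context(*args, **kwargs):
        with ctx_factory():
            return func(*args, **kwargs)

    return decorate_context


def failing_step():
    raise ValueError("boom")


def frame_names(fn):
    """Call fn and return the function names recorded in its traceback."""
    try:
        fn()
    except ValueError as exc:
        return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
    return []


step_a = as_decorator(NoGradLike, failing_step)
step_b = as_decorator(AutocastLike, failing_step)

names_a = frame_names(step_a)
names_b = frame_names(step_b)
print(names_a)
print(names_b)
```

Both lists are identical and contain `decorate_context`, with no mention of either context manager class. `functools.wraps` does set `step_a.__name__` to `failing_step`, but tracebacks take the frame name from the wrapper's code object, so the generic name is what gets printed.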

Versions

main

cc @msaroufim @bdhirsh @anijain2305 @zou3519

BoyuanFeng commented 1 week ago

Hi @ezyang, we are scrubbing old issues. Is this one still applicable?

ezyang commented 1 week ago

Yes, still applicable.