After running the README example with depyf, I see multiple files in the target directory:
├── __compiled_fn_1 AFTER POST GRAD 0.py
├── __compiled_fn_1 Captured Graph 0.py
├── __compiled_fn_1 Forward graph 0.py
├── __compiled_fn_1 kernel 0.py
├── __compiled_fn_1 kernel 1.py
├── __compiled_fn_1 kernel 2.py
├── __compiled_fn_5 AFTER POST GRAD 0.py
├── __compiled_fn_5 Captured Graph 0.py
├── __compiled_fn_5 Forward graph 0.py
├── __compiled_fn_5 kernel 0.py
├── __compiled_fn_5 kernel 1.py
├── full_code_for_toy_example_0.py
├── __transformed_code_0_for_torch_dynamo_resume_in_toy_example_at_9.py
└── __transformed_code_0_for_toy_example.py
Why does torch.compile dump `__compiled_fn_1 kernel 1.py` and `__compiled_fn_1 kernel 2.py` in addition to `__compiled_fn_1 kernel 0.py`, when `__compiled_fn_1 kernel 0.py` already contains those two Triton kernels in string form?
Everything in depyf otherwise works fine.
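
For anyone who wants to check the overlap between the kernel files on their own dump, here is a rough sketch using only the standard library. It measures what fraction of the non-blank lines in each `kernel N.py` file also appear in `kernel 0.py`. The helper names `fraction_of_lines_contained` and `report_overlap` are hypothetical (they are not part of depyf or PyTorch), and the line-by-line comparison is only an approximation: it assumes the kernel source is embedded in `kernel 0.py` with the same line content apart from indentation.

```python
from pathlib import Path


def fraction_of_lines_contained(candidate: str, container: str) -> float:
    """Fraction of non-blank lines in `candidate` that also occur in `container`.

    Lines are compared after stripping leading/trailing whitespace, so a
    kernel embedded inside an indented string literal still matches.
    """
    container_lines = {ln.strip() for ln in container.splitlines() if ln.strip()}
    candidate_lines = [ln.strip() for ln in candidate.splitlines() if ln.strip()]
    if not candidate_lines:
        return 0.0
    hits = sum(1 for ln in candidate_lines if ln in container_lines)
    return hits / len(candidate_lines)


def report_overlap(dump_dir: str, prefix: str = "__compiled_fn_1") -> dict:
    """Compare every '<prefix> kernel N.py' (N > 0) against '<prefix> kernel 0.py'.

    The file naming pattern is taken from the directory listing above;
    it is an assumption that other dumps follow the same pattern.
    """
    root = Path(dump_dir)
    container_name = f"{prefix} kernel 0.py"
    container = (root / container_name).read_text()
    result = {}
    for path in sorted(root.glob(f"{prefix} kernel *.py")):
        if path.name == container_name:
            continue
        result[path.name] = fraction_of_lines_contained(path.read_text(), container)
    return result
```

Calling `report_overlap("<target directory>")` returns a dict mapping each extra kernel file name to its overlap fraction; a value near 1.0 would suggest the file is largely duplicated inside `kernel 0.py`.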