Closed LHW-CLOUD closed 3 months ago
Hello,
Could you please share a small reproducible example so I can diagnose the issue? Quite often, users mistakenly include compilation time when benchmarking their code, but there are indeed edge cases where Triton is slower than PyTorch.
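The compilation-time pitfall mentioned above can be illustrated without Triton at all. The sketch below is a hypothetical stand-in: a function that pays a one-time "compile" cost on its first call and hits a cache afterwards, which is roughly how a JIT-compiled Triton kernel behaves on first launch. The names (`kernel`, `_compiled`) are illustrative, not real Triton APIs.

```python
import time

# Hypothetical stand-in for a JIT-compiled kernel: the first call pays a
# one-time "compilation" cost; subsequent calls hit a cache.
_compiled = {}

def kernel(x):
    if "k" not in _compiled:      # simulate compilation on the first call
        time.sleep(0.2)           # one-time compile cost
        _compiled["k"] = True
    return x * 2                  # the actual (cheap) computation

# Naive timing accidentally includes the compile cost of the first call:
t0 = time.perf_counter()
kernel(1.0)
naive = time.perf_counter() - t0

# Correct benchmarking: warm up first, then time steady-state calls only.
kernel(1.0)                       # warmup (compilation is already cached here)
t0 = time.perf_counter()
for _ in range(1000):
    kernel(1.0)
steady = (time.perf_counter() - t0) / 1000

print(naive > steady * 100)       # the compile cost dominates the first call
```

A benchmark that skips the warmup step will report the `naive` number and conclude the kernel is slow, even though the steady-state cost is tiny.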
Thanks a lot! It turns out my earlier issue was caused by a logical error in my code. But I have a new question: I want to save the Triton IR. I found that the Triton compiler saves all intermediate files under /home/.triton/dump by default, and each folder seems to correspond to the intermediate representation of one operator (I'm not sure if my understanding is correct). Is there a way to merge all the IR files into a single IR file and run the model's inference from that complete IR file? Thank you!
I am unfortunately ill-equipped to answer your question as I have not worked with Triton IR in the past. I suggest you open up an issue on the Triton repository since this issue does not directly concern attorch.
Thank you!
Why does the model's inference become very slow after converting its operators to the Triton form? I don't know whether it's due to caching.