Open abhishek-rn opened 1 year ago
Hi @abhishek-rn
Thanks for the report. This transition from 23.05 to 23.06 marks the move from PyTorch 1.x to 2.x, so it looks like we may have lost some functionality at this stage.
Would you be able to confirm whether the same behaviour is present with the pip-installed PyTorch packages for 1.3 and 2.0 on AArch64, and also on x86?
Hi @nSircombe, the Docker tag reads r23.05-torch-2.0.0-onednn-acl, so I assumed that meant torch 2.0.0. However, I ran the pip-installed PyTorch 2.0.0 and 1.13; please find the logs below: ARM_PyT_1.13_Bert_Verbose.txt ARM_PyT_2.0.0_Bert_Verbose.txt
The results there show that PyT 1.13 makes no ACL calls, but PyT 2.0.0 does.
x86_PyT_1.13_Bert_Verbose.txt x86_PyT_2.0.0_Bert_Verbose.txt
Also, x86 PyTorch does not make oneDNN calls for matmuls, as seen in the logs above.
Yes, you're right, the version is 2.0. The tag is correct and matches the version in the Dockerfile. The mistake is in the README for the 23.05 increment here, which still has 1.3.
Hi,
Docker Tags: r23.09-torch-2.0.0-onednn-acl r23.05-torch-2.0.0-onednn-acl
I am unable to get ACL calls in Docker versions higher than 23.05 for PyTorch Hugging Face models.
I am attaching the oneDNN verbose logs for the BERT model here: 23.05_Bert_Verbose.txt 23.09_Bert_Verbose.txt
The code to reproduce this is attached below: PyT_Bert_Training.txt --> Use this for the first run to generate the necessary inference checkpoints and files. PyT_Bert_Inf.txt --> Use this for subsequent runs to generate the oneDNN logs.
As a result, the oneDNN verbose log from the later image shows gemm:jit calls for matmuls instead of gemm:acl calls, and inference performance is correspondingly poorer.
Thanks
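For anyone comparing the attached logs, a quick way to see whether matmuls are dispatched to gemm:acl or gemm:jit is to tally the implementation field of each matmul line. The sketch below is only an illustration, not part of the original reproduction scripts. It assumes the usual comma-separated oneDNN verbose layout, in which the engine, primitive name and implementation follow the `exec` field, and it assumes the logs were captured with `ONEDNN_VERBOSE=1` set. The function name `count_matmul_impls` and the sample lines are hypothetical.

```python
from collections import Counter


def count_matmul_impls(lines):
    """Tally implementation names (e.g. gemm:acl, gemm:jit) of matmul primitives.

    Assumes oneDNN verbose lines look roughly like
    onednn_verbose,exec,cpu,matmul,gemm:acl,undef,...
    (older builds use the dnnl_verbose prefix).
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith(("onednn_verbose", "dnnl_verbose")):
            continue  # skip ordinary program output
        fields = line.split(",")
        if "exec" not in fields:
            continue  # skip info/create lines
        i = fields.index("exec")
        if len(fields) <= i + 3:
            continue  # line too short to carry an implementation name
        primitive, impl = fields[i + 2], fields[i + 3]
        if primitive == "matmul":
            counts[impl] += 1
    return counts


# Hypothetical sample lines, not taken from the attached logs.
sample = [
    "onednn_verbose,exec,cpu,matmul,gemm:acl,undef,src_f32 wei_f32 dst_f32,,,1x128x768:1x768x768,0.5",
    "onednn_verbose,exec,cpu,matmul,gemm:jit,undef,src_f32 wei_f32 dst_f32,,,1x128x768:1x768x768,1.9",
    "dnnl_verbose,exec,cpu,matmul,gemm:jit,undef,src_f32 wei_f32 dst_f32,,,1x128x768:1x768x768,2.1",
    "onednn_verbose,exec,cpu,inner_product,gemm:jit,forward_inference,src_f32 wei_f32 dst_f32,,,mb128ic768oc768,0.7",
    "some unrelated stdout line",
]
sample_counts = count_matmul_impls(sample)
print(dict(sample_counts))

# To apply it to one of the attached logs:
# with open("23.09_Bert_Verbose.txt") as f:
#     print(count_matmul_impls(f))
```

Running it on the 23.05 and 23.09 logs side by side should make the difference in matmul dispatch easy to spot, provided the log format matches the assumption above.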