intel / torch-xpu-ops

Apache License 2.0

[BF16] For the LayoutLMForSequenceClassification model on stock PyTorch, div time on PVC-1100 is worse than A100 * ratio #803

Open xiaowangintel opened 3 months ago

xiaowangintel commented 3 months ago

🐛 Describe the bug

For more details, please refer to https://jira.devtools.intel.com/browse/PYTORCHDGQ-5072?filter=-2.

Versions

pytorch commit: 03480213dea1f60f6d12e7348904d2f3ef7314d0
torch-xpu-ops commit: 718bc42c667539977e5eadb11ea4dec602544bf2
driver: hotfix_agama-ci-devel-881.19
pti: l_intel-pti-dev_p_0.9.0.38_offline.sh
basekit: l_BaseKit_p_2024.2.1.100_offline.sh

retonym commented 1 week ago

XPU performance is not targeted for PT 2.6.