intel / torch-xpu-ops

Apache License 2.0

[BF16] For the LayoutLMForSequenceClassification model on stock PyTorch, index_select takes longer on PVC-1100 than the A100 time * ratio #816

Open xiaowangintel opened 2 months ago

xiaowangintel commented 2 months ago

🐛 Describe the bug

For more details, please refer to https://jira.devtools.intel.com/browse/PYTORCHDGQ-5080.

Versions

pytorch commit: 03480213dea1f60f6d12e7348904d2f3ef7314d0
torch-xpu-ops commit: 718bc42c667539977e5eadb11ea4dec602544bf2
driver: hotfix_agama-ci-devel-881.19
pti: l_intel-pti-dev_p_0.9.0.38_offline.sh
basekit: l_BaseKit_p_2024.2.1.100_offline.sh

majing921201 commented 2 months ago

Fixed in https://github.com/intel/torch-xpu-ops/pull/924

fengyuan14 commented 1 month ago

Hi @xiaowangintel, please retest the case.