vespa-engine / vespa

AI + Data, online. https://vespa.ai
Apache License 2.0

How to use AMD ROCM to improve embedding speed? #29754

Open 3CE8D2BAC65BDD6AA9 opened 9 months ago

3CE8D2BAC65BDD6AA9 commented 9 months ago

I have been trying the code at:

https://github.com/vespa-engine/sample-apps/tree/master/multi-vector-indexing
https://blog.vespa.ai/build-news-recommendation-app-from-python-with-vespa/

Everything works fine. Thank you! When I run the code, I can see that the CPU is almost 100% utilized but the GPU is idle. However, I have an AMD 7900 XTX with ROCm installed. How can I use the GPU instead of the CPU to calculate the embeddings?

Thanks in advance.

jobergum commented 9 months ago

Vespa currently only supports CPU and CUDA GPUs, so I'm afraid it can't use your AMD GPU via ROCm.
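For comparison, on a machine with an NVIDIA GPU the embedder can be directed to the GPU in services.xml. A minimal sketch, assuming Vespa's hugging-face-embedder and its `onnx-gpu-device` option (the component id and model paths here are placeholders):

```xml
<!-- Sketch of a services.xml embedder component; assumes the
     hugging-face-embedder type and the onnx-gpu-device element. -->
<container id="default" version="1.0">
  <component id="my-embedder" type="hugging-face-embedder">
    <transformer-model path="models/model.onnx"/>
    <tokenizer-model path="models/tokenizer.json"/>
    <!-- Run ONNX inference on CUDA GPU device 0 (NVIDIA only today) -->
    <onnx-gpu-device>0</onnx-gpu-device>
  </component>
</container>
```

With an AMD GPU there is currently no equivalent setting, since the underlying runtime build does not include a ROCm execution provider.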

Vespa uses onnxruntime for inference, so it's possible that we could add more execution providers, such as ROCm.
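To illustrate what that would mean at the onnxruntime level: `ROCMExecutionProvider` is onnxruntime's provider name for AMD GPUs, and a session falls back through its provider list until one loads. A hedged sketch (whether a Vespa build ships with the ROCm provider is exactly the open question in this issue; the `pick_providers` helper is hypothetical):

```python
# Sketch of selecting an onnxruntime execution provider, preferring a GPU.
# Assumption: an onnxruntime build compiled with ROCm support would expose
# "ROCMExecutionProvider" in get_available_providers().
try:
    import onnxruntime as ort
    available = ort.get_available_providers()
except ImportError:  # onnxruntime not installed in this environment
    available = ["CPUExecutionProvider"]

def pick_providers(available):
    """Prefer a GPU provider when present, always falling back to CPU."""
    preferred = ["ROCMExecutionProvider", "CUDAExecutionProvider"]
    chosen = [p for p in preferred if p in available]
    chosen.append("CPUExecutionProvider")
    return chosen

providers = pick_providers(available)
# A session created with this list uses the first provider that loads:
# session = ort.InferenceSession("model.onnx", providers=providers)
```

The fallback-list design is why adding a ROCm provider would be backwards compatible: machines without an AMD GPU would simply continue on CPU.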