Closed probicheaux closed 3 weeks ago
I tried to build this image from this branch locally (docker/dockerfiles/Dockerfile.paligemma -t "roboflow-paligemma" .)
and the build failed with the error below:
9.563 ERROR: Could not find a version that satisfies the requirement onnxruntime-gpu<=1.15.1 (from versions: none)
9.563 ERROR: No matching distribution found for onnxruntime-gpu<=1.15.1
@grzegorz-roboflow by "locally", do you mean on an M1/M2 Mac?
ARM Macs aren't supported by onnxruntime-gpu. You can't build docker/dockerfiles/Dockerfile.onnx.gpu either.
You need to add --platform linux/amd64 to your docker build command.
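For reference, here is a rough sketch of that suggestion as a small helper that adds the flag only on ARM hosts. The helper name `docker_build_command` is hypothetical. The `-f` flag is an assumption about how the Dockerfile path was passed, since the command quoted above is incomplete.

```python
import platform
import shlex


def docker_build_command(dockerfile, tag, machine=None):
    """Return a `docker build` argv, forcing linux/amd64 on ARM hosts.

    `machine` defaults to platform.machine(); pass it explicitly to
    preview the command for another host.
    """
    machine = (machine or platform.machine()).lower()
    cmd = ["docker", "build"]
    # Per the thread, onnxruntime-gpu cannot be installed on ARM,
    # so request an amd64 image on arm64/aarch64 hosts.
    if machine in ("arm64", "aarch64"):
        cmd += ["--platform", "linux/amd64"]
    # Assumption: the Dockerfile path is passed with -f.
    cmd += ["-f", dockerfile, "-t", tag, "."]
    return cmd


if __name__ == "__main__":
    argv = docker_build_command(
        "docker/dockerfiles/Dockerfile.paligemma", "roboflow-paligemma"
    )
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(argv))
```

On an Apple Silicon machine this prints the build command with `--platform linux/amd64` included. On an x86_64 host the flag is left out.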
Description
There was a weird bug (at least on my machine) where multiple subsequent calls to PaliGemma would degrade performance. I tracked it down to a mismatch between the PyTorch-installed cuDNN and the system one. PyTorch says "oh, don't even bother having your own cuDNN", but we need it for the ONNX stuff, so we uninstall the PyTorch-installed cuDNN.
I think the error came from the flash attention implementation or some improperly initialized tensor or something.
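To check whether an environment has the same setup, a sketch like the following lists pip-installed cuDNN packages and reports the cuDNN version PyTorch sees. It is not the change made in this PR. The `nvidia-cudnn` name prefix is an assumption based on how PyTorch's CUDA wheels usually pull in cuDNN (for example `nvidia-cudnn-cu12`), and both helper names are hypothetical.

```python
from importlib import metadata


def find_pip_cudnn_packages(prefix="nvidia-cudnn"):
    """Return sorted (name, version) pairs for installed distributions
    whose name starts with `prefix` (an assumed naming convention)."""
    found = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name") or ""
        if name.lower().startswith(prefix):
            found.append((name, dist.version))
    return sorted(found)


def torch_cudnn_version():
    """Return the cuDNN version PyTorch reports, or None if PyTorch
    is not installed or cuDNN is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    return torch.backends.cudnn.version()


if __name__ == "__main__":
    print("cuDNN version seen by PyTorch:", torch_cudnn_version())
    for name, version in find_pip_cudnn_packages():
        print(f"pip-installed cuDNN: {name}=={version}")
        # Only a suggestion; review before running it yourself.
        print(f"  to remove: python -m pip uninstall -y {name}")
```

The script only prints the uninstall command. Removing the pip package only makes sense if a system cuDNN is actually present.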
Type of change
Please delete options that are not relevant.
How has this change been tested, please provide a testcase or example of how you tested the change?
Locally
Any specific deployment considerations
May depend on the system CUDA installation.
Docs