Open fdas3213 opened 1 month ago
Can you double-check that the CUDA version of PyTorch matches the one on your machine? There might be some incompatibilities that only surface when `F.conv2d` is called with half-precision inputs (IIRC HuggingFace uses float32 by default).
@DarkLight1337 thanks for checking. The CUDA version of PyTorch is 11.8, which matches `nvcc --version`:

```
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
```
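To rule out a mismatch quickly, a small helper like the one below can pull the toolkit release out of the `nvcc --version` output so it can be compared against `torch.version.cuda`. This is just a sketch using the output pasted above; the `torch` comparison is left as a comment since it assumes PyTorch is importable in the same environment.

```python
import re

def nvcc_release(version_output: str) -> str:
    """Extract the CUDA release (e.g. '11.8') from `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", version_output)
    if match is None:
        raise ValueError("could not find a release number in nvcc output")
    return match.group(1)

# Sample taken from the nvcc output pasted in this thread.
sample = """\
nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 11.8, V11.8.89
"""

print(nvcc_release(sample))  # → 11.8

# To compare against the PyTorch build (requires torch installed):
# import torch
# assert torch.version.cuda.startswith(nvcc_release(sample))
```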
Can you run the model in normal precision?
@DarkLight1337 could you provide an example for how to run the model in normal precision? I just loaded the model and ran inference using the default setup, not sure where to specify precision
You can use the `--dtype` argument as described here.
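For example, when serving the model from the command line (a sketch; `<model>` is a placeholder for your llava-v1.6 checkpoint, and the exact entrypoint depends on your vLLM version). In the Python API, the same values are accepted via `LLM(model=..., dtype="float32")`.

```shell
# Hypothetical invocation: serve the model in full precision.
# <model> is a placeholder for your llava-v1.6 checkpoint path.
python -m vllm.entrypoints.openai.api_server --model <model> --dtype float32
```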
Thanks @DarkLight1337, and apologies for the late response. Specifying `float` or `float32` gives another error.
Could you elaborate?
Apologies for missing the error log. It is a kernel crash:

```
21:10:02.243 [error] Disposing session as kernel process died ExitCode: undefined, Reason: 2024-08-01 21:06:40.427338: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
```
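Since the Jupyter kernel dies without a Python traceback, one way to get more information is to run the same code as a plain script with the standard-library `faulthandler` enabled, which dumps Python tracebacks to stderr when the interpreter crashes at the native level. A minimal sketch:

```python
import faulthandler

# Dump Python tracebacks to stderr on a hard crash (e.g. a segfault
# inside a native CUDA call), instead of the kernel silently dying.
faulthandler.enable()

print(faulthandler.is_enabled())  # → True

# ... run the inference code here, outside the notebook ...
```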
@youkaichao any thoughts?
Your current environment
🐛 Describe the bug
I was trying to run a basic multi-modal inference with llava-v1.6 using the code below.
However, I am hitting the error below.
I had no issues when running inference through HuggingFace.