Open fdas3213 opened 4 months ago
Can you double-check that the CUDA version of PyTorch matches that on your machine? There might be some incompatibilities that only surface when `F.conv2d` is called with half-precision inputs (IIRC HuggingFace uses float32 by default).
@DarkLight1337 thanks for checking. The CUDA version of PyTorch is 11.8, which matches `nvcc --version`:
```
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
```
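The compatibility check being discussed can be sketched programmatically; a minimal sketch, assuming we only need the major.minor CUDA versions to agree (the helper name and hard-coded version strings are illustrative — in practice the inputs would come from `torch.version.cuda` and the "release X.Y" field of `nvcc --version`):

```python
def cuda_versions_match(torch_cuda: str, nvcc_release: str) -> bool:
    """Return True when the major.minor CUDA versions agree."""
    return torch_cuda.split(".")[:2] == nvcc_release.split(".")[:2]

# Hard-coded versions matching the output above, for illustration.
print(cuda_versions_match("11.8", "11.8.89"))  # True: 11.8 == 11.8
print(cuda_versions_match("12.1", "11.8.89"))  # False: major.minor differ
```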
Can you run the model in normal precision?
@DarkLight1337 could you provide an example of how to run the model in normal precision? I just loaded the model and ran inference using the default setup; I'm not sure where to specify the precision.
You can use the `--dtype` argument as described here.
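For instance, full precision can be requested at model load time; a rough sketch (the model name is illustrative — substitute the one from the bug report):

```shell
# Offline inference: pass dtype to the LLM constructor
python -c "from vllm import LLM; LLM(model='llava-hf/llava-v1.6-mistral-7b-hf', dtype='float32')"

# OpenAI-compatible server: the equivalent command-line flag
python -m vllm.entrypoints.openai.api_server \
    --model llava-hf/llava-v1.6-mistral-7b-hf \
    --dtype float32
```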
Thanks @DarkLight1337, and apologies for the late response. Specifying `float` or `float32` gives another error.
Could you elaborate?
Apologies for missing the error log. It is a kernel crash:
```
21:10:02.243 [error] Disposing session as kernel process died ExitCode: undefined, Reason: 2024-08-01 21:06:40.427338: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
```
@youkaichao any thoughts?
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
Your current environment
🐛 Describe the bug
I was trying to run basic multi-modal inference with llava-v1.6 using the code below.
However, I am hitting the error below.
I had no issues when running inference using HuggingFace.
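The original repro snippet is not reproduced in this thread; for context, a minimal multi-modal call in vLLM looks roughly like the sketch below (the model name, prompt template, and image path are all illustrative assumptions, and the `multi_modal_data` shape follows vLLM's multi-modal examples — not the author's actual code):

```python
from PIL import Image
from vllm import LLM, SamplingParams

# Illustrative model and inputs; the actual repro code was not preserved.
llm = LLM(model="llava-hf/llava-v1.6-mistral-7b-hf")
image = Image.open("example.jpg")

outputs = llm.generate(
    {
        "prompt": "USER: <image>\nWhat is shown in this image? ASSISTANT:",
        "multi_modal_data": {"image": image},
    },
    SamplingParams(temperature=0.0, max_tokens=64),
)
print(outputs[0].outputs[0].text)
```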