Closed Alxemade closed 4 months ago
Hi, great work! I tried this script with huggingface-transformers, but found that inference is much slower than the LLaVA series. Do you have any relevant speed benchmarks?
You may install flash_attn and try again.
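For reference, here is a minimal sketch of what enabling FlashAttention-2 when loading a model with transformers can look like. The model id is a placeholder (not from this thread), and it assumes `pip install flash-attn` has succeeded on a supported CUDA GPU:

```python
# Hedged sketch: opt into FlashAttention-2 via transformers' from_pretrained kwargs.
# Assumes flash-attn is installed and the model/hardware support it.
load_kwargs = dict(
    torch_dtype="float16",                    # flash-attn requires fp16 or bf16
    attn_implementation="flash_attention_2",  # switch from the default attention backend
)

# "your-model-id" is a placeholder, not a name confirmed by this thread:
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained("your-model-id", **load_kwargs)
print(load_kwargs["attn_implementation"])
```

If flash-attn is missing or incompatible, `from_pretrained` will raise an error rather than silently fall back, so it is easy to confirm which backend is actually in use.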
Closing the issue for now since there is no further discussion. Feel free to reopen it if you have any other questions.