Hello,
First and foremost, congratulations on your excellent work and paper. I've been trying to run Evo locally on T4 GPUs, but I ran into the issue that FlashAttention 2 does not support T4 (Turing) GPUs yet. I have a few questions regarding this:
Do you have any plans to support T4 GPUs in the near future?
Will a single 16 GB T4 GPU be sufficient for inference? If not, could we apply some optimizations (e.g., with DeepSpeed) to the Hugging Face model?
Is there a way to use a FlashAttention 1.x version, or can we disable FlashAttention entirely?
Is it possible to use float16 rather than bfloat16?
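For background on the float16 question: my concern is numeric range, not just hardware support. Here is a quick pure-Python sketch (no torch needed; the helper names are my own) of why casting bfloat16-trained weights or activations to float16 can overflow:

```python
import math
import struct

def to_float16(x: float) -> float:
    """Round-trip a Python float through IEEE 754 half precision (fp16)."""
    try:
        # struct's 'e' format is IEEE 754 binary16; it raises OverflowError
        # for magnitudes beyond fp16's max finite value of 65504.
        return struct.unpack('<e', struct.pack('<e', x))[0]
    except OverflowError:
        return math.inf if x > 0 else -math.inf

def to_bfloat16(x: float) -> float:
    """Truncate a float to bfloat16: float32 with the low 16 mantissa bits dropped.

    bfloat16 keeps float32's 8-bit exponent, so its range (~3.4e38) matches
    float32; only precision is reduced.
    """
    bits = struct.unpack('<I', struct.pack('<f', x))[0]
    return struct.unpack('<f', struct.pack('<I', bits & 0xFFFF0000))[0]

# A magnitude that is unremarkable for bfloat16 tensors:
x = 1.0e5
print(to_bfloat16(x))  # stays finite and close to 1e5
print(to_float16(x))   # overflows to inf: fp16's max finite value is 65504
```

So if the checkpoint was trained in bfloat16, a plain cast to float16 risks infs wherever values exceed 65504, which is why I am asking whether float16 inference is actually viable here.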
Thank you,