Open sadrafh opened 4 months ago
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
The model to consider.
The model I have in mind is FalconLite2: https://huggingface.co/amazon/FalconLite2. I am not sure how to use vLLM and FalconLite2 together for benchmarking purposes.
The closest model vllm already supports.
Falcon-7B
What's your difficulty of supporting the model you want?
I want to benchmark FalconLite2 with vLLM and collect latency and throughput results. However, FalconLite2 is not supported by vLLM.
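
A minimal sketch of the kind of latency and throughput measurement described above. It assumes a Falcon checkpoint that vLLM already supports; `tiiuae/falcon-7b` is used only as a stand-in and is not FalconLite2. The helper names `benchmark` and `make_vllm_generate` are hypothetical, and the prompt set and `max_tokens` value are placeholders.

```python
import time
from typing import Callable, Sequence


def benchmark(generate: Callable[[Sequence[str]], Sequence[int]],
              prompts: Sequence[str]) -> dict:
    """Time one batched generate call.

    `generate` must return the number of generated tokens for each prompt.
    """
    start = time.perf_counter()
    token_counts = generate(prompts)
    elapsed = time.perf_counter() - start
    total_tokens = sum(token_counts)
    return {
        "latency_s": elapsed,  # wall-clock time for the whole batch
        "requests_per_s": len(prompts) / elapsed if elapsed > 0 else float("inf"),
        "tokens_per_s": total_tokens / elapsed if elapsed > 0 else float("inf"),
        "total_tokens": total_tokens,
    }


def make_vllm_generate(model_name: str = "tiiuae/falcon-7b", max_tokens: int = 128):
    """Build a generate callable backed by vLLM (stand-in model, an assumption)."""
    # Imported lazily so the timing helper can be used without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_name)
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)

    def generate(prompts: Sequence[str]) -> list:
        outputs = llm.generate(list(prompts), params)
        # Count generated tokens of the first completion for each prompt.
        return [len(o.outputs[0].token_ids) for o in outputs]

    return generate
```

With vLLM installed and a GPU available, `benchmark(make_vllm_generate(), ["Hello, my name is"] * 32)` would time a single batch of 32 identical prompts. A first warm-up call is usually worth discarding so that model loading and cache setup do not distort the numbers.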