vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

Using FalconLite2 for vLLM benchmarking #7038

Open sadrafh opened 4 months ago

sadrafh commented 4 months ago

The model to consider.

The model I have in mind is FalconLite2: https://huggingface.co/amazon/FalconLite2. I am not sure how to use vLLM and FalconLite2 together for benchmarking purposes.
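For context, vLLM's usual path for serving a Hugging Face checkpoint would be an invocation like the one below. This is only a sketch: FalconLite2 ships custom modeling code, so `--trust-remote-code` would be needed, and whether vLLM accepts the architecture at all is exactly what this issue is asking about.

```shell
# Hypothetical invocation, not a confirmed working command.
# FalconLite2 uses custom modeling code, so --trust-remote-code is required;
# vLLM may still reject the architecture if it is unsupported.
python -m vllm.entrypoints.openai.api_server \
    --model amazon/FalconLite2 \
    --trust-remote-code
```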

The closest model vllm already supports.

Falcon-7B (tiiuae/falcon-7b)

What's your difficulty of supporting the model you want?

I want to benchmark FalconLite2 with vLLM and collect latency and throughput numbers. However, FalconLite2 is not supported by vLLM.
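Once the model loads (in vLLM or any other engine), the metric aggregation side of the benchmark is independent of the model. A minimal sketch, assuming you have recorded a wall-clock latency and a generated-token count per request (the `summarize` function and the sequential-run assumption are mine, not part of vLLM):

```python
import statistics

def summarize(latencies_s, tokens_per_request):
    """Aggregate per-request latencies (seconds) and generated-token
    counts into throughput and latency-percentile figures.

    Assumes requests ran sequentially, so total time is the sum of
    latencies; for a concurrent run you would time the whole batch
    instead.
    """
    total_tokens = sum(tokens_per_request)
    total_time = sum(latencies_s)
    ordered = sorted(latencies_s)
    # Index-based p99: crude but dependency-free for a small sample.
    p99_idx = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return {
        "throughput_tok_s": total_tokens / total_time,
        "p50_latency_s": statistics.median(ordered),
        "p99_latency_s": ordered[p99_idx],
    }

# Example with made-up timings from four requests:
stats = summarize([1.0, 2.0, 3.0, 4.0], [100, 100, 100, 100])
print(stats)  # throughput 40.0 tok/s, p50 2.5 s, p99 4.0 s
```

The per-request inputs would come from timing each `LLM.generate` call (or each HTTP request against the server) and counting the tokens in the returned completions.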

github-actions[bot] commented 4 weeks ago

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!