fan-niu opened 2 months ago
In general, the model has to be implemented inside vLLM's framework first before it can be used in vLLM. This is the trade-off between generalizability (i.e. directly using Transformers) and performance.
@DarkLight1337 Thanks for your reply. Currently vLLM does not support `AutoModelForSequenceClassification`, right? If so, are there any plans to support this feature?
`AutoModelForSequenceClassification` isn't a concrete model. I take it that you are asking for vLLM to support arbitrary models that inherit from it. This is rather difficult, since we would need some way to automatically translate Transformers models into vLLM's framework. I wouldn't expect this to come any time soon.
@DarkLight1337 In fact, what I want to use is `LlamaForSequenceClassification`; my model was trained on top of llama3.1-8b. Is `LlamaForSequenceClassification` currently supported? Looking forward to your reply, thank you.
You can refer to the framework proposed by https://github.com/vllm-project/vllm/pull/6260 to implement your own one. It's not supported otherwise.
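For context on what such an implementation involves: in Transformers, `LlamaForSequenceClassification` only adds a small linear `score` head on top of the base decoder and classifies from the hidden state of the final (non-padding) token, so a vLLM port mainly needs that head and a matching pooling step on top of the existing Llama model. A minimal PyTorch sketch of the head alone (the sizes here are made up for illustration; the real values come from the HF config's `hidden_size` and `num_labels`, and the decoder is stubbed with random hidden states):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes only; real values come from the model's HF config.
hidden_size, num_labels, seq_len, batch = 16, 2, 8, 1

# The classification head: a bias-free linear layer, as in HF's
# LlamaForSequenceClassification (where it is named `score`).
score = nn.Linear(hidden_size, num_labels, bias=False)

# Stand-in for the base decoder's output hidden states.
hidden_states = torch.randn(batch, seq_len, hidden_size)

# Classify from the last token's hidden state.
logits = score(hidden_states[:, -1, :])
print(tuple(logits.shape))  # (1, 2)
```

The point of the sketch is that the missing piece is small; the hard part the PR addresses is wiring a non-generative output path through vLLM's engine.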
@DarkLight1337 OK, thanks very much. I've already contacted the author of https://github.com/vllm-project/vllm/pull/6260.
How would you like to use vllm
I want to run inference of an `AutoModelForSequenceClassification` model. I don't know how to integrate it with vLLM.