vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Usage]: How to use AutoModelForSequenceClassification correctly #8459

Open fan-niu opened 2 months ago

fan-niu commented 2 months ago

Your current environment

nvidia A100 GPU
vllm 0.6.0

How would you like to use vllm

I want to run inference with an `AutoModelForSequenceClassification` model, but I don't know how to integrate it with vLLM.
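For context, inference with a sequence-classification model is a single forward pass that returns one logit per label, followed by a softmax and argmax, rather than the token-by-token decode loop vLLM is built around. A minimal sketch of that post-processing step, with hypothetical logits and label names:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["negative", "positive"]  # hypothetical label set
logits = [-1.2, 2.3]               # hypothetical model output for one sequence

probs = softmax(logits)
prediction = labels[probs.index(max(probs))]  # -> "positive"
```

The model-specific part is producing `logits`; everything after that is framework-independent.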


DarkLight1337 commented 2 months ago

In general, the model has to be implemented inside vLLM's framework first before it can be used in vLLM. This is the trade-off between generalizability (i.e. directly using Transformers) and performance.

fan-niu commented 2 months ago

> In general, the model has to be implemented inside vLLM's framework first before it can be used in vLLM. This is the trade-off between generalizability (i.e. directly using Transformers) and performance.

@DarkLight1337 Thanks for your reply. So vLLM currently does not support AutoModelForSequenceClassification, right? If so, are there any plans to support this feature?

DarkLight1337 commented 2 months ago

AutoModelForSequenceClassification isn't a concrete model. I take it that you are asking for vLLM to support arbitrary models that inherit from it. This is rather difficult, since we would need some way to automatically translate Transformers models into the vLLM framework. I wouldn't expect this to come any time soon.

fan-niu commented 2 months ago

> AutoModelForSequenceClassification isn't a concrete model. I take it that you are asking for vLLM to support arbitrary models that inherit from it. This is rather difficult since we need some way to automatically translate Transformers models into vLLM framework. I wouldn't expect this to come any time soon.

@DarkLight1337 In fact, what I want to use is LlamaForSequenceClassification; my model is fine-tuned from Llama 3.1 8B. Is LlamaForSequenceClassification currently supported? Looking forward to your reply, thank you!

DarkLight1337 commented 2 months ago

You can refer to the framework proposed in https://github.com/vllm-project/vllm/pull/6260 to implement your own. It's not supported otherwise.
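For reference, the main piece such a port adds on top of the existing LLaMA backbone is the classification head: the Transformers LlamaForSequenceClassification roughly pools the hidden state of the last non-padding token and passes it through a bias-free linear `score` layer. A pure-Python sketch of that logic, with hypothetical toy dimensions:

```python
# Sketch of the head LlamaForSequenceClassification adds over the decoder
# backbone (toy sizes, plain Python so it runs without torch).

def pool_last_token(hidden_states, attention_mask):
    """Return the hidden vector at the last non-padding position."""
    last_idx = max(i for i, m in enumerate(attention_mask) if m == 1)
    return hidden_states[last_idx]

def score(pooled, weight):
    """Bias-free linear head: weight is num_labels x hidden_size,
    output is one logit per label."""
    return [sum(w * h for w, h in zip(row, pooled)) for row in weight]

# Toy example: hidden_size=4, seq_len=3 (last position is padding), num_labels=2.
hidden_states = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.5, 0.5, 0.5],  # last real token
    [0.0, 0.0, 0.0, 0.0],  # padding
]
attention_mask = [1, 1, 0]
weight = [
    [1.0, 0.0, 0.0, 0.0],  # label 0
    [0.0, 1.0, 1.0, 0.0],  # label 1
]

pooled = pool_last_token(hidden_states, attention_mask)
logits = score(pooled, weight)  # -> [0.5, 1.0]
```

A vLLM implementation would reuse the existing LlamaModel for the backbone and only need to wire in this pooling-and-scoring step in place of the language-model head.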

fan-niu commented 2 months ago

> You can refer to the framework proposed by #6260 to implement your own one. It's not supported otherwise.

@DarkLight1337 OK, thanks very much. I've already contacted the author of https://github.com/vllm-project/vllm/pull/6260.