vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Misc]: Can we remove `vllm/entrypoints/api_server.py`? #3852

Open · hmellor opened this issue 7 months ago

hmellor commented 7 months ago

Anything you want to discuss about vllm.

While gardening GitHub issues I've seen many issues where the user is using -m vllm.entrypoints.api_server, which indicates that users are either not aware of, or are ignoring, the note at the top of the file:

https://github.com/vllm-project/vllm/blob/b7782002e1da25de77e0b1890ff8b72dd4df917c/vllm/entrypoints/api_server.py#L1-L7

Can we remove it to avoid future confusion?
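
For reference, here is roughly the contrast in question (a sketch only; exact flags and payload fields may differ between vLLM versions):

```python
import requests

# Demo server (vllm/entrypoints/api_server.py), started with e.g.
#   python -m vllm.entrypoints.api_server --model <model>
# It exposes a bare /generate endpoint and is NOT OpenAI-compatible.
demo = requests.post(
    "http://localhost:8000/generate",
    json={"prompt": "San Francisco is a", "max_tokens": 16},
)
print(demo.json())  # {"text": ["San Francisco is a ..."]}

# Recommended server (vllm/entrypoints/openai/api_server.py), started with e.g.
#   python -m vllm.entrypoints.openai.api_server --model <model>
# It follows the OpenAI API, so standard OpenAI clients work against /v1/*.
completion = requests.post(
    "http://localhost:8000/v1/completions",
    json={"model": "<model>", "prompt": "San Francisco is a", "max_tokens": 16},
)
print(completion.json())
```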

FlorianJoncour commented 7 months ago

It's probably a good idea.

However, rather than deleting it outright, it would probably be better to move it into a subfolder like vllm_example. On the other hand, I don't think it would be desirable for the OpenAI server to become the only implementation.

It's possible that, in the future, it could be useful to create server implementations of the Mistral AI or Anthropic APIs.

hmellor commented 7 months ago

We can leave the OpenAI server where it is so that, if other proprietary APIs are added in the future, they can live adjacent to the openai directory.

DarkLight1337 commented 5 months ago

There are outstanding issues which suggest that there are some bugs in that file.

Yet #2858 clearly states that the file is not supposed to receive further updates. @simon-mo what are the reasons behind this? If we want this file to serve as an example, IMO we should at least make sure that it is bug-free.

simon-mo commented 5 months ago

We can fix bugs. We just shouldn't add more features in an attempt to keep it at parity with the OpenAI-compatible server.

hmellor commented 5 months ago

Does vllm/entrypoints/api_server.py have any OpenAI compatibility?

DarkLight1337 commented 5 months ago

We can fix bugs. We just shouldn't add more features in an attempt to keep it at parity with the OpenAI-compatible server.

Perhaps we should move it to the examples directory, as @FlorianJoncour has suggested. Then we can add tests to ensure that the example keeps working, without introducing new features.

TikZSZ commented 5 months ago

Maybe this could be kept for supporting beta/experimental features, or anything that isn't covered by the OpenAI API spec.

For example, this was the only way I found to send token IDs directly to the model instead of chat messages, which is helpful when people want to experiment with prompts without limitations.

Mistral v3 function calling, which requires custom tokenization that isn't covered by the chat template, is one such example, and I think it could also be useful for debugging other OSS models that ship their own prompt templates, which may not always be optimal for whatever reason.

Note: I might be missing docs about sending token IDs directly via the OpenAI API spec, but I couldn't find anything about it.
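
For illustration, something along these lines is what I mean by sending token IDs directly (a sketch against the offline LLM API; the exact argument, prompt_token_ids vs. a TokensPrompt object, depends on the vLLM version):

```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_id = "mistralai/Mistral-7B-Instruct-v0.3"

# Build token IDs with whatever custom tokenization the model needs;
# the plain tokenizer here stands in for e.g. Mistral's function-calling
# format, which the chat template doesn't cover.
tokenizer = AutoTokenizer.from_pretrained(model_id)
token_ids = tokenizer.encode("[INST] What is the weather in Paris? [/INST]")

llm = LLM(model=model_id)
params = SamplingParams(max_tokens=64, temperature=0.0)

# Pre-tokenized input goes straight to the engine, bypassing any chat template.
outputs = llm.generate(prompt_token_ids=[token_ids], sampling_params=params)
print(outputs[0].outputs[0].text)
```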

simon-mo commented 5 months ago

Maybe this could be kept for supporting beta/experimental features, or anything that isn't covered by the OpenAI API spec.

This is a valid point.

Perhaps we should move it to the examples directory

We kept the file there for backward compatibility as some people are still using it.

DarkLight1337 commented 5 months ago

We kept the file there for backward compatibility as some people are still using it.

How long are we planning to maintain this? I think #4197 / #5090 would provide a good opportunity to drop it. We can also redirect imports/main to the new location of the file.
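
One way to do that redirect would be a thin deprecation shim left at the old path (purely a sketch; the target module vllm.examples.api_server and its main() entry point are hypothetical):

```python
# vllm/entrypoints/api_server.py (hypothetical shim after the move)
import warnings

warnings.warn(
    "vllm.entrypoints.api_server has moved to the examples directory; "
    "use the OpenAI-compatible server for production deployments.",
    DeprecationWarning,
    stacklevel=2,
)

# Re-export from the new (hypothetical) location so that existing imports
# and `python -m vllm.entrypoints.api_server` keep working.
from vllm.examples.api_server import *  # noqa: F401,F403
from vllm.examples.api_server import main

if __name__ == "__main__":
    main()
```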

DarkLight1337 commented 5 months ago

Maybe this could be kept for supporting beta/experimental features, or anything that isn't covered by the OpenAI API spec.

Correct me if I'm wrong, but it looks like we can't use this file directly to test out those features. If we have to copy this file and modify it elsewhere anyway, I don't see the harm in moving it to the examples directory and thus adding it to the official documentation.
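
To make the testing idea concrete, here is the rough shape of a smoke test that could accompany the example (model, port, and timings are placeholders):

```python
import subprocess
import time

import requests


def test_example_api_server_generates_text():
    # Launch the example server on a small placeholder model.
    proc = subprocess.Popen([
        "python", "-m", "vllm.entrypoints.api_server",
        "--model", "facebook/opt-125m", "--port", "8001",
    ])
    try:
        # Crude readiness loop: retry until the server answers.
        for _ in range(60):
            try:
                resp = requests.post(
                    "http://localhost:8001/generate",
                    json={"prompt": "Hello, my name is", "max_tokens": 8},
                    timeout=5,
                )
                break
            except requests.ConnectionError:
                time.sleep(2)
        else:
            raise RuntimeError("server did not come up in time")

        assert resp.status_code == 200
        assert resp.json()["text"]
    finally:
        proc.terminate()
        proc.wait()
```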

github-actions[bot] commented 1 week ago

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!