hmellor opened 7 months ago
It's probably a good idea.
However, it would probably be better to move it into a subfolder like vllm_example. On the other hand, I don't think it would be desirable for the OpenAI server to become the only implementation.
It's possible that in the future, it could be useful to implement servers for the Mistral AI or Anthropic APIs.
We can leave the openai server where it is so that, if other proprietary APIs are added in the future, they can live adjacent to the openai directory.
There are outstanding issues suggesting that the file has some bugs.
Yet #2858 clearly states that the file is not supposed to be updated further. @simon-mo, what is the reasoning behind this? If we want this file to serve as an example, IMO we should at least make sure that it is bug-free.
We can fix bugs. We just shouldn't add more features to keep it at parity with the OpenAI-compatible server.
Does `vllm/entrypoints/api_server.py` have any OpenAI compatibility?
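For context, that file only exposes a bare `/generate` endpoint whose request and response shapes have nothing to do with the OpenAI spec. A minimal sketch of querying it (assuming a server running locally on port 8000; the field names are taken from the file as it stands and may change between versions):

```python
# A minimal sketch of querying the legacy demo server's /generate endpoint
# (not OpenAI-compatible). Field names reflect vllm/entrypoints/api_server.py
# as of this discussion; check your installed version before relying on them.
import requests

response = requests.post(
    "http://localhost:8000/generate",
    json={
        "prompt": "San Francisco is a",
        # All remaining fields are forwarded to SamplingParams.
        "max_tokens": 32,
        "temperature": 0.0,
    },
)
# The demo server returns {"text": [...]} rather than an OpenAI-style
# completions/chat payload.
print(response.json()["text"])
```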
> We can fix bugs. We just shouldn't add more features to keep it at parity with the OpenAI-compatible server.
Perhaps we should move it to the examples directory, as @FlorianJoncour has suggested. Then we can add tests to ensure that the example keeps working, without introducing new features.
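Such a test could be a simple smoke check against the `/generate` endpoint. A rough sketch, assuming the example server has been started separately with a small model; this is illustrative, not an existing test in the repo:

```python
# A smoke-test sketch for the example server, assuming it was started
# separately, e.g.:
#   python -m vllm.entrypoints.api_server --model facebook/opt-125m
# The endpoint and response shape mirror the current api_server.py and
# would need updating if the example changes.
import requests


def test_generate_endpoint():
    resp = requests.post(
        "http://localhost:8000/generate",
        json={"prompt": "Hello, my name is", "max_tokens": 8},
        timeout=60,
    )
    assert resp.status_code == 200
    body = resp.json()
    # The demo server returns the prompt plus completion(s) under "text".
    assert isinstance(body["text"], list) and body["text"]
```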
Maybe this could be kept for supporting beta/experimental features, or things not covered by the OpenAI API spec.
For example, this was the only way I found to send token IDs directly to the model instead of chat messages, which is helpful when people want to experiment with prompts without limitations.
Mistral v3 function calling, which requires custom tokenization that isn't covered by a chat template, is one such example, and I think this could also be useful for debugging other OSS models that ship their own prompt templates, which sometimes may not be optimal for whatever reason.
Note: I might be missing the docs on sending token IDs directly via the OpenAI API spec, but I didn't find anything about it.
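For what it's worth, the offline API offers one way to do this without the legacy server. A hedged sketch: in recent versions token IDs can be wrapped in `TokensPrompt`, while older versions accepted `prompt_token_ids=` on `generate()` directly; the model name and token IDs below are placeholders.

```python
# A sketch of bypassing chat templates entirely by feeding token IDs to the
# offline LLM API. TokensPrompt lives in vllm.inputs in recent releases;
# older versions took prompt_token_ids= on generate() directly.
from vllm import LLM, SamplingParams
from vllm.inputs import TokensPrompt

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.3")  # illustrative model

# Token IDs produced by whatever custom tokenization the model needs,
# e.g. mistral-common for Mistral v3 function calling (placeholder values).
token_ids = [1, 733, 16289, 28793]

outputs = llm.generate(
    TokensPrompt(prompt_token_ids=token_ids),
    SamplingParams(max_tokens=32, temperature=0.0),
)
print(outputs[0].outputs[0].text)
```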
> Maybe this could be kept for supporting beta/experimental features, or things not covered by the OpenAI API spec.
This is a valid point.
> Perhaps we should move it to the examples directory
We kept the file there for backward compatibility as some people are still using it.
> We kept the file there for backward compatibility as some people are still using it.
How long are we planning to maintain this? I think #4197 / #5090 would provide a good opportunity to drop it. We can also redirect imports/main to the new location of the file.
> Maybe this could be kept for supporting beta/experimental features, or things not covered by the OpenAI API spec.
Correct me if I'm wrong, but it looks like we can't use this file directly to test out those features. If we have to copy this file and modify it elsewhere anyway, I don't see the harm in moving it to the examples directory and thus adding it to the official documentation.
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
Anything you want to discuss about vllm.
While gardening GitHub issues I've seen many issues where the user is running `-m vllm.entrypoints.api_server`, which indicates that users are either not aware of, or are ignoring, the note at the top of the file: https://github.com/vllm-project/vllm/blob/b7782002e1da25de77e0b1890ff8b72dd4df917c/vllm/entrypoints/api_server.py#L1-L7
Can we remove it to avoid future confusion?
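For anyone landing here from one of those issues, the supported path is the OpenAI-compatible server. A quick migration sketch (model name and port are placeholders):

```python
# A sketch of the supported path users should migrate to: the OpenAI-
# compatible server, started separately with e.g.
#   python -m vllm.entrypoints.openai.api_server --model facebook/opt-125m
# (or `vllm serve` in newer releases). Uses the official openai client.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",  # vLLM does not require a real key unless --api-key is set
)

completion = client.completions.create(
    model="facebook/opt-125m",
    prompt="San Francisco is a",
    max_tokens=32,
)
print(completion.choices[0].text)
```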