mobiusml / aana_sdk

Apache License 2.0

[ENHANCEMENT] Update vLLM #123

Closed: movchan74 closed this 9 hours ago

movchan74 commented 1 week ago

Enhancement Description

Update the vLLM version from 0.3.2 to the latest available version. This update is necessary to support the Phi-3 Mini model, which requires vLLM 0.4.3 or later. The current deployment cannot serve this model because of the outdated vLLM version. A quick upgrade attempt failed, possibly due to numpy 2.0, which is not backward compatible with numpy 1.x.

Advantages

Possible Implementation

Modify the project requirements to request the latest vLLM version. Possibly pin numpy<2 to avoid compatibility issues with numpy 2.0.
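The requirements change might look like the following. This is a sketch against a Poetry-style pyproject.toml; the exact file, section, and version pins depend on the project's actual setup and are assumptions here, not the real constraints used by aana_sdk:

```toml
# pyproject.toml (sketch) -- the version pins below are assumptions.
[tool.poetry.dependencies]
vllm = ">=0.4.3"   # Phi-3 Mini requires vLLM 0.4.3 or later
numpy = "<2"       # avoid numpy 2.0 backward-incompatible changes
```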

movchan74 commented 5 days ago

One extra thing I would suggest for this issue is adding engine_args to the vLLM deployment config: engine_args: CustomConfig, where CustomConfig is https://github.com/mobiusml/aana_sdk/blob/main/aana/core/models/custom_config.py

The reason is that some models like Phi-3 require extra args that are not in the config. For example, Phi-3 needs trust_remote_code=True, but we don't have trust_remote_code in the config. There are a lot of extra options; I wouldn't add them all to the vLLM deployment config, but we can pass them through a custom dict parameter. We already do this in the HF Pipeline deployment.
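The idea above can be sketched as a plain merge of the deployment's first-class fields with a free-form engine_args dict. This is a hypothetical illustration, not the actual aana_sdk or CustomConfig API; the function name and field names are assumptions:

```python
# Hypothetical sketch: forward engine-specific options through a generic
# engine_args dict instead of adding each option as a first-class config
# field. Explicit config fields take precedence over engine_args.
from typing import Any, Dict


def build_engine_kwargs(
    config_fields: Dict[str, Any], engine_args: Dict[str, Any]
) -> Dict[str, Any]:
    """Merge free-form engine_args with explicit config fields.

    Explicit fields win on conflict, so engine_args cannot silently
    override a value the deployment config sets directly.
    """
    merged = dict(engine_args)
    merged.update(config_fields)
    return merged


# Example: Phi-3 needs trust_remote_code=True, which is not a
# first-class field, so it travels in engine_args.
kwargs = build_engine_kwargs(
    {"model": "microsoft/Phi-3-mini-4k-instruct", "dtype": "auto"},
    {"trust_remote_code": True},
)
```

The merge order is the main design choice: putting the explicit fields last keeps the typed config authoritative while still letting users pass any extra engine option through the dict.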