opea-project / GenAIComps

GenAI components at micro-service level; GenAI service composer to create mega-service
Apache License 2.0

optional vllm microservice container build #266

Closed ashahba closed 1 day ago

ashahba commented 5 days ago

Description

This PR adds optional build args to this Dockerfile: comps/llms/text-generation/vllm/docker/Dockerfile.microservice

Issues

The current Dockerfile always defaults to the GPU Torch installation, even though CPU users do not need the extra packages. Making the GPU dependencies optional yields smaller containers and faster builds for CPU users.

Users interested in GPU builds only need to build the image like this:

docker build --build-arg ARCH='gpu' -f comps/llms/text-generation/vllm/docker/Dockerfile.microservice . -t opea/llm-vllm:latest
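The Dockerfile change itself is not shown in this description, but the optional-arch pattern it describes can be sketched roughly as follows. The `ARCH` arg name comes from the build command above; the exact install steps and the CPU wheel index are illustrative assumptions, not the PR's actual contents:

```dockerfile
# Sketch only -- the real Dockerfile.microservice may differ.
# Default to a CPU build; override with --build-arg ARCH='gpu'.
ARG ARCH=cpu

# When ARCH=cpu, install the CPU-only Torch wheels (assumed index URL),
# avoiding the much larger CUDA-enabled packages; otherwise install the
# default (GPU) distribution.
RUN if [ "${ARCH}" = "cpu" ]; then \
        pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu; \
    else \
        pip install --no-cache-dir torch; \
    fi
```

With a default like this, CPU users can build without any extra arguments (e.g. `docker build -f comps/llms/text-generation/vllm/docker/Dockerfile.microservice . -t opea/llm-vllm:latest`) and get the smaller image by default.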

Type of change

List the type of change as below; options that are not relevant have been removed.

Dependencies

None