ELS-RD / transformer-deploy

Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
https://els-rd.github.io/transformer-deploy/
Apache License 2.0

convert_model command not found #173

Open pint1022 opened 1 year ago

pint1022 commented 1 year ago

Hello, I pulled Docker image 0.6.0 and tried to run the two demo commands:

  1. The first demo command:

     ```bash
     docker run -it --rm --gpus all \
       -v $PWD:/project ghcr.io/els-rd/transformer-deploy:0.6.0 \
       bash -c "cd /project && \
         convert_model -m \"philschmid/MiniLM-L6-H384-uncased-sst2\" \
         --backend tensorrt onnx \
         --seq-len 16 128 128"
     ```

     fails with the error: `convert_model not found`.

  2. I then tried the second command, using the Triton inference server. The server itself starts fine, but the query command fails (a way to inspect what the server actually loaded is sketched right after this list):

     ```bash
     curl -X POST http://localhost:8000/v2/models/transformer_onnx_inference/versions/1/infer \
       --data-binary "@demo/infinity/query_body.bin" \
       --header "Inference-Header-Content-Length: 161"
     ```

     ```
     {"error":"Request for unknown model: 'transformer_tensorrt_inference' is not found"}
     ```
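For reference, Triton's standard v2 HTTP API can report which models the running server actually loaded, which helps pin down "unknown model" errors like the one above. A minimal check, assuming the default port 8000 mapping:

```bash
# Ask Triton for its model repository index (repository extension of
# the standard v2 HTTP API); lists every model and its load state.
curl -X POST http://localhost:8000/v2/repository/index

# Check whether a specific model is loaded and ready to serve.
curl http://localhost:8000/v2/models/transformer_onnx_inference/ready
```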

Does the infinity demo download Hugging Face models and convert them to the Triton format?

Thanks

sc0eur commented 1 year ago

I also got the `convert_model` not found error.

I think `pip3 install ".[GPU]" ...` was lost somewhere in the latest Dockerfile update: link

I was able to run `convert_model` after manually running `pip install ".[GPU]"` inside the container.
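For completeness, a sketch of the whole workaround as a single command, assuming the project source lives at `/transformer_deploy` inside the image (that path is an assumption; adjust it to wherever the repo source was copied in your image):

```bash
# Install the package's entry points first, then run the conversion.
# NOTE: /transformer_deploy is an assumed source location, not verified.
docker run -it --rm --gpus all \
  -v $PWD:/project ghcr.io/els-rd/transformer-deploy:0.6.0 \
  bash -c "cd /transformer_deploy && pip3 install '.[GPU]' && \
    cd /project && \
    convert_model -m 'philschmid/MiniLM-L6-H384-uncased-sst2' \
    --backend tensorrt onnx \
    --seq-len 16 128 128"
```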

jingzhaoou commented 5 months ago

I ran into the same error and fixed it as suggested, i.e., by manually running `pip install ".[GPU]"` inside the container.
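If you'd rather not repeat the fix on every run, a derived image can bake it in. A minimal sketch, assuming the repo source ships inside the image at `/transformer_deploy` (that path is an assumption):

```dockerfile
# Derived image that applies the missing install step once at build time.
FROM ghcr.io/els-rd/transformer-deploy:0.6.0
# Assumed location of the project source inside the base image.
WORKDIR /transformer_deploy
RUN pip3 install ".[GPU]"
```

Build it once with something like `docker build -t transformer-deploy:0.6.0-fixed .` and use that tag in place of the original image.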

aidanrussell commented 4 months ago

Is this issue still there? Can't someone fix it? EDIT: it seems this repository is no longer maintained. It's a shame!