johnsmith0031 / alpaca_lora_4bit

MIT License
533 stars · 84 forks

Query API from CLI for alpaca_lora_4bit_docker #69

Open aindilis opened 1 year ago

aindilis commented 1 year ago

Hi,

I'm trying to query the server on port 7860 (using example-api.py and example-api-streaming.py). Is there a way I can access it programmatically from Python?

Thanks

andybarry commented 1 year ago

You can get a prompt inside the docker container with:

docker run --gpus=all -it --entrypoint /bin/bash alpaca_lora_4bit

Usually the docker container will run python server.py from there, but you can do whatever you want instead.

aindilis commented 1 year ago

Thanks @andybarry! I also managed to get a separate approach to work - for reference, to query the API from the host I had to:

1. Install https://github.com/andybarry/alpaca_lora_4bit_docker
2. Run the container with both ports published: docker run --gpus=all -p 7860:7860 -p 5000:5000 alpaca_lora_4bit
3. In the web UI, go to the Interface tab, select api and no_stream, then click "Apply and restart the interface"
4. Query the API with curl: curl -s http://0.0.0.0:5000/api/v1/generate -d '{"prompt":"Prompt here","lora":"None","model":"llama-7b-4bit"}'
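Since the original question asked about programmatic access from Python, the curl call above can be mirrored with the standard library. This is a minimal sketch, assuming the container is running with the api and no_stream options enabled as described; the response shape ({"results": [{"text": ...}]}) is an assumption based on the webui's blocking API and may differ between versions.

```python
import json
import urllib.request

# Same endpoint and payload as the curl example above.
API_URL = "http://0.0.0.0:5000/api/v1/generate"

def build_payload(prompt, lora="None", model="llama-7b-4bit"):
    """Encode the same JSON body used in the curl example."""
    return json.dumps({"prompt": prompt, "lora": lora, "model": model}).encode("utf-8")

def generate(prompt):
    """POST the prompt to the running container and return the generated text."""
    req = urllib.request.Request(
        API_URL,
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    # Assumed response shape; adjust if your webui version returns something else.
    return result["results"][0]["text"]
```

With the container up, calling print(generate("Prompt here")) should print the completion, replacing the manual curl invocation.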

aindilis commented 1 year ago

Hi, I'd like to configure the docker container to start with no_stream=1 and api=1. I tried every method of passing those variables that I could think of, e.g. -e api=1 -e no_stream=1 and -e CLI_ARGS="api=1 no_stream=1", but none worked. Is there an easy way to do this? That would let me invoke the model programmatically without having to intervene in the interface. Thank you!