Closed · danilyef closed this issue 1 year ago
Hi @danilyef, thanks for the detailed report. From what I understand, this is not an InferenceClient issue but something to fix on the Docker configuration side. You need to give the llama_chat container access to the gateway (i.e. 152.12.0.1:8080), otherwise you won't be able to reach it from within the container. That explains why the script works locally (python3 web.py) but not in Docker.
To check that the network access is correct, try running

import requests

response = requests.get("http://152.12.0.1:8080")
response.raise_for_status()

at the very beginning of the web.py script. If it fails, it means the issue doesn't come from the Python script itself but from the Docker config.
@Wauplin thank you for your quick response. I added your code to my script, then built and ran my Docker image on Ubuntu 18.04 again. docker logs didn't show anything.
I also entered the container and executed the web.py script inside it, and it works (see screenshot).
But it is still unfortunately not accessible on the local machine. The TGI API is accessible.
What I have noticed: when I executed the script inside Docker, the port was 7861, which isn't the default (it should be 7860).
I also decided to test other non-existing routes (like http://152.12.0.12:8080). The container throws an error and stops, as expected.
> thank you for your quick response. I added your code to my script, then built and ran my Docker image on Ubuntu 18.04 again. docker logs didn't show anything.

Hmm, OK, then it really does have access to it.
> What I have noticed: when I executed the script inside Docker, the port was 7861, which isn't the default (it should be 7860).

Could that be the problem? By default Gradio will launch on 7860, but if that port is taken, it will try 7861, 7862, 7863, ...
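This fallback behaviour can be sketched with the standard library (this is not Gradio's actual code, just an illustration of the "try the next port" logic; the range of 10 attempts is an assumption):

```python
import socket

def find_free_port(start: int = 7860, tries: int = 10) -> int:
    """Return the first port in [start, start + tries) that can be bound.

    Sketch of Gradio's port fallback: try 7860, then 7861, 7862, ...
    until a bind succeeds, otherwise raise an error like the one
    Gradio itself reports.
    """
    for port in range(start, start + tries):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            try:
                s.bind(("127.0.0.1", port))
                return port
            except OSError:
                continue  # port already in use, try the next one
    raise OSError(f"Cannot find empty port in range: {start}-{start + tries - 1}")
```

So if something inside the container is already holding 7860, a launch without an explicit server_port silently ends up on 7861.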
@Wauplin I don't think so, because port 7860 is already used by the container.
The netstat -tuln command shows me that the local address 0.0.0.0:7860 is in use (because the container is running). But when I execute the script inside the container, it seems to take another port.
When I run curl http://127.0.0.1:7860 to check the connection, I get the following error:
Recv failure: Connection reset by peer
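A quick way to distinguish "nothing is listening on the port" from "curl cannot complete the request" is a raw TCP check. A minimal sketch using only the standard library (the host and port values mirror the ones in this thread and are just examples):

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a plain TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

For example, port_is_open("127.0.0.1", 7860) tells you whether anything is accepting TCP connections there at all, independent of HTTP.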
So on your machine port 7860 is taken by the container because of docker run --security-opt seccomp:unconfined -p 7860:7860 -d llama_chat, which is normal. But if the Gradio app starts on port 7861 inside the container, there must be a reason for it. Can you try gr.ChatInterface(...).queue().launch(server_port=7860) to explicitly force it to start on 7860 or raise an error?
If I execute web.py inside the Docker container, I get the following error:
Traceback (most recent call last):
File "/app/web.py", line 82, in <module>
gr.ChatInterface(
File "/usr/local/lib/python3.9/site-packages/gradio/blocks.py", line 2033, in launch
) = networking.start_server(
File "/usr/local/lib/python3.9/site-packages/gradio/networking.py", line 207, in start_server
raise OSError(
OSError: Cannot find empty port in range: 7860-7860. You can specify a different port by setting the GRADIO_SERVER_PORT environment variable or passing the `server_port` parameter to `launch()`.
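For reference, one way to see which process already holds the port inside the container (assuming the image ships net-tools or iproute2; neither is guaranteed in a slim image):

```shell
# inside the container: list listening TCP sockets with owning PIDs
netstat -tulpn | grep 7860
# or, on images that ship iproute2 instead of net-tools:
ss -ltnp | grep 7860
```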
Good, so this confirms the error. Any idea why this port is already in use inside the container? Could you try the following:
- Remove
  # Make port 7860 available to the world outside this container
  EXPOSE 7860
  from the Dockerfile. From my understanding, this is not needed (plus you are already starting your container with -p 7860:7860).
- Try to start the app on a random port (7888?) and start the container with -p 7888:7888. At least to be sure that nothing else is running on the port you want to use.
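The second suggestion could look like this (the llama_chat tag and the seccomp flag come from earlier in the thread; GRADIO_SERVER_PORT is the environment variable named in the OSError message, and 7888 is an arbitrary choice):

```shell
# Run the app on 7888 instead of 7860 to rule out a port clash.
# GRADIO_SERVER_PORT overrides the port without editing web.py.
docker run --security-opt seccomp:unconfined \
    -e GRADIO_SERVER_PORT=7888 -p 7888:7888 -d llama_chat
```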
I fixed the problem by setting server_name to "0.0.0.0":
launch(server_name="0.0.0.0", server_port=7860)
This makes Gradio listen on all interfaces instead of only 127.0.0.1, so the app is reachable through Docker's port mapping.
Thank you guys for your quick responses, it guided me in the right direction :)
Good to hear! Wishing you a good continuation :hugs:
Describe the bug
I have successfully deployed TGI (https://github.com/huggingface/text-generation-inference) with Llama-2 using the standard command:
The TGI backend for the model is running on Ubuntu 18.04.
In order to test requests I used the following commands:
or
where 152.12.0.1 is the gateway of the backend Docker image.
where 152.20.147.36 is the Ubuntu server IP address.
You can access the API for testing requests from your local machine:
http://152.20.147.36:8080/docs/#/
where 152.20.147.36 is the Ubuntu server IP address.
Every curl worked as expected.
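(The exact curl commands were not preserved in this report. A hypothetical example of the kind of request TGI accepts on its /generate route, with a made-up prompt and parameters:)

```shell
curl http://152.20.147.36:8080/generate \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{"inputs": "What is Deep Learning?", "parameters": {"max_new_tokens": 20}}'
```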
What I want to do is build a Docker container for the web interface and run it on Ubuntu 18.04 (the same environment as TGI), so that it is accessible from the local machine (basically using IP address 152.20.147.36 and port 7860 to access the web interface). For this purpose I am using InferenceClient from huggingface_hub. Here is my script:
where 152.12.0.1 is the gateway of the backend Docker image. When I run the Docker image there is no error, but unfortunately I cannot access the web interface on my local machine (using 152.20.147.36:7860).
But if I start the script locally without Docker (python3 web.py), everything works well and I can use the web interface via the http://127.0.0.1:7860 route.

Reproduction
My Dockerfile:
Docker commands:
requirements.txt
--security-opt seccomp:unconfined is a necessary workaround for Ubuntu 18.04, because otherwise docker run will give you errors (https://medium.com/nttlabs/ubuntu-21-10-and-fedora-35-do-not-work-on-docker-20-10-9-1cd439d9921).

Logs
System info