PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
https://paddlenlp.readthedocs.io
Apache License 2.0

Connection Error after deploying nvidia-docker #3274

Closed: Ywandung-Lyou closed this issue 2 years ago

Ywandung-Lyou commented 2 years ago

Following the quick-start guide, I deployed nvidia-docker on Linux and got the error "Connection Error: Is pipelines running? An error occurred during the request". This did not happen when I previously deployed Docker on Win11.

w5688414 commented 2 years ago

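> Following the quick-start guide, I deployed nvidia-docker on Linux and got the error "Connection Error: Is pipelines running? An error occurred during the request". This did not happen when I previously deployed Docker on Win11.

Make sure CUDA is 10.2; if you need a Docker image for a higher CUDA version, please open an issue. After startup, the service usually needs a few minutes to become ready. Please check the logs: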

docker logs paddlenlp_pipelines > test.txt
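
A shell redirect with `>` captures only the stdout stream of `docker logs`, so it may miss anything the container wrote to stderr. The sketch below collects both streams into one file. The helper name `save_container_logs` and its `run` parameter are hypothetical and introduced only for illustration; the only external command it invokes is `docker logs <container>`.

```python
import subprocess
from pathlib import Path


def save_container_logs(container, out_path, run=subprocess.run):
    """Write a container's stdout and stderr logs to out_path.

    `run` is injectable so the helper can be exercised without Docker.
    Returns the exit code of `docker logs`.
    """
    result = run(
        ["docker", "logs", container],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the captured stdout
        text=True,
        check=False,
    )
    Path(out_path).write_text(result.stdout or "", encoding="utf-8")
    return result.returncode


# Example call (requires Docker and a container with this name):
# save_container_logs("paddlenlp_pipelines", "test.txt")
```

If the resulting file is empty and the exit code is non-zero, the container name may be wrong or the container may never have started.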

Ywandung-Lyou commented 2 years ago

The contents of test.txt are as follows:

INFO - pipelines.utils.import_utils -  Fetching from https://paddlenlp.bj.bcebos.com/applications/dureader_dev.zip to `data/dureader_dev` 
Traceback (most recent call last): 
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/urllib3/connection.py", line 175, in _new_conn 
    (self._dns_host, self.port), self.timeout, **extra_kw 
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/urllib3/util/connection.py", line 72, in create_connection 
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM): 
  File "/usr/local/python3.7.0/lib/python3.7/socket.py", line 748, in getaddrinfo 
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags): 
socket.gaierror: [Errno -3] Temporary failure in name resolution 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/urllib3/connectionpool.py", line 710, in urlopen 
    chunked=chunked, 
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/urllib3/connectionpool.py", line 386, in _make_request 
    self._validate_conn(conn) 
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/urllib3/connectionpool.py", line 1040, in _validate_conn 
    conn.connect() 
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/urllib3/connection.py", line 358, in connect 
    self.sock = conn = self._new_conn() 
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/urllib3/connection.py", line 187, in _new_conn 
    self, "Failed to establish a new connection: %s" % e 
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f1c3e495390>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/requests/adapters.py", line 499, in send 
    timeout=timeout, 
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/urllib3/connectionpool.py", line 786, in urlopen 
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2] 
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/urllib3/util/retry.py", line 592, in increment 
    raise MaxRetryError(_pool, url, error or ResponseError(cause)) 
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='paddlenlp.bj.bcebos.com', port=443): Max retries exceeded with url: /applications/dureader_dev.zip (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f1c3e495390>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')) 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
  File "utils/offline_ann.py", line 111, in <module> 
    output_dir=args.doc_dir) 
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/pipelines-0.1.0a0-py3.7.egg/pipelines/utils/import_utils.py", line 87, in fetch_archive_from_http 
    request_data = requests.get(url, proxies=proxies) 
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/requests/api.py", line 73, in get 
    return request("get", url, params=params, **kwargs) 
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/requests/api.py", line 59, in request 
    return session.request(method=method, url=url, **kwargs) 
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/requests/sessions.py", line 587, in request 
    resp = self.send(prep, **send_kwargs) 
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/requests/sessions.py", line 701, in send 
    r = adapter.send(request, **kwargs) 
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/requests/adapters.py", line 565, in send 
    raise ConnectionError(e, request=request) 
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='paddlenlp.bj.bcebos.com', port=443): Max retries exceeded with url: /applications/dureader_dev.zip (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f1c3e495390>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')) 
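
The root cause in this traceback is the first exception, `socket.gaierror: [Errno -3] Temporary failure in name resolution`: the container could not resolve the host name, so no connection was ever attempted. A minimal sketch for checking this from inside the container, using the same `socket.getaddrinfo` call that appears in the traceback (the helper name `can_resolve` is hypothetical):

```python
import socket


def can_resolve(host, port=443):
    """Return True if the host name resolves, False on a DNS failure."""
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # Same exception class as the first error in the traceback above.
        return False
    return True


# Example call, run inside the container:
# print(can_resolve("paddlenlp.bj.bcebos.com"))
```

A `False` result for `paddlenlp.bj.bcebos.com` would point to missing DNS or no outbound network in the container rather than to a problem in pipelines itself.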

w5688414 commented 2 years ago

Can your environment reach the external network?

Can this file be downloaded from inside the Docker environment?

wget https://paddlenlp.bj.bcebos.com/applications/dureader_dev.zip

See https://blog.csdn.net/Oraclesand/article/details/76269280. Also, is localhost reachable?

Ywandung-Lyou commented 2 years ago

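I have now deployed these 2 images on an intranet server, which presumably cannot reach the external network. The data file probably cannot be downloaded in the intranet environment, but I can download it in advance and copy it to the intranet server.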

w5688414 commented 2 years ago

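> I have now deployed these 2 images on an intranet server, which presumably cannot reach the external network. The data file probably cannot be downloaded in the intranet environment, but I can download it in advance and copy it to the intranet server.

The source code inside the container is at:

/root/PaddleNLP/pipelines

For the source change, see: https://github.com/PaddlePaddle/PaddleNLP/blob/b1ad85171e6258cc8d4facc2d73a7f110c9126bc/pipelines/utils/offline_ann.py#L26

Then run it manually; see the Dockerfile for reference: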

https://github.com/PaddlePaddle/PaddleNLP/blob/develop/pipelines/docker/Dockerfile
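
For the offline case discussed above, one possible approach is to unpack a pre-downloaded copy of `dureader_dev.zip` into the target directory before running the script manually. The sketch below is an illustration only: `unpack_local_archive` is a hypothetical helper, not part of pipelines, and this thread does not confirm whether `offline_ann.py` skips its download when the directory is already populated, so the linked line may still need to be edited.

```python
import zipfile
from pathlib import Path


def unpack_local_archive(archive_path, output_dir):
    """Extract a pre-downloaded zip into output_dir unless it already has content.

    Returns True if files were extracted, False if output_dir was already populated.
    """
    out = Path(output_dir)
    if out.is_dir() and any(out.iterdir()):
        return False
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(out)
    return True


# Example call; the archive location is an assumption, the target
# directory is the one named in the log line earlier in the thread:
# unpack_local_archive("dureader_dev.zip", "data/dureader_dev")
```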

ACbccc commented 1 year ago

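> Following the quick-start guide, I deployed nvidia-docker on Linux and got the error "Connection Error: Is pipelines running? An error occurred during the request". This did not happen when I previously deployed Docker on Win11.
>
> Make sure CUDA is 10.2; if you need a Docker image for a higher CUDA version, please open an issue. After startup, the service usually needs a few minutes to become ready. Please check the logs:
>
> docker logs paddlenlp_pipelines > test.txt

Hello, after I enter this command there is no response and no logs are retrieved. What could cause this? The UI page also shows a connection error. My environment is cuda10.2 + cudnn7.6.5 + python3.9.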

fzg0202 commented 1 year ago

@Ywandung-Lyou Has your problem been solved? I am running into the same issue.