triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

triton inference client pinned to geventhttpclient==2.0.2, cabundle doesn't support letsencrypt #5622

Open brightsparc opened 1 year ago

brightsparc commented 1 year ago

Description

PR #185 pinned geventhttpclient==2.0.2 due to a potential change in ssl_context_factory handling.

The geventhttpclient releases seem to indicate that there shouldn't be any breaking changes in the handling of create_default_ssl_context as of version 2.0.8.

I've validated that, after upgrading to geventhttpclient==2.0.8, I am able to make requests to HTTP and gRPC endpoints with tritonclient when passing ssl=True.
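
For illustration, a minimal check of that claim might look like the following sketch (the host name is a placeholder, and is_server_live is used as the simplest round trip):

import tritonclient.http as httpclient
import tritonclient.grpc as grpcclient

# HTTP client: with geventhttpclient==2.0.8 installed, the TLS handshake
# against a Let's Encrypt secured host should succeed (host is a placeholder).
http_client = httpclient.InferenceServerClient(url="letsencrypt-secured-host", ssl=True)
print(http_client.is_server_live())

# gRPC client: does not go through geventhttpclient at all; included only
# to mirror the validation described above (port is a placeholder).
grpc_client = grpcclient.InferenceServerClient(url="letsencrypt-secured-host:8001", ssl=True)
print(grpc_client.is_server_live())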

Triton Information

What version of Triton are you using? 2.32.0

Are you using the Triton container or did you build it yourself? Triton container

To Reproduce

The following code, when run, raises SSLZeroReturnError when the host uses a Let's Encrypt generated certificate.

from geventhttpclient import HTTPClient
from geventhttpclient.url import URL

# The following mirrors the __init__ method of InferenceServerClient
# see: https://github.com/triton-inference-server/client/blob/main/src/python/library/tritonclient/http/__init__.py#L146
def connect(url,
            verbose=False,
            concurrency=1,
            connection_timeout=60.0,
            network_timeout=60.0,
            max_greenlets=None,
            ssl=False,
            ssl_options=None,
            ssl_context_factory=None,
            insecure=False):
    scheme = "https://" if ssl else "http://"
    _parsed_url = URL(scheme + url)
    return HTTPClient.from_url(_parsed_url,
                               concurrency=concurrency,
                               connection_timeout=connection_timeout,
                               network_timeout=network_timeout,
                               ssl_options=ssl_options,
                               ssl_context_factory=ssl_context_factory,
                               insecure=insecure)

client_stub = connect("letsencrypt-secured-host", ssl=True)

# The following mirrors the get method used for metadata requests
# see: https://github.com/triton-inference-server/client/blob/main/src/python/library/tritonclient/http/__init__.py#L251

request_uri = "/v2/models/my-model"
headers = {}
client_stub.get(request_uri, headers=headers)

Following is the full stack trace:

    789 def wrap_socket(sock, keyfile=None, certfile=None,
    790                 server_side=False, cert_reqs=CERT_NONE,
    791                 ssl_version=PROTOCOL_SSLv23, ca_certs=None,
    792                 do_handshake_on_connect=True,
    793                 suppress_ragged_eofs=True,
    794                 ciphers=None):
--> 796     return SSLSocket(sock=sock, keyfile=keyfile, certfile=certfile,
    797                      server_side=server_side, cert_reqs=cert_reqs,
    798                      ssl_version=ssl_version, ca_certs=ca_certs,
    799                      do_handshake_on_connect=do_handshake_on_connect,
    800                      suppress_ragged_eofs=suppress_ragged_eofs,
    801                      ciphers=ciphers)

File ~/mambaforge/envs/xxx/lib/python3.9/site-packages/gevent/_ssl3.py:312, in SSLSocket.__init__(self, sock, keyfile, certfile, server_side, cert_reqs, ssl_version, ca_certs, do_handshake_on_connect, family, type, proto, fileno, suppress_ragged_eofs, npn_protocols, ciphers, server_hostname, _session, _context)
    310 except socket_error as x:
    311     self.close()
--> 312     raise x

File ~/mambaforge/envs/xxx/lib/python3.9/site-packages/gevent/_ssl3.py:308, in SSLSocket.__init__(self, sock, keyfile, certfile, server_side, cert_reqs, ssl_version, ca_certs, do_handshake_on_connect, family, type, proto, fileno, suppress_ragged_eofs, npn_protocols, ciphers, server_hostname, _session, _context)
    305         if timeout == 0.0:
    306             # non-blocking
    307             raise ValueError("do_handshake_on_connect should not be specified for non-blocking sockets")
--> 308         self.do_handshake()
    310 except socket_error as x:
    311     self.close()

File ~/mambaforge/envs/xxx/lib/python3.9/site-packages/gevent/_ssl3.py:666, in SSLSocket.do_handshake(self)
    664 while True:
    665     try:
--> 666         self._sslobj.do_handshake()
    667         break
    668     except SSLWantReadError:

SSLZeroReturnError: TLS/SSL connection has been closed (EOF) (_ssl.c:1129)
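
The issue title points at the client's CA bundle. A quick diagnostic sketch (the bundled cacert.pem path is an assumption about some geventhttpclient builds, and the check degrades gracefully if no such file ships):

import os
import geventhttpclient

# Look for a CA bundle shipped inside the installed geventhttpclient package.
bundle = os.path.join(os.path.dirname(geventhttpclient.__file__), "cacert.pem")
if os.path.exists(bundle):
    with open(bundle) as f:
        # Let's Encrypt certificates chain to the ISRG Root X1 root; if it
        # is absent from the bundle, verification of such hosts will fail.
        print("ISRG Root X1 present:", "ISRG Root X1" in f.read())
else:
    print("No bundled cacert.pem; the system or certifi store is likely used.")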

Expected behavior

The expected response, which is also the one received after upgrading to 2.0.8, is:

<HTTPSocketPoolResponse status=200 headers={}>
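
One possible workaround while the pin is in place (a sketch, not taken from this thread; it assumes geventhttpclient invokes the factory with no arguments, as recent releases appear to do) is to pass an explicit ssl_context_factory so verification uses the system trust store instead of any bundled CA file:

import ssl
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(
    url="letsencrypt-secured-host",  # placeholder host from the repro above
    ssl=True,
    # Build a default SSL context from the system CA store, which on most
    # systems includes the ISRG Root X1 certificate Let's Encrypt chains to.
    ssl_context_factory=ssl.create_default_context,
)
print(client.get_model_metadata("my-model"))
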
krishung5 commented 1 year ago

Hi @brightsparc, thanks for bringing this up. I built the client with geventhttpclient==2.0.8 against the latest Triton version, and I'm still seeing the same issue mentioned in PR https://github.com/triton-inference-server/client/pull/185. If you'd like to try, the test we run to verify is L0_https. We have filed a ticket in our backlog for upgrading the geventhttpclient version.

brightsparc commented 1 year ago

> Hi @brightsparc, thanks for bringing this up. I built the client with geventhttpclient==2.0.8 against the latest Triton version, and I'm still seeing the same issue mentioned in PR triton-inference-server/client#185. If you'd like to try, the test we run to verify is L0_https. We have filed a ticket in our backlog for upgrading the geventhttpclient version.

I'd like to help. I didn't see any specific errors in that PR; are you able to share your output?

I took a look at Testing Triton, but wasn't able to build the model repo on my local Mac M1. Will I require a GPU for this?

krishung5 commented 1 year ago

Yes, I believe a GPU is required for TensorRT models. However, the model used in the L0_https test can simply be copied from https://github.com/triton-inference-server/server/tree/main/docs/examples/model_repository/simple. For reference, this is what we do to create the QA container: https://github.com/triton-inference-server/server/blob/main/Dockerfile.QA#L117-L118.
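
For anyone following along, a rough sketch of the kind of round trip such an HTTPS test performs (the port, CA file path, and ssl_options keys are assumptions, with the keys following geventhttpclient's wrap_socket-style options; the "simple" model comes from the repository linked above):

import tritonclient.http as httpclient

# Assumed setup: tritonserver reachable over HTTPS on localhost behind a
# self-signed certificate whose CA file sits at ./ca.pem, "simple" loaded.
client = httpclient.InferenceServerClient(
    url="localhost:8000",
    ssl=True,
    ssl_options={"ca_certs": "./ca.pem"},  # hypothetical path
)
assert client.is_server_live()
print(client.get_model_metadata("simple"))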

ClaytonJY commented 8 months ago

Is there a timeline on updating this geventhttpclient dependency?

I'm hitting a conflict with the locust load-testing package, the latest versions of which require geventhttpclient>=2.0.11. This is a bit surprising, as some Triton docs recommend this tool.

Alternatively, can I avoid needing geventhttpclient for tritonclient? While I do use tritonclient to make HTTP calls, I only use the asyncio client, not the gevent one.
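
For reference, a minimal sketch of that aiohttp-based path (the aio client lives in tritonclient.http.aio and does not import geventhttpclient; the ssl flag is assumed to mirror the synchronous client's):

import asyncio
import tritonclient.http.aio as aiohttpclient

async def main():
    # Built on aiohttp, so the geventhttpclient pin never comes into play.
    client = aiohttpclient.InferenceServerClient(url="letsencrypt-secured-host", ssl=True)
    try:
        print(await client.is_server_live())
    finally:
        await client.close()

asyncio.run(main())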

giladd123 commented 2 months ago

Is there any update? Currently I can't use the regular HTTP client, only the aio one.