gkamradt / langchain-tutorials

Overview and tutorial of the LangChain Library

SSL Error in example Ask A Book Questions #20

Closed yamyamyuo closed 1 year ago

yamyamyuo commented 1 year ago

Hi there, thanks for solving my issue about loading PDFs. I came across another issue and suspect it may be related to the version of some Python package.

I am trying the Ask A Book Questions tutorial and get the error below when executing this line: docsearch = Pinecone.from_texts([t.page_content for t in texts], embeddings, index_name=index_name)

Traceback (most recent call last):
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/urllib3/connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/urllib3/connectionpool.py", line 381, in _make_request
    self._validate_conn(conn)
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/urllib3/connectionpool.py", line 978, in _validate_conn
    conn.connect()
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/urllib3/connection.py", line 362, in connect
    self.sock = ssl_wrap_socket(
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/urllib3/util/ssl_.py", line 386, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: UNEXPECTED_RECORD] unexpected record (_ssl.c:1129)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/urllib3/connectionpool.py", line 726, in urlopen
    retries = retries.increment(
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/urllib3/util/retry.py", line 446, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by SSLError(SSLError(1, '[SSL: UNEXPECTED_RECORD] unexpected record (_ssl.c:1129)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/serena/Documents/langchain-tutorials/data_generation/chatPDF.py", line 33, in <module>
    docsearch = Pinecone.from_texts([t.page_content for t in texts], embeddings, index_name=index_name)
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/langchain/vectorstores/pinecone.py", line 235, in from_texts
    embeds = embedding.embed_documents(lines_batch)
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/langchain/embeddings/openai.py", line 269, in embed_documents
    return self._get_len_safe_embeddings(texts, engine=self.deployment)
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/langchain/embeddings/openai.py", line 188, in _get_len_safe_embeddings
    encoding = tiktoken.model.encoding_for_model(self.model)
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/tiktoken/model.py", line 75, in encoding_for_model
    return get_encoding(encoding_name)
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/tiktoken/registry.py", line 63, in get_encoding
    enc = Encoding(**constructor())
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/tiktoken_ext/openai_public.py", line 64, in cl100k_base
    mergeable_ranks = load_tiktoken_bpe(
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/tiktoken/load.py", line 114, in load_tiktoken_bpe
    contents = read_file_cached(tiktoken_bpe_file)
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/tiktoken/load.py", line 46, in read_file_cached
    contents = read_file(blobpath)
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/tiktoken/load.py", line 24, in read_file
    return requests.get(blobpath).content
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/Users/serena/Library/Python/3.9/lib/python/site-packages/requests/adapters.py", line 563, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by SSLError(SSLError(1, '[SSL: UNEXPECTED_RECORD] unexpected record (_ssl.c:1129)')))

Appreciate your help in advance!

gkamradt commented 1 year ago

Haven't seen this one before, and I imagine it's on your end with your connection. I gotta close this one out because I don't believe it's caused by this code. Have you asked GPT-4?
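
One way to test the connection theory outside of LangChain is to attempt a plain TLS handshake with the host named in the traceback. The following is a minimal sketch using only the standard library. The helper name check_tls and its injectable connect parameter are hypothetical and exist only for illustration; the URL is copied from the error message.

```python
import socket
import ssl
from urllib.parse import urlparse

# URL copied from the traceback; tiktoken fetches this file with requests.get.
TIKTOKEN_URL = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"


def check_tls(url, connect=socket.create_connection, timeout=10):
    """Attempt a bare TLS handshake with the URL's host.

    `connect` is injectable (hypothetical design choice) so the function can
    be exercised without network access. Returns (ok, detail), never raises
    for network or TLS errors.
    """
    parsed = urlparse(url)
    host = parsed.hostname
    port = parsed.port or 443
    context = ssl.create_default_context()
    try:
        with connect((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, tls.version()
    except ssl.SSLError as exc:
        # ssl.SSLError subclasses OSError, so it must be caught first.
        return False, f"TLS handshake failed: {exc}"
    except OSError as exc:
        return False, f"connection failed: {exc}"


if __name__ == "__main__":
    print(check_tls(TIKTOKEN_URL))
```

If this reports a failed handshake when run on its own, the problem is likely in the network path (proxy, VPN, or firewall) and not in the tutorial code.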

yamyamyuo commented 1 year ago

Thanks @gkamradt, it was indeed a network issue when connecting to OpenAI. It works after I fixed the network. BTW, do you know how to initialize the Pinecone index? Currently I create an index with the cosine metric and 1536 dimensions, as indicated by the error message.

HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'date': 'Sun, 14 May 2023 07:57:46 GMT', 'x-envoy-upstream-service-time': '2', 'content-length': '103', 'server': 'envoy'})
HTTP response body: {"code":3,"message":"Vector dimension 1536 does not match the dimension of the index 128","details":[]}
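
The response above says the vectors being upserted have 1536 dimensions while the target index was created with 128. A Pinecone index's dimension is fixed at creation time, so the index has to be created (or recreated) with a dimension matching the embeddings. Below is a minimal sketch of that check. The helper ensure_index is hypothetical, and it assumes a client object exposing list_indexes(), describe_index(name).dimension, and create_index(name, dimension=..., metric=...), which is how the module-level API of pinecone-client 2.x looked around the time of this thread; newer client versions use a different interface.

```python
EMBEDDING_DIMENSION = 1536  # vector size reported in the error message


def ensure_index(client, name, dimension=EMBEDDING_DIMENSION, metric="cosine"):
    """Create the index if missing; return True if it was created.

    `client` is any object with list_indexes(), describe_index(name) and
    create_index(...). Assumption: the `pinecone` module of pinecone-client
    2.x satisfies this after pinecone.init(...) has been called.
    """
    if name in client.list_indexes():
        existing = client.describe_index(name).dimension
        if existing != dimension:
            # The dimension cannot be changed in place; the index must be
            # deleted and recreated, or a new index name must be used.
            raise ValueError(
                f"index {name!r} has dimension {existing}, expected {dimension}"
            )
        return False
    client.create_index(name, dimension=dimension, metric=metric)
    return True


# Possible usage with pinecone-client 2.x (assumption, not run here):
#   import pinecone
#   pinecone.init(api_key=PINECONE_API_KEY, environment=PINECONE_ENV)
#   ensure_index(pinecone, index_name)
```

Raising on a mismatch, instead of silently deleting the old index, keeps the sketch from destroying data; whether to delete and recreate is left to the caller.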