freedmand / semantra

Multi-tool for semantic search
MIT License

openai.error.InvalidRequestError: '$.input' is invalid. #54

Open endolith opened 1 year ago

endolith commented 1 year ago

I tried running it on ~50 files from a grep result, of types:

Pressing y for each file to be processed by OpenAI was annoying, so I cancelled and tried again with --no-confirm and got this error. I then tried again without --no-confirm and still got the same error:

  File "C:\Users\endolith\anaconda3\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\endolith\anaconda3\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "c:\users\endolith\.local\bin\semantra.exe\__main__.py", line 7, in <module>
    try:
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\semantra\semantra.py", line 619, in main
    documents[fn] = process(
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\semantra\semantra.py", line 307, in process
    flush_pool()
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\semantra\semantra.py", line 272, in flush_pool
    embedding_results = model.embed(tokens, pool)
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\semantra\models.py", line 144, in embed
    response = openai.Embedding.create(model=self.model_name, input=texts)
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\openai\api_resources\embedding.py", line 33, in create
    response = super().create(*args, **kwargs)
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\openai\api_requestor.py", line 298, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\openai\api_requestor.py", line 700, in _interpret_response
    self._interpret_response_line(
  File "C:\Users\endolith\.local\pipx\venvs\semantra\lib\site-packages\openai\api_requestor.py", line 763, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: '$.input' is invalid. Please check the API reference: https://platform.openai.com/docs/api-reference.
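
The thread does not identify what makes `$.input` invalid for this PDF. One possibility worth ruling out, assuming the embeddings endpoint rejects empty entries in the `input` list (empty strings or empty token lists), is that one extracted chunk is empty or malformed. The sketch below is a hypothetical pre-flight check, not part of Semantra or the openai package; `find_invalid_inputs` is an illustrative name, and the rules it applies are assumptions about what the API accepts.

```python
from __future__ import annotations


def find_invalid_inputs(inputs: list) -> list[tuple[int, str]]:
    """Return (index, reason) pairs for entries that look unsafe to send.

    Hypothetical helper: the checks encode assumptions about the
    embeddings API (no empty strings, no empty token lists, tokens are
    non-negative integers), not behaviour confirmed in this thread.
    """
    problems = []
    for i, item in enumerate(inputs):
        if isinstance(item, str):
            if not item:
                problems.append((i, "empty string"))
        elif isinstance(item, (list, tuple)):
            if not item:
                problems.append((i, "empty token list"))
            elif not all(
                isinstance(t, int) and not isinstance(t, bool) and t >= 0
                for t in item
            ):
                problems.append((i, "non-integer or negative token"))
        else:
            problems.append((i, f"unsupported type {type(item).__name__}"))
    return problems


if __name__ == "__main__":
    batch = ["some text", "", [101, 202], []]
    for index, reason in find_invalid_inputs(batch):
        print(f"input[{index}]: {reason}")
```

Running a check like this on the batch just before the `openai.Embedding.create(...)` call in the traceback would show whether a bad entry is being sent, without spending an API request.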

endolith commented 1 year ago

Ah, it's a specific paper that's causing the error.

https://www.math.union.edu/~dpvc/papers/2001-01.DC-BG-BZ/DC-BG-BZ.pdf

semantra --port 1234 --model openai "DC-BG-BZ.pdf"

freedmand commented 1 year ago

Hmm, strange, it's working for me. What version of Semantra are you running (semantra --version)? I'm on 0.1.7 (you can update with pipx upgrade semantra).

endolith commented 1 year ago

λ semantra --version
0.1.7

λ pipx runpip semantra list
Package            Version
------------------ ------------
aiohttp            3.8.4
aiosignal          1.3.1
annoy-fixed        1.16.3
async-timeout      4.0.2
attrs              23.1.0
blinker            1.6.2
certifi            2023.5.7
charset-normalizer 3.1.0
click              8.1.3
colorama           0.4.6
filelock           3.12.2
Flask              2.3.2
frozenlist         1.3.3
fsspec             2023.6.0
huggingface-hub    0.16.2
idna               3.4
itsdangerous       2.1.2
Jinja2             3.1.2
MarkupSafe         2.1.3
mpmath             1.3.0
multidict          6.0.4
networkx           3.1
numpy              1.25.0
openai             0.27.8
packaging          23.1
Pillow             10.0.0
pip                23.2
pypdfium2          4.18.0
python-dotenv      1.0.0
PyYAML             6.0
regex              2023.6.3
requests           2.31.0
safetensors        0.3.1
semantra           0.1.7
setuptools         68.0.0
sympy              1.12
tiktoken           0.4.0
tokenizers         0.13.3
torch              2.0.1
torchaudio         2.0.2+cu117
torchvision        0.15.2+cu117
tqdm               4.65.0
transformers       4.30.2
typing_extensions  4.7.1
urllib3            2.0.3
Werkzeug           2.3.6
wheel              0.40.0
yarl               1.9.2

endolith commented 1 year ago

On another computer it works:

λ semantra --version
0.1.7

λ pipx runpip semantra list
Package            Version
------------------ --------
aiohttp            3.8.4
aiosignal          1.3.1
annoy-fixed        1.16.3
async-timeout      4.0.2
attrs              23.1.0
blinker            1.6.2
certifi            2023.5.7
charset-normalizer 3.1.0
click              8.1.3
colorama           0.4.6
filelock           3.12.2
Flask              2.3.2
frozenlist         1.3.3
fsspec             2023.6.0
huggingface-hub    0.16.2
idna               3.4
importlib-metadata 6.7.0
itsdangerous       2.1.2
Jinja2             3.1.2
MarkupSafe         2.1.3
mpmath             1.3.0
multidict          6.0.4
networkx           3.1
numpy              1.25.0
openai             0.27.8
packaging          23.1
Pillow             10.0.0
pip                23.2
pypdfium2          4.18.0
python-dotenv      1.0.0
PyYAML             6.0
regex              2023.6.3
requests           2.31.0
safetensors        0.3.1
semantra           0.1.7
setuptools         68.0.0
sympy              1.12
tiktoken           0.4.0
tokenizers         0.13.3
torch              2.0.1
tqdm               4.65.0
transformers       4.30.2
typing_extensions  4.7.1
urllib3            2.0.3
Werkzeug           2.3.6
wheel              0.40.0
yarl               1.9.2
zipp               3.15.0

The working computer has importlib-metadata 6.7.0 and zipp 3.15.0, and does not have torchaudio 2.0.2+cu117 or torchvision 0.15.2+cu117 (from https://github.com/freedmand/semantra/issues/36#issuecomment-1624544172).

Otherwise they are the same.
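
For anyone repeating this comparison, the two package listings can be diffed mechanically rather than by eye. This is a minimal sketch that assumes the two-column `Package  Version` layout shown above; `parse_pip_list` and `diff_environments` are illustrative names, not part of pip, pipx, or Semantra.

```python
from __future__ import annotations


def parse_pip_list(text: str) -> dict[str, str]:
    """Parse column-formatted `pip list` output into {package: version}."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the "Package Version" header, and the dashed rule.
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        packages[parts[0].lower()] = parts[1]
    return packages


def diff_environments(failing: str, working: str) -> dict[str, tuple]:
    """Map each differing package to (version_in_failing, version_in_working).

    A value of None means the package is absent from that environment.
    """
    a, b = parse_pip_list(failing), parse_pip_list(working)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    # Paste the full output of `pipx runpip semantra list` from each machine.
    failing_env = """Package            Version
------------------ ------------
openai             0.27.8
torchaudio         2.0.2+cu117
"""
    working_env = """Package            Version
------------------ --------
openai             0.27.8
zipp               3.15.0
"""
    for name, (bad, good) in diff_environments(failing_env, working_env).items():
        print(f"{name}: failing={bad} working={good}")
```

Package names are lowercased before comparison, so a difference in capitalisation alone is not reported.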