zilliztech / GPTCache

Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
https://gptcache.readthedocs.io
MIT License
6.89k stars 480 forks

[Bug]: Getting ONNX Runtime error #586

Closed ramchennuru closed 4 months ago

ramchennuru commented 7 months ago

Current Behavior

    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: token_type_ids for the following indices
 index: 1 Got: 7863 Expected: 512
 Please fix either the inputs or the model.

Expected Behavior

My code is:

import requests
import time
import openai

url = "https://classic.clinicaltrials.gov/ct2/show/NCT04239443"
r = requests.get(url)
pagesource = r.text
pagesource = pagesource[:20000]
question = "Figure out the publication Date/Citation date from the given page source:" + pagesource

def response_text(openai_resp):
    return openai_resp['choices'][0]['message']['content']

start_time = time.time()
response = openai.ChatCompletion.create(
    model='gpt-4-1106-preview',
    messages=[
    {
    'role': 'user',
    'content': question
    }
    ],
)
print(f'Question: {question}')
print("Time consuming: {:.2f}s".format(time.time() - start_time))
print(f'Answer: {response_text(response)}\n')

The error shows Expected: 512 but Got: 7863 tokens. How can I resolve this issue?
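One workaround, while the prompt contains a 20,000-character page source, is to shrink the text passed through the cache below the ONNX embedding model's 512-token limit before making the request. A minimal sketch is below; `truncate_for_cache` is a hypothetical helper, and the 4-characters-per-token ratio is a rough English-text heuristic, not an exact count from the model's tokenizer.

```python
# Cap text to an approximate token budget so the cached prompt stays
# under the 512-token limit of GPTCache's ONNX embedding model.
MAX_TOKENS = 512
CHARS_PER_TOKEN = 4  # rough heuristic; verify against the real tokenizer

def truncate_for_cache(text: str, max_tokens: int = MAX_TOKENS) -> str:
    """Trim text to roughly max_tokens worth of characters."""
    budget = max_tokens * CHARS_PER_TOKEN
    return text if len(text) <= budget else text[:budget]

# Example: a long page-source prompt is cut down before caching.
long_question = "Figure out the publication date:" + "x" * 20000
safe_question = truncate_for_cache(long_question)
```

Because the heuristic undercounts subword tokens for HTML-heavy text, leaving extra margin (e.g. a budget of 400 tokens) is safer in practice.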

Steps To Reproduce

No response

Environment

No response

Anything else?

No response

viktor-svirsky commented 6 months ago

The same here:

onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: token_type_ids for the following indices
 index: 1 Got: 1722 Expected: 512
 Please fix either the inputs or the model.

SimFG commented 6 months ago

@viktor-svirsky You should make sure that the number of input tokens is less than 512, which is the limit of the ONNX embedding model.
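If the input cannot simply be shortened, another option is to split it into pieces that each fit under the limit. The sketch below uses whitespace words as a stand-in for the real ALBERT subword tokenizer (an assumption: actual subword counts run higher, so a margin below 512 is advisable); `chunk_tokens` is a hypothetical helper, not part of the GPTCache API.

```python
# Split over-long text into chunks that each stay within the
# 512-token limit of the ONNX embedding model.
LIMIT = 512

def chunk_tokens(text: str, limit: int = LIMIT) -> list[str]:
    """Break text into windows of at most `limit` whitespace tokens."""
    tokens = text.split()
    return [" ".join(tokens[i:i + limit]) for i in range(0, len(tokens), limit)]
```

Each chunk can then be embedded and cached separately instead of failing on the full input.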