**Closed** — jackylu0124 closed this issue 4 months ago
This issue should be resolved with the 0.3.0 release. I'll close this issue now. But please let us know if you still see the issue and we will re-open the investigation.
I confirm that this issue is now resolved with the 0.3.0 release.
I am running the Phi-3-mini-4k-instruct-onnx model on DirectML with the example phi3-qa.py script (https://github.com/microsoft/onnxruntime-genai/blob/main/examples/python/phi3-qa.py). The only change I made to the script is commenting out the if-block that sets `search_options['max_length'] = 2048`, so that the maximum context length supported by the model (4096 in this case) is used by default. I encounter the issue when the input is longer than 2048 tokens. Specifically, the following are my input prompt and the error logs:

Input Prompt:
Error Logs (parts of the exact paths are replaced with `xxx` for privacy reasons):

I also checked these two issues, but they seem to be different from the error I am facing:
https://github.com/microsoft/onnxruntime-genai/issues/424
https://github.com/microsoft/onnxruntime-genai/issues/521
Package Version:
onnxruntime-genai-directml 0.3.0rc2
GPU: RTX 3090
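For clarity, the modification described above can be sketched as follows. This is an illustrative snippet, not the actual phi3-qa.py code: `search_options` is shown as a plain dict, and the names `MODEL_CONTEXT_LENGTH` and `effective_max_length` are hypothetical, introduced only to show the effect of removing the default cap.

```python
# Illustrative sketch of the change described in this report.
# In the real script, search_options is populated from command-line
# arguments; a plain dict stands in for it here.
search_options = {}

# The script's original default cap, commented out per this report:
# if "max_length" not in search_options:
#     search_options["max_length"] = 2048

# With the cap removed, generation falls back to the model's own
# maximum context length (4096 for Phi-3-mini-4k-instruct-onnx).
MODEL_CONTEXT_LENGTH = 4096  # hypothetical constant for illustration
effective_max_length = search_options.get("max_length", MODEL_CONTEXT_LENGTH)
print(effective_max_length)  # 4096
```

The report's failure occurs precisely in the range this change opens up: prompts between 2048 and 4096 tokens, which the unmodified script would never pass to the model.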