Closed AshD closed 5 months ago
Hi @AshD, is this running the HelloPhi2 sample? (To help us with the repro)
Mostly the same code.
The exact code is here: https://github.com/feiyun0112/SemanticKernel.Connectors.OnnxRuntimeGenAI, with the addition of generatorParams.TryGraphCaptureWithMaxBatchSize(1);
Hi @AshD,
The problem is on this line: generatorParams.SetSearchOption("past_present_share_buffer", onnxRuntimeGenAIPromptExecutionSettings.PastPresentShareBuffer);
When using graph capture (cuda_graph/dml_graph), PastPresentShareBuffer should always be set to true. This is something we'll make clearer or maybe even force for DML in future versions, but for now you should set it to true when using DML.
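For anyone hitting the same error, here is a minimal sketch of a guard that applies the rule above. The GraphCaptureSettings class and ResolvePastPresentShareBuffer method are hypothetical names made up for illustration and are not part of OnnxRuntimeGenAI. The only library calls referenced are the two already quoted in this thread, and they appear as comments so the snippet runs without the NuGet package or a model.

```csharp
using System;

public static class GraphCaptureSettings
{
    // Hypothetical helper, not part of OnnxRuntimeGenAI.
    // Returns the value to pass for "past_present_share_buffer".
    // Per the advice in this thread, graph capture requires it to be true,
    // so the requested value is overridden whenever graph capture is enabled.
    public static bool ResolvePastPresentShareBuffer(bool graphCaptureEnabled, bool requested)
    {
        return graphCaptureEnabled || requested;
    }
}

public static class Program
{
    public static void Main()
    {
        bool useGraphCapture = true; // caller intends to call TryGraphCaptureWithMaxBatchSize(1)
        bool requested = false;      // value coming from the prompt execution settings

        bool shareBuffer =
            GraphCaptureSettings.ResolvePastPresentShareBuffer(useGraphCapture, requested);
        Console.WriteLine(shareBuffer);

        // With the real library, the resolved value would then be applied
        // using the calls quoted in this thread:
        //   generatorParams.SetSearchOption("past_present_share_buffer", shareBuffer);
        //   generatorParams.TryGraphCaptureWithMaxBatchSize(1);
    }
}
```

Resolving the flag in one place means a false value in the settings cannot silently reach the generator once graph capture has been requested.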
Thanks @PatriceVignola That fixes the issue :-)
Would appreciate some guidance on the discussion I opened https://github.com/microsoft/onnxruntime-genai/discussions/425
The CPU version works fine with the corresponding model and NuGet package. The DirectML version throws the exception below.
Model: https://huggingface.co/microsoft/Phi-3-mini-128k-instruct-onnx/tree/main/directml/directml-int4-awq-block-128
NuGet package: Microsoft.ML.OnnxRuntimeGenAI.DirectML 0.2.0-rc6
I have called generatorParams.TryGraphCaptureWithMaxBatchSize(1); and max tokens is set to 4000.
The exception is thrown when calling generator.ComputeLogits();
Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException: 'Non-zero status code returned while running DmlFusedNode_0_0 node. Name:'DmlFusedNode_0_0' Status Message: D:\a_work\1\s\onnxruntime\core\framework\execution_frame.cc:173 onnxruntime::IExecutionFrame::GetOrCreateNodeOutputMLValue shape && tensor.Shape() == *shape was false. OrtValue shape verification failed. Current shape:{1,32,121,96} Requested shape:{1,32,4001,96}'
PC: Windows 11 Version 10.0.22631 Build 22631, 13th Gen Core i9, 128GB, RTX 4090 with the latest NVIDIA drivers.
Thanks, Ash