Open asmirnov82 opened 3 days ago
There is a `DecodeAsync` method in the `LLamaContext` which should be a "drop in" replacement for `Decode` in an async context. Would you be interested in putting together a PR updating the three executors (Interactive, Instruct and Stateless) to use this?
Yes, I'll do that.
Description
I am developing a WPF application that uses the LLamaSharp library, particularly the LLama executors (such as `InstructExecutor` and `InteractiveExecutor`). I expect that the code doesn't block my UI thread; however, the UI freezes.
It looks like this happens because `InferAsync` awaits the `InferInternal(inferenceParams, args)` method, and the `InferInternal` implementations in the `InstructExecutor` and `InteractiveExecutor` classes are synchronous.

As an experiment I changed the line

```csharp
var (result, _) = Context.NativeHandle.Decode(_embeds, LLamaSeqId.Zero, batch, ref _pastTokensCount);
```

in `InstructExecutor` to

```csharp
var (result, _) = await Task.Run(() => Context.NativeHandle.Decode(_embeds, LLamaSeqId.Zero, batch, ref _pastTokensCount));
```

and this solved the issue. Do you have any plans to add async implementations for all methods that are awaited by `StatefulExecutorBase` in all inherited executors?
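To illustrate the underlying behavior, here is a minimal sketch (not LLamaSharp code; the method names and the 500 ms delay are stand-ins) of why an `async` method with a synchronous body still blocks its caller, and why wrapping the CPU-bound work in `Task.Run` releases the calling (UI) thread:

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    // Mirrors the current executors: async signature, but all work runs
    // synchronously on the caller's thread before the first real await.
    static async Task<int> InferInternalBlocking()
    {
        Thread.Sleep(500);            // stands in for NativeHandle.Decode(...)
        return await Task.FromResult(42);
    }

    // Mirrors the workaround from the issue: the synchronous work is
    // offloaded to the thread pool, so control returns to the caller
    // immediately at the await.
    static async Task<int> InferInternalOffloaded()
    {
        return await Task.Run(() =>
        {
            Thread.Sleep(500);        // stands in for NativeHandle.Decode(...)
            return 42;
        });
    }

    static async Task Main()
    {
        var sw = Stopwatch.StartNew();
        Task<int> blocking = InferInternalBlocking();
        // Elapsed time here is ~500 ms: the caller was blocked.
        Console.WriteLine($"blocking version yielded after {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        Task<int> offloaded = InferInternalOffloaded();
        // Elapsed time here is near 0 ms: the caller stayed free.
        Console.WriteLine($"offloaded version yielded after {sw.ElapsedMilliseconds} ms");

        await Task.WhenAll(blocking, offloaded);
    }
}
```

In a console app the difference only shows up as elapsed time, but in WPF the same blocking call before the first await freezes the dispatcher. A native `DecodeAsync` would be preferable to `Task.Run` inside the library itself, since it lets the library control scheduling rather than forcing a thread-pool hop per call.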