Closed: martindevans closed this 2 weeks ago
To be honest, I don't really get the design: I believe `InferenceParams` should just be a part of `LLamaSharpPromptExecutionSettings`, so the user can provide all the necessary information in one place. I think it's better to remove the duplicated properties (`Temperature`, etc.) from `LLamaSharpPromptExecutionSettings` and just add `InferenceParams` there directly. It feels really weird that those things are recreated per request for no apparent reason. Also, not all of the properties get translated across, so `LLamaSharpPromptExecutionSettings` ends up forcing the user to work with a limited subset of settings.
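As a rough illustration of the proposal (the exact shape here is an assumption, not existing code), the settings class could carry the native parameters object directly:

```csharp
// Hypothetical shape: instead of duplicating Temperature, TopP, etc. as
// separate scalar properties, expose the LLamaSharp InferenceParams object
// itself and pass it through to the executor unchanged.
public class LLamaSharpPromptExecutionSettings : PromptExecutionSettings
{
    /// <summary>The full set of LLamaSharp inference parameters.</summary>
    public InferenceParams? InferenceParams { get; set; }
}
```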
This would also make overall usage more uniform: things you use in a non-Semantic-Kernel environment could be carried over in a straightforward way, and vice versa.
E.g. non-Semantic-Kernel:

```csharp
var ex = new Executor(...);
var inferenceParams = ...;
ex.Infer(inferenceParams);
```

Semantic Kernel:

```csharp
var ex = new Executor(...);
var inferenceParams = ...;
var promptExecutionSettings = new LLamaSharpPromptExecutionSettings(..., inferenceParams);
var completion = new ChatCompletion(ex, settings);
completion.GetChatMessageContentAsync(promptExecutionSettings);
```
What do you think?
I didn't write the original Semantic Kernel integration so I may be wrong here, but from looking at it I think `LLamaSharpPromptExecutionSettings` was written the way it was to keep serialization simple.
What you suggest sounds like a nicer API to use, though, and you could certainly adapt the converter (https://github.com/SciSharp/LLamaSharp/blob/master/LLama.SemanticKernel/LLamaSharpPromptExecutionSettingsConverter.cs) to serialize the more complex object.
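A minimal sketch of how the converter could be adapted, assuming the hypothetical nested `InferenceParams` property from the proposal above (the property name and structure are illustrative, not the actual converter code):

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

// Hedged sketch: delegate the nested InferenceParams object to the default
// serializer instead of copying scalar properties one by one.
public sealed class LLamaSharpPromptExecutionSettingsConverter
    : JsonConverter<LLamaSharpPromptExecutionSettings>
{
    public override LLamaSharpPromptExecutionSettings? Read(
        ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        var settings = new LLamaSharpPromptExecutionSettings();
        while (reader.Read() && reader.TokenType != JsonTokenType.EndObject)
        {
            if (reader.TokenType != JsonTokenType.PropertyName)
                continue;
            var name = reader.GetString();
            reader.Read();
            // "inference_params" is an assumed wire name for the nested object.
            if (string.Equals(name, "inference_params", StringComparison.OrdinalIgnoreCase))
                settings.InferenceParams = JsonSerializer.Deserialize<InferenceParams>(ref reader, options);
            else
                reader.Skip(); // ignore properties this sketch doesn't handle
        }
        return settings;
    }

    public override void Write(
        Utf8JsonWriter writer, LLamaSharpPromptExecutionSettings value, JsonSerializerOptions options)
    {
        writer.WriteStartObject();
        if (value.InferenceParams is not null)
        {
            writer.WritePropertyName("inference_params");
            JsonSerializer.Serialize(writer, value.InferenceParams, options);
        }
        writer.WriteEndObject();
    }
}
```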
Nobody is really "in charge" of developing the Semantic Kernel stuff at the moment, so PRs in this area are very welcome!
Make `SamplingPipeline` default to a `DefaultSamplingPipeline`.

Should resolve #971
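One way the default could be wired up is a lazily-initialized property. This is a sketch only: `DefaultSamplingPipeline` is a real LLamaSharp type, but the surrounding property and backing-field shape here are assumptions:

```csharp
// Hypothetical sketch: fall back to DefaultSamplingPipeline when the caller
// never supplied a pipeline, so inference always has a usable sampler.
private ISamplingPipeline? _samplingPipeline;

public ISamplingPipeline SamplingPipeline
{
    get => _samplingPipeline ??= new DefaultSamplingPipeline();
    set => _samplingPipeline = value;
}
```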