SciSharp / LLamaSharp

A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.
https://scisharp.github.io/LLamaSharp

Non-Null Default `SamplingPipeline` #973

Closed: martindevans closed this 2 weeks ago

martindevans commented 3 weeks ago

Should resolve #971
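The gist of the change: give `InferenceParams` a non-null default pipeline, so callers no longer hit a null `SamplingPipeline` out of the box. A rough sketch of the idea (not the exact diff):

using LLama.Sampling;

public class InferenceParams
{
    // Previously null by default, forcing every caller to configure sampling.
    // Now initialised to a concrete pipeline so inference works immediately.
    public ISamplingPipeline SamplingPipeline { get; set; } = new DefaultSamplingPipeline();
}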

En3Tho commented 2 weeks ago

To be honest, I don't really get the design: I believe `InferenceParams` should just be part of `LLamaSharpPromptExecutionSettings`, so the user can provide all the necessary information in one place.

I believe it's better to remove the duplicated properties (Temperature, etc.) from `LLamaSharpPromptExecutionSettings` and add an `InferenceParams` property there directly. It feels really weird that those values are recreated per request for no apparent reason. Also, not all of the properties get translated across, so `LLamaSharpPromptExecutionSettings` ends up forcing the user into a limited subset of settings.

This would also make overall usage more uniform: settings you use outside Semantic Kernel could be carried over in a straightforward way, and vice versa.

E.g. non-Semantic-Kernel:

var ex = new InteractiveExecutor(context);
var inferenceParams = new InferenceParams { /* ... */ };
await foreach (var token in ex.InferAsync(prompt, inferenceParams)) { /* ... */ }

Semantic Kernel:

var ex = new InteractiveExecutor(context);
var inferenceParams = new InferenceParams { /* ... */ };
var promptExecutionSettings = new LLamaSharpPromptExecutionSettings { InferenceParams = inferenceParams };
var completion = new LLamaSharpChatCompletion(ex, promptExecutionSettings);
await completion.GetChatMessageContentAsync(chatHistory, promptExecutionSettings);
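Concretely, the settings class could then be little more than a wrapper. A hypothetical shape (not the current API):

using LLama.Common;
using Microsoft.SemanticKernel;

// Hypothetical: expose InferenceParams directly instead of re-declaring
// Temperature, TopP, etc. as separate properties copied per request.
public class LLamaSharpPromptExecutionSettings : PromptExecutionSettings
{
    public InferenceParams InferenceParams { get; set; } = new();
}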

What do you think?

martindevans commented 2 weeks ago

I didn't write the original Semantic Kernel integration, so I may be wrong here, but from looking at it I think `LLamaSharpPromptExecutionSettings` was written the way it was to keep serialization simple.

What you suggest sounds like a nicer API to use, though, and you could certainly adapt the converter (https://github.com/SciSharp/LLamaSharp/blob/master/LLama.SemanticKernel/LLamaSharpPromptExecutionSettingsConverter.cs) to serialize the more complex object.
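A rough sketch of how the converter could handle that with System.Text.Json, assuming the hypothetical nested `InferenceParams` property and an "inference_params" JSON name:

using System;
using System.Text.Json;
using System.Text.Json.Serialization;
using LLama.Common;

public class LLamaSharpPromptExecutionSettingsConverter : JsonConverter<LLamaSharpPromptExecutionSettings>
{
    public override LLamaSharpPromptExecutionSettings Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        var settings = new LLamaSharpPromptExecutionSettings();
        // Reader starts at StartObject; walk properties until EndObject.
        while (reader.Read() && reader.TokenType != JsonTokenType.EndObject)
        {
            if (reader.TokenType != JsonTokenType.PropertyName) continue;
            var name = reader.GetString();
            reader.Read();
            if (string.Equals(name, "inference_params", StringComparison.OrdinalIgnoreCase))
            {
                // Delegate the nested object to the default serializer,
                // so InferenceParams round-trips without flattening.
                settings.InferenceParams = JsonSerializer.Deserialize<InferenceParams>(ref reader, options)!;
            }
            else
            {
                reader.Skip();
            }
        }
        return settings;
    }

    public override void Write(Utf8JsonWriter writer, LLamaSharpPromptExecutionSettings value, JsonSerializerOptions options)
    {
        writer.WriteStartObject();
        writer.WritePropertyName("inference_params");
        JsonSerializer.Serialize(writer, value.InferenceParams, options);
        writer.WriteEndObject();
    }
}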

Nobody is really "in charge" of developing the Semantic Kernel integration at the moment, so PRs in this area are very welcome!