alpaca-core / ac-local

Alpaca Core local inference SDK
MIT License

llama: add multiple prompt processing #17

Open. iboB opened this issue 3 months ago.

iboB commented 3 months ago

This was intentionally skipped while implementing #3.

Multiple prompt processing is enabled by setting llama_context_params::n_seq_max to a value greater than one; each prompt can then be decoded under its own sequence id, with the KV cache keeping the sequences' states separate.
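
A minimal sketch of what this looks like against the upstream llama.cpp C API, assuming an already loaded model and pre-tokenized prompts (the SDK wiring around it is omitted):

```cpp
#include <llama.h>
#include <vector>

// Sketch only, not the SDK's actual code: decode several prompts in a single
// llama_decode call by giving each prompt its own sequence id.
// Assumes `model` is an already loaded llama_model* and `prompts` holds
// pre-tokenized prompts.
void decode_prompts_in_parallel(llama_model* model,
                                const std::vector<std::vector<llama_token>>& prompts) {
    llama_context_params cparams = llama_context_default_params();
    cparams.n_seq_max = (uint32_t)prompts.size(); // > 1 enables parallel sequences
    // note: n_ctx is shared across sequences; size it accordingly (omitted here)

    llama_context* ctx = llama_new_context_with_model(model, cparams);

    int total = 0;
    for (const auto& p : prompts) total += (int)p.size();

    llama_batch batch = llama_batch_init(total, /*embd=*/0,
                                         /*n_seq_max=*/(int32_t)prompts.size());
    for (llama_seq_id seq = 0; seq < (llama_seq_id)prompts.size(); ++seq) {
        const auto& toks = prompts[seq];
        for (size_t pos = 0; pos < toks.size(); ++pos) {
            const int i = batch.n_tokens++;
            batch.token[i]     = toks[pos];
            batch.pos[i]       = (llama_pos)pos; // positions count per sequence
            batch.n_seq_id[i]  = 1;
            batch.seq_id[i][0] = seq;            // separate KV-cache state per prompt
            batch.logits[i]    = pos + 1 == toks.size(); // logits only at prompt end
        }
    }

    llama_decode(ctx, batch);

    llama_batch_free(batch);
    llama_free(ctx);
}
```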

Additionally, we should devise a good way to expose this from both the module API and the SDK API.
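
Purely as an illustration of the kind of surface this could take (the names below are hypothetical and do not exist in ac-local; deciding the real shape is the open question of this issue):

```cpp
#include <cstdint>

// Hypothetical: an instance-creation parameter that a llama module could map
// onto llama_context_params::n_seq_max when it creates the context.
struct LlamaInstanceParams {
    uint32_t numParallelPrompts = 1; // hypothetical name, defaulting to today's behavior
};
```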