SciSharp / LLamaSharp

A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.
https://scisharp.github.io/LLamaSharp
MIT License

[BUG]: Offset and length were out of bounds #766

Open m0nsky opened 3 months ago

m0nsky commented 3 months ago

Description

I'm building a LLaVA application. When the number of tokens in my initial prompt is larger than the batch size, the InteractiveExecutor throws:

System.ArgumentException: Offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of the source collection.
   at System.Collections.Generic.List`1.GetRange(Int32 index, Int32 count)
   at LLama.InteractiveExecutor.InferInternal(IInferenceParams inferenceParams, InferStateArgs args) in C:\RiderProjects\llava_defender\LLama\LLamaInteractExecutor.cs:line 257
   at LLama.StatefulExecutorBase.InferAsync(String text, IInferenceParams inferenceParams, CancellationToken cancellationToken)+MoveNext() in C:\RiderProjects\llava_defender\LLama\LLamaExecutorBase.cs:line 325

When setting a breakpoint at LLamaInteractExecutor line 257, we can observe the following:

[Screenshot: relevant breakpoint]

My initial prompt is 1067 tokens (I tokenized it and counted), and the embedded image is at position 1055 (near the end of my prompt), but _embeds only contains 512 tokens (the batch size).
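The exception itself comes from `List<T>.GetRange`, which throws an `ArgumentException` whenever the requested range extends past the end of the list. A minimal standalone sketch of the failure mode (the 512 and 1055 values mirror the report; the variable names are illustrative, not LLamaSharp's actual code):

```csharp
using System;
using System.Collections.Generic;

class GetRangeRepro
{
    static void Main()
    {
        // Analogue of _embeds: only batch-size (512) tokens made it into the list.
        var embeds = new List<int>(new int[512]);

        try
        {
            // The executor effectively requests a range starting near the image
            // position (1055), which is far past the end of the 512-item list,
            // so Count - index < count and GetRange throws ArgumentException.
            embeds.GetRange(1055, 12);
        }
        catch (ArgumentException ex)
        {
            // "Offset and length were out of bounds for the array or count is
            //  greater than the number of elements from index to the end..."
            Console.WriteLine(ex.Message);
        }
    }
}
```

This is consistent with the stack trace above: the executor only filled _embeds up to the batch size, so any index derived from the full 1067-token prompt is out of bounds.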

Reproduction Steps

Environment & Configuration

Known Workarounds

martindevans commented 2 months ago

Since #761 the BatchedExecutor automatically splits work into multiple batches (so a prompt of any size can be handled; you just need to call Infer() enough times to process the entire queue of work), and since #770 BatchedExecutor has had LLava support.
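In rough outline, the suggested pattern looks like the following. This is a hedged sketch based on the BatchedExecutor examples in the repository, not verified against this exact version; member names such as `Create`, `Prompt`, and `Infer` are taken from those examples, and the loop condition is an assumption about how to drain the queued work:

```csharp
using LLama;
using LLama.Batched;
using LLama.Common;

// Assumed setup: modelPath points at a local GGUF model.
var parameters = new ModelParams("model.gguf");
using var model = LLamaWeights.LoadFromFile(parameters);

// BatchedExecutor splits queued prompts into batch-sized chunks internally.
using var executor = new BatchedExecutor(model, parameters);

// Start a conversation and queue the full prompt, even if it is
// longer than the configured batch size.
using var conversation = executor.Create();
conversation.Prompt(executor.Context.Tokenize(longPrompt));

// Call Infer() repeatedly until the executor has processed every
// queued batch; a single call only advances one batch of work.
while (executor.BatchedTokenCount > 0)
    await executor.Infer();
```

The key difference from InteractiveExecutor is that each `Infer()` call processes at most one batch, so oversized prompts are handled by looping rather than by a single oversized `GetRange`.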