MaggotHATE / Llama_chat

A chat UI for Llama.cpp
MIT License

How do you manage the error in evaluate_main()? #2

Closed · Zapotecatl closed this 4 months ago

Zapotecatl commented 4 months ago

Hi @MaggotHATE ,

Sorry, this is not really an issue, just a question. In the main example there is a part where the chat ends if an error occurs:

LOG("eval: %s\n", LOG_TOKENS_TOSTR_PRETTY(ctx, embd).c_str());

if (llama_decode(ctx, llama_batch_get_one(&embd[i], n_eval, n_past, 0))) {
     LOG_TEE("%s : failed to eval\n", __func__);
     return 1;
}

I saw that you use the functions evaluate_main() and checkEmb(), but I don't understand how you manage the error.

https://github.com/MaggotHATE/Llama_chat/blob/55e8c9db7dbb879a1b6e04427afff8dc61133337/chat_plain.h#L1299

I mean, I don't understand in which cases the error occurs, how to avoid it, or how to manage it (my knowledge of llama.cpp is limited). In my app (a chatbot) I am currently using a workaround: the chat runs inside an outer while(true) loop, and if the error occurs, the chat is launched again. This gives the impression that the chat continues, so the user does not perceive that the program ended abruptly.
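
In outline, my workaround looks like this (a simplified sketch; run_chat() is a hypothetical stand-in for my actual chat loop):

```cpp
#include <cstdio>

// Hypothetical stand-in for the actual chat loop: returns true on a
// normal user-requested exit, false when an eval failed inside.
static bool run_chat() {
    /* ... prompt / eval / print loop ... */
    return true;
}

int main() {
    while (true) {
        if (run_chat()) {
            break; // normal exit: the user asked to quit
        }
        // eval failed somewhere inside: relaunch the chat so the user
        // sees a seamless continuation instead of an abrupt exit
        fprintf(stderr, "chat ended unexpectedly, relaunching\n");
    }
    return 0;
}
```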

So, please, could you explain to me how you manage the error?

Thanks for the help!

MaggotHATE commented 4 months ago

Hi! As far as I understand it, this corner case happens when an atypical element is present. llama_decode traces back to llama_decode_internal, but I haven't studied it that deeply. All I know for sure is that this particular case has never happened to me so far - at least with the compatible models I tried and the prompts I used.

Ideally, for continuity you would want to skip such an element and proceed, and/or maybe heal the invalid element if you assume it's a symbol. Again, that's just my guess. Hope it helps!
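
As a guess at what "skip and proceed" could look like (untested; same loop shape and the same older llama_batch_get_one(tokens, n_tokens, pos_0, seq_id) signature as the snippet you quoted; embd, n_eval, n_past, params and ctx come from the surrounding main-example code):

```cpp
for (int i = 0; i < (int) embd.size(); i += n_eval) {
    n_eval = std::min((int) embd.size() - i, params.n_batch);
    if (llama_decode(ctx, llama_batch_get_one(&embd[i], n_eval, n_past, 0))) {
        LOG_TEE("%s : failed to eval at %d, skipping chunk\n", __func__, i);
        continue; // drop the offending chunk instead of returning 1
    }
    n_past += n_eval;
}
```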

UPD. Please keep in mind that the main example is quite old in itself, even with all the updates to it. There may always be parts that are no longer necessary or that need a more thorough approach.

Zapotecatl commented 4 months ago

Thank you very much for your quick response and for taking the time. The error happens to me in Llama2 7B when I intentionally introduce very large and random prompts in my tests.

So, just to make sure I understood correctly: "for continuity you would want to skip such an element and proceed".

Is that what your project does when the error occurs?

Again, thank you very much!

MaggotHATE commented 4 months ago

"The error happens to me in Llama2 7B when I intentionally introduce very large and random prompts in my tests."

Very interesting - this has never happened to me with that model; however, it's been a long while since I tried pure Llama2 7b. What size prompts do you use? Are they within Llama2 7b's maximum context?

Currently, if the error occurs, the existing context fails evaluation in checkEmb(). However, the program doesn't stop there, as checkEmb() itself never leads to termination.
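
In shape, it's roughly the following (a simplified sketch, not the exact code from chat_plain.h; eval_chunk is a hypothetical name):

```cpp
#include <cstdio>
#include <vector>
#include "llama.h"

// Evaluation reports failure through its return value; the caller
// decides whether to retry, skip, or keep going - nothing terminates.
static bool eval_chunk(llama_context * ctx, std::vector<llama_token> & embd,
                       int i, int n_eval, int & n_past) {
    if (llama_decode(ctx, llama_batch_get_one(&embd[i], n_eval, n_past, 0))) {
        fprintf(stderr, "eval_chunk: failed to eval\n");
        return false; // report the failure, but do not exit the program
    }
    n_past += n_eval;
    return true;
}
```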

Again, I've never encountered this error, so I didn't change this part of main. Keep in mind that my program is fully local, so it's expected that a user will simply relaunch it (or even better, restart the model).

Zapotecatl commented 4 months ago

"What size prompts do you use? Are they within Llama2 7b maximum context?"

It's a good point. For "extreme" tests, I have included paragraphs from Wikipedia as input prompts, and I print this:

printf("\nn_eval: %d, n_past: %d, i: %d, embd_size: %d, bat_n_tokens: %d\n", n_eval, n_past, i, (int)embd.size(), batch_all.n_tokens );

and I did not register any pattern in the values that would indicate when the error occurs. I will also check the size of the input prompt in embd_inp.
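
For that check, something along these lines should work (llama_n_ctx() is the standard llama.cpp call for the context size):

```cpp
// Compare the input size against the context window of the loaded model
printf("n_ctx: %d, embd_inp_size: %d\n",
       (int) llama_n_ctx(ctx), (int) embd_inp.size());
```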

Thank you very much for your help! Good luck with your projects!

MaggotHATE commented 4 months ago

@Zapotecatl Hi again, sorry for the ping, but just for clarity: I have finally seen this particular error myself. Indeed, it happens when the size of the input is greater than the model's context size. In my case, it was a large set of data I wanted to analyze with Hermes-2-Theta-Llama-3-8B. However, that is only an 8k-context model, and the error started happening once the processed context went over that limit.

As such, it might be a model-specific problem, since going over the context size is managed automatically in main.
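
If you want to guard against it on your side anyway, one option (just a sketch, not what main or my project does) is to clamp the input to the context window before evaluating:

```cpp
// Sketch only: truncate input that exceeds the model's context window.
// Keeping the tail of the prompt is one possible policy; main handles
// overflow differently (context shifting during generation).
const int n_ctx     = (int) llama_n_ctx(ctx);
const int keep_room = 4; // leave space for control/generated tokens
if ((int) embd_inp.size() > n_ctx - keep_room) {
    embd_inp.erase(embd_inp.begin(), embd_inp.end() - (n_ctx - keep_room));
}
```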

Zapotecatl commented 4 months ago

Thanks a lot, great!