Vali-98 / ChatterUI

Simple frontend for LLMs built in react-native.
GNU Affero General Public License v3.0

0.8.0a generation crashes the app #97

Closed: GameOverFlowChart closed this issue 3 weeks ago

GameOverFlowChart commented 1 month ago

I will need more time to run different tests to pin down exactly what makes the app crash, but it happens during generation (the first response). I had a character card with long text that crashed before responding with anything, so at first I thought the crash happens before the model outputs anything, or as soon as it tries to output. But that's wrong: I tested with a character card that's empty and got one word before the crash. The last version that worked for me was something like v0.7.10, I think; I didn't have time to test the new betas (the version without RWKV support). The model, the instruct settings, and everything else are unchanged and did work before. The phone also seems to get slow before the crash (that didn't happen with older versions).

Edit: By the way, I like the UI changes, so it's really sad that this version doesn't work for me.

Vali-98 commented 1 month ago

Interesting, I'll give this a check down the line. I'm taking a short hiatus after 0.8.0. Sorry about the issues!

jupiterbjy commented 1 month ago

Immediate crash on every model I own, and I can't see anything meaningful in the log.

Considering your break, would it be better to roll back to 0.7.x? I seriously love this new UI, gonna miss it!


EDIT: nvm, seems like deleting the entire app data fixed it (migration from beta3 to .a), so migration was the culprit I suppose.

Vali-98 commented 4 weeks ago

> EDIT: nvm, seems like deleting the entire app data fixed it (migration from beta3 to .a), so migration was the culprit I suppose.

Interesting. I can't check right now, but I presume there is some invalid field that didn't migrate properly.

@GameOverFlowChart, try giving the instruct and sampler settings a little shake; perhaps one of the new fields like the xtc settings or last_output_prefix breaks it.
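
A rough sketch of the kind of defaulting I mean (field names here are hypothetical, not ChatterUI's actual schema): merge the saved preset over a full set of defaults, so any field a pre-0.8.0 config never wrote lands on a safe value instead of coming through as undefined.

```ts
// Illustrative sketch only: defensively default sampler fields during
// migration. Field names are hypothetical, not ChatterUI's real schema.
type SamplerConfig = {
    temperature: number
    tfs_z: number
    xtc_threshold: number
    xtc_probability: number
    last_output_prefix: string
}

const SAMPLER_DEFAULTS: SamplerConfig = {
    temperature: 1.0,
    tfs_z: 1.0, // 1.0 = tail-free sampling disabled
    xtc_threshold: 0.1,
    xtc_probability: 0.0, // 0.0 = xtc disabled
    last_output_prefix: '',
}

// Merge a possibly stale saved preset over the defaults; keys the old
// version never wrote keep their defaults instead of being undefined.
function migrateSampler(saved: Partial<SamplerConfig>): SamplerConfig {
    return { ...SAMPLER_DEFAULTS, ...saved }
}
```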

GameOverFlowChart commented 4 weeks ago

> > EDIT: nvm, seems like deleting the entire app data fixed it (migration from beta3 to .a), so migration was the culprit I suppose.
>
> Interesting. I can't check right now, but I presume there is some invalid field that didn't migrate properly.
>
> @GameOverFlowChart, try giving the instruct and sampler settings a little shake; perhaps one of the new fields like the xtc settings or last_output_prefix breaks it.

I can try a bit, but I was surprised that xtc was mentioned as a new local feature. Wasn't that already a thing before 0.8? I was already using it, and unless it was a placebo, I had good results with it.

Vali-98 commented 3 weeks ago

> but I was surprised that xtc was mentioned as a new local feature. Wasn't that already a thing before 0.8?

Technically it was in one of the beta versions, yes, but the implementation was my own, and it was apparently inaccurate to how xtc is supposed to work, so I suppose the llama.cpp implementation is 'new' in a sense.
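
For reference, a rough sketch of the xtc behaviour as I understand the llama.cpp version (an illustration, not the actual code from either project): with some probability per sampling step, every token at or above a probability threshold except the least likely of them gets removed, so it cuts off the top choices rather than the tail.

```ts
// Rough sketch of xtc ("exclude top choices") as I understand the
// llama.cpp behaviour; not the actual code from llama.cpp or ChatterUI.
type TokenProb = { id: number; p: number }

function xtcFilter(
    tokens: TokenProb[], // assumed sorted by probability, descending
    threshold: number, // e.g. 0.1
    probability: number, // chance that xtc fires at all, e.g. 0.5
    rng: () => number = Math.random
): TokenProb[] {
    // xtc only triggers on a fraction of sampling steps.
    if (rng() >= probability) return tokens
    // Count the tokens at or above the threshold.
    const above = tokens.filter(t => t.p >= threshold).length
    // If at least two qualify, drop all but the least likely of them,
    // steering the model away from its most predictable continuations.
    if (above < 2) return tokens
    return tokens.slice(above - 1)
}
```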

Any reports about the current crash?

GameOverFlowChart commented 3 weeks ago

> > but I was surprised that xtc was mentioned as a new local feature. Wasn't that already a thing before 0.8?
>
> Technically it was in one of the beta versions, yes, but the implementation was my own, and it was apparently inaccurate to how xtc is supposed to work, so I suppose the llama.cpp implementation is 'new' in a sense.
>
> Any reports about the current crash?

I changed the sampler to the default preset. No more crashes, but the output is really weird. I still need to test step by step as I go back to my own sampler settings.

GameOverFlowChart commented 3 weeks ago

I found the problem: using tail-free sampling makes the app crash after the first token or so. Setting tail free to 1 makes it work again.

@jupiterbjy can you confirm?

jupiterbjy commented 3 weeks ago

@GameOverFlowChart yup, if I use anything other than 1.0 it crashes after a few tokens.

My case was a bit different in that it crashed immediately without any output, but it must be a similar thing: some kind of settings migration issue, which deleting the app data fixes.

Vali-98 commented 3 weeks ago

That's an interesting bug; even llama.rn has removed tfs_z on Android, for whatever reason:

https://github.com/mybigday/llama.rn/commit/a61a8575d86e91a044d5a788ba913a3781c65754

In the meantime, I'll just remove tfs_z from the local sampler options. I don't think anyone uses it anyway.
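
For the curious, a rough sketch of what tail-free sampling does, based on my reading of the algorithm rather than the actual llama.cpp or llama.rn code; it also shows why 1.0 is effectively the 'off' value:

```ts
// Rough sketch of tail-free sampling (tfs_z); my reading of the
// algorithm, not the actual llama.cpp / llama.rn implementation.
function tailFree(probs: number[], z: number): number[] {
    // probs is assumed sorted in descending order and summing to 1.
    // With z >= 1.0 the cumulative sum below can never exceed z,
    // so every token is kept and TFS is effectively disabled.
    if (z >= 1.0 || probs.length <= 2) return probs
    // Absolute second derivative of the sorted probability curve.
    const first = probs.slice(1).map((p, i) => p - probs[i])
    const second = first.slice(1).map((d, i) => Math.abs(d - first[i]))
    const total = second.reduce((a, b) => a + b, 0)
    if (total === 0) return probs
    // Keep tokens until the normalized curvature mass passes z; what
    // remains after that point is the flat "tail" that gets cut off.
    let cum = 0
    let keep = probs.length
    for (let i = 0; i < second.length; i++) {
        cum += second[i] / total
        if (cum > z) {
            keep = i + 1
            break
        }
    }
    return probs.slice(0, Math.max(keep, 1))
}
```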

GameOverFlowChart commented 3 weeks ago

> In the meantime, I'll just remove tfs_z from the local sampler options. I don't think anyone uses it anyway.

Except me 😂 But oh well, I'll try to find something new that works for me. I've had good experience mixing several different sampler options rather than sticking to one or two. It did work before, so the bug must be in llama.cpp, right?

Vali-98 commented 3 weeks ago

> It did work before, so the bug must be in llama.cpp, right?

Apparently it's deprecated on latest: https://github.com/ggerganov/llama.cpp/pull/10071

This is somewhat agreeable, as tfs_z's effect on sampling is difficult to discern, if it did anything at all.

As such, ChatterUI will be permanently removing tfs_z as well, alongside cui-llama.rn, when I get around to it.

GameOverFlowChart commented 3 weeks ago

@Vali-98 Yeah, if it's not in llama.cpp, I don't expect it from ChatterUI. I don't want to open a new issue for this, but the outputs have been really weird since the update to the new version. I tried to compensate for the lack of tail-free sampling, but I don't even know if that's the cause. Maybe your old and apparently wrong implementation of xtc had a positive impact? Using that xtc definitely improved my outputs a lot, with much more human-like conversations, and now it's gone and my outputs are bad. Either there is another bug, or I need to find new optimal sampler settings, or you accidentally invented a great sampling method.

Edit: Never mind, I found the problem, see #107