Closed fuhrriel closed 2 months ago
I cannot repro this issue. It works perfectly fine for me.
Can you send me a .json save with this issue?
Sure: saved_story.json
I've reset settings to default and changed only DRY mult. to 0.8 and added "—" to sequence breaks. Crash happens any time I try to generate more/retry.
I get hit with "[WinError -529697949] Windows Error 0xe06d7363" if i add the emdash in the kobold lite ui while using cublas Or "[WinError -1073741569] Windows Error 0xc00000ff" when using openblas. As soon as it's removed it functions again. Microsoft has no information relating to the cublas crash but the open blash crash is an ntstatus thing "STATUS_BAD_FUNCTION_TABLE" In debug it doesn't get past processing? "Processing 5 dry break strings...[WinError -1073741569] Windows Error 0xc00000ff" To be sure it wasn't the ui, it also runs into the same error in ST
It seems like DRY does not handle unicode strings correctly, it's calculating a substring of a unicode sequence (e.g. em dash is 226,128,148
, it finds a partial match and tries to tokenize the substring of the sequence 128,148
alone during the GetOverlappingTokenSequences
which is not a valid unicode character and causes a segfault.
I'm going to set it to skip overlap matches if the current sequence breaker is only extended unicode characters. Meanwhile @pi6am might want to take a look at this.
The hotfix is in https://github.com/LostRuins/koboldcpp/commit/c78690737c5ed92b59e06b81eb663d91ac55eb43
Should be fixed.
Describe the Issue Adding em dash "—" to the list of seq breakers for DRY causes a crash on first generation attempt. Issue occurs for me 100% of the time, regardless of model or preset used.
Additional Information:
"dry_multiplier": 0.8, "dry_base": 1.75, "dry_allowed_length": 2, "dry_penalty_last_n": 320, "dry_sequence_breakers": ["\n", ":", "\"", "*", "\u2014"]