mdegans / weave

Branching story writing tool with generative AI

Greatly improved LLaMA sampling defaults. #2

Closed mdegans closed 2 months ago

mdegans commented 2 months ago

By default, drama_llama was using greedy sampling with no repetition penalty. This was a bug in the `Default` implementations for various settings structs. The default is now locally typical sampling with a minor repetition penalty, which should greatly improve generation quality.
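A minimal sketch of what such a `Default` fix might look like. The struct, field, and value choices below are assumptions for illustration, not drama_llama's actual API:

```rust
// Hypothetical settings structs; names and values are assumptions,
// not drama_llama's real types.

#[derive(Debug, PartialEq)]
enum SamplingMode {
    /// Always pick the highest-probability token (the old, buggy default).
    Greedy,
    /// Locally typical sampling with typicality threshold `p`.
    LocallyTypical { p: f32 },
}

#[derive(Debug)]
struct SampleOptions {
    mode: SamplingMode,
    /// Penalty > 1.0 discourages repeating recent tokens.
    repetition_penalty: f32,
}

impl Default for SampleOptions {
    fn default() -> Self {
        // Old (buggy) behavior was equivalent to:
        //   mode: SamplingMode::Greedy, repetition_penalty: 1.0
        // New default: locally typical sampling plus a minor penalty.
        Self {
            mode: SamplingMode::LocallyTypical { p: 0.95 },
            repetition_penalty: 1.1,
        }
    }
}

fn main() {
    let opts = SampleOptions::default();
    assert_eq!(opts.mode, SamplingMode::LocallyTypical { p: 0.95 });
    assert!(opts.repetition_penalty > 1.0);
    println!("default sampling mode: {:?}", opts.mode);
}
```

The point of the fix is that `SampleOptions::default()` no longer silently degenerates to greedy decoding; callers who never set options explicitly now get the improved behavior.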

Additionally, llama.cpp has been updated. Because the tokenizer code has changed, any existing models will need to be updated as well. A warning will be printed in the terminal if this is the case.