If it's a GGUF model, it needs to be correctly configured by the model creator. KoboldCpp sets it based on the n_ctx_train value.
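For illustration, this is the general shape of that kind of automatic adjustment. It is the standard NTK-aware scaling formula, not necessarily KoboldCpp's exact code, and the head_dim default is an assumption:

```python
def auto_rope_base(n_ctx: int, n_ctx_train: int,
                   base_train: float = 10000.0, head_dim: int = 128) -> float:
    """NTK-aware style heuristic: when the requested context exceeds the
    training context, raise the RoPE base by ratio**(d/(d-2))."""
    ratio = max(1.0, n_ctx / n_ctx_train)
    return base_train * ratio ** (head_dim / (head_dim - 2))

# e.g. running a 4k-trained model at 16k context:
print(auto_rope_base(16384, 4096))  # ~40890, up from the 10000 default
```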
Going by what I am seeing in the terminal, the model places the 1,000,000 under "freq_base_train".
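If you want to confirm what is actually baked into the file rather than reading the terminal log, the gguf Python package from llama.cpp can dump the metadata. The field-access details below follow gguf-py's reader at the time of writing, so treat this as a sketch:

```python
from gguf import GGUFReader  # pip install gguf

reader = GGUFReader("airoboros-l2-34b.Q4_K_M.gguf")  # filename is illustrative
for name in ("llama.context_length", "llama.rope.freq_base"):
    field = reader.fields.get(name)
    if field is not None:
        # for scalar fields, data holds the index of the value part
        print(name, field.parts[field.data[0]][0])
```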
Hm. Maybe TheBloke isn't building it right? That said, you would think Ooba and other clients would be getting complaints about the model not working.
That's the wrong parameter. The value that matters for rope scaling is n_ctx_train, i.e. the training context. On this model it is 16k, so it will use a rope config of 1.0 10000 (scale and base) for a 16k context size, i.e. a 1:1 ratio.
You can customize it with --ropeconfig, of course.
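For example, to run with the 1,000,000 base at 1:1 scale (the filename is illustrative; the two values are scale and base):

```
python koboldcpp.py --model airoboros-l2-34b.Q4_K_M.gguf --ropeconfig 1.0 1000000
```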
Can you find another model that uses freq_base_train? If it's a commonly set parameter, I can include it in my calculations: I would use n_ctx_train unless freq_base_train is specified, in which case that value overrides it.
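A minimal sketch of that precedence (key names here mirror the terminal labels and are assumptions, not exact GGUF keys):

```python
def pick_rope_base(metadata: dict, default_base: float = 10000.0) -> float:
    """An explicit freq_base_train from the model file overrides everything;
    otherwise fall back to the default and derive scaling from n_ctx_train."""
    if "freq_base_train" in metadata:
        return float(metadata["freq_base_train"])
    return default_base

print(pick_rope_base({"freq_base_train": 1000000.0}))  # -> 1000000.0
print(pick_rope_base({"n_ctx_train": 16384}))          # -> 10000.0
```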
Here is Airoboros L2-70b, which has 10,000 as freq_base_train. Not sure if that is specific to the family. The value is probably generated from rope_theta, going by what I see in the config.json for the PyTorch versions.
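A quick way to check that on the Hugging Face side (the path is illustrative; 1000000.0 is what CodeLlama-family configs carry):

```python
import json

# read rope_theta straight from the original HF checkpoint's config.json
with open("config.json") as f:
    cfg = json.load(f)

# Llama-family configs default to 10000.0 when rope_theta is absent
print(cfg.get("rope_theta", 10000.0))  # CodeLlama-based models print 1000000.0
```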
(Terminal screenshots: freq_base_train reads 1,000,000 for the 34b and 10,000 for the 70b.)
I found another 34b model, Synthia v1.2. It has freq_base_train 1,000,000 in Kobold's terminal. The output is garbage similar to Airoboros 34b's if the RoPE isn't customized.
EDIT: Also tested CodeLlama 7b. It also produces garbage without tweaking the RoPE. You should be able to use that model for testing, since it is a 7b.
This should now be fixed; please try v1.48.
Airoboros 34b successfully generated. :)
The RoPE base is supposed to be 1,000,000, but the default used in KoboldCpp is 10,000. Airoboros 34b gets repetitive with KoboldCpp unless the proper rope values are used.