tcsenpai / spacellama


Remove restoring token limit on opening settings #3

Open jks-liu opened 1 month ago

jks-liu commented 1 month ago
  1. Resetting the token limit to a default value every time the settings page is opened is very strange behavior.
  2. The default value is not suitable for the num_ctx parameter, because it is the maximum context length supported by the model. For llama3.1, for example, this value is more than 100k, and passing it to num_ctx will degrade performance. (Ollama's own default is only 2048.) See the sketch after this list.
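
To make the distinction concrete, here is a minimal sketch of how num_ctx reaches Ollama. The `/api/generate` endpoint and the `options.num_ctx` field are Ollama's documented API; the model name and the limit value are placeholders, not spacellama's actual code:

```ts
// Minimal sketch: num_ctx is passed per-request via the options object of
// Ollama's /api/generate endpoint. Sending the model's advertised maximum
// (100k+ for llama3.1) forces a huge KV cache; Ollama's default is 2048.
async function summarize(prompt: string, numCtx: number): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3.1", // placeholder model name
      prompt,
      stream: false,
      options: { num_ctx: numCtx }, // e.g. a user-chosen 8192, not the model maximum
    }),
  });
  const data = await res.json();
  return data.response;
}
```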
tcsenpai commented 1 month ago

Do you mean that with your PR we no longer use the tokens table and just rely on the user's settings (or the default)?

Regarding (2), I did not find any drawbacks to this. Can you provide an example of why it would be detrimental to set num_ctx to, say, 100k?

jks-liu commented 1 month ago
  1. Yes, I think the token limit should be decided by users, because users have different hardware (see the sketch below).
  2. On my RTX 4090, llama3.1 becomes slow if num_ctx is more than 10k.

More discussion about num_ctx:

- https://github.com/ollama/ollama/issues/4790
- https://github.com/Mintplex-Labs/anything-llm/issues/1991
- https://www.reddit.com/r/LocalLLaMA/comments/1dxi6cf/today_i_learned_the_context_length_num_ctx/
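
A minimal sketch of the approach proposed here, assuming the extension persists a user-set limit via the WebExtension storage API. The `tokenLimit` key and the 4096 fallback are hypothetical, not spacellama's actual settings schema:

```ts
import browser from "webextension-polyfill";

// Hypothetical sketch: read a user-configured token limit from extension
// storage instead of looking up the model's maximum in a tokens table.
// "tokenLimit" is an assumed key; 4096 is an assumed conservative fallback.
async function getNumCtx(): Promise<number> {
  const stored = await browser.storage.sync.get({ tokenLimit: 4096 });
  return stored.tokenLimit as number;
}
```

The user's hardware then determines the effective limit, rather than the model's advertised maximum.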

tcsenpai commented 1 month ago

Do you think it is still worth making the (2) edit now that I merged the other PR?