tcsenpai / spacellama


Remove restoring token limit on opening settings #3

Open jks-liu opened 1 month ago

jks-liu commented 1 month ago
  1. Resetting the token limit to a default value every time the settings page is opened is very strange behavior.
  2. The default value is not suitable for the num_ctx parameter, because it is the maximum context length supported by the model. For llama3.1, for example, this value is more than 100k, and passing it to num_ctx will degrade performance. (Ollama's own default is only 2048.) See the sketch after this list.
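
To make the distinction concrete, here is a minimal sketch of how num_ctx reaches Ollama. The `/api/generate` endpoint and the `options.num_ctx` field are Ollama's documented API; the model name and the limit value are placeholders, not spacellama's actual code:

```ts
// Minimal sketch: num_ctx is passed per-request via the options object of
// Ollama's /api/generate endpoint. Sending the model's advertised maximum
// (100k+ for llama3.1) forces a huge KV cache; Ollama's default is 2048.
async function summarize(prompt: string, numCtx: number): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3.1", // placeholder model name
      prompt,
      stream: false,
      options: { num_ctx: numCtx }, // e.g. a user-chosen 8192, not the model maximum
    }),
  });
  const data = await res.json();
  return data.response;
}
```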
tcsenpai commented 1 month ago

Do you mean that with your PR we no longer use the tokens table and just rely on the user's settings (or the default)?

Regarding (2), I did not find any drawbacks to this. Can you provide an example of why it would be detrimental to set num_ctx to, say, 100k?

jks-liu commented 1 month ago
  1. Yes, I think the token limit should be decided by users, because users have different hardware (see the sketch below).
  2. On my RTX 4090, llama3.1 becomes slow if num_ctx is more than 10k.

More discussion about num_ctx:

- https://github.com/ollama/ollama/issues/4790
- https://github.com/Mintplex-Labs/anything-llm/issues/1991
- https://www.reddit.com/r/LocalLLaMA/comments/1dxi6cf/today_i_learned_the_context_length_num_ctx/
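
A minimal sketch of the approach proposed here, assuming the extension persists a user-set limit via the WebExtension storage API. The `tokenLimit` key and the 4096 fallback are hypothetical, not spacellama's actual settings schema:

```ts
import browser from "webextension-polyfill";

// Hypothetical sketch: read a user-configured token limit from extension
// storage instead of looking up the model's maximum in a tokens table.
// "tokenLimit" is an assumed key; 4096 is an assumed conservative fallback.
async function getNumCtx(): Promise<number> {
  const stored = await browser.storage.sync.get({ tokenLimit: 4096 });
  return stored.tokenLimit as number;
}
```

The user's hardware then determines the effective limit, rather than the model's advertised maximum.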

tcsenpai commented 1 month ago

Do you think it is still worth making the (2) edit now that I merged the other PR?