myshell-ai / AIlice

AIlice is a fully autonomous, general-purpose AI agent.
MIT License

What happens if temperature is null? #50

Open FellowTraveler opened 3 months ago

FellowTraveler commented 3 months ago

Is it possible to just leave temperature null and go with however the model is already configured? I have models custom-configured in Ollama with specific sampler settings and would like to test with those as they are, but I think the AIlice code is setting my temperature to 0 and also doesn't let me set min_p. I would like the option to leave sampler settings null and go with the defaults for the model as configured in Ollama. I would also like the option to set sampler settings individually for each model. I'm not asking you to work on this, just letting you know for visibility. I might submit a PR myself if I get around to it.

stevenlu137 commented 3 months ago

Indeed, you can set the temperature in config.json to null, but the effect of this depends on the behavior of the Python package used to access the API (typically openai). If you are using the litellm + Ollama approach, it also depends on whether litellm sets a default value for an unspecified temperature.

However, this parameter might not be that critical, as setting it directly to 0 could be the best choice for agents that require high reasoning abilities.

If you have other findings or modifications, please let me know or submit a PR:-)

FellowTraveler commented 2 months ago

I've been using the min_p sampler, which works effectively at higher temperatures (I would still use a low temperature for coding, though, around 0.3 to 0.7). With a min_p of 0.1 or 0.2 it works great. See: top_p/top_k samplers, the min_p sampler, and a min_p sampling comparison.
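
For reference, min_p keeps only the tokens whose probability is at least min_p times the probability of the most likely token, then renormalizes. A minimal sketch of the idea (illustrative only, not AIlice or Ollama code):

```python
import numpy as np

def min_p_filter(probs: np.ndarray, min_p: float) -> np.ndarray:
    """Zero out tokens below min_p * max(probs), then renormalize."""
    threshold = min_p * probs.max()
    kept = np.where(probs >= threshold, probs, 0.0)
    return kept / kept.sum()

# Example: with min_p=0.1, any token at least 10% as likely as the top token survives.
probs = np.array([0.50, 0.30, 0.15, 0.04, 0.01])
print(min_p_filter(probs, 0.1))  # the last two tokens are dropped
```

Unlike top_p/top_k, the cutoff scales with the model's confidence, which is part of why it stays well-behaved at higher temperatures.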

stevenlu137 commented 2 months ago

Looks great, I'll give it a try! Can the problem be solved by modifying the temperature settings in config.json?

FellowTraveler commented 2 months ago

Yes, I can set the temperature settings in config.json, but I would also like the ability to set the other sampler settings. Mainly I want to set top_p to 1 (to turn it off) and I want to set top_k to 0 (to turn it off as well).

Then I want to be able to fiddle with temperature and min_p during my testing. Probably I would use a temperature of around 0.3 to 0.7 and a min_p of 0.05 to 0.2.

I'll probably want to set the other sampler settings too. For example, repeat_penalty can be bad for coding since code is often repetitive, but depending on the model I still need some amount of penalty, because otherwise some models just start spitting out the same thing over and over in an infinite loop.

===> But if I CAN'T set the sampler settings I need, then I prefer that AIlice not fiddle with temperature or top_p either, as it currently does, since I already have my sampler set up how I like it in my Ollama modelfiles. In short: either give me the ability to configure the sampler in AIlice, or have AIlice leave the settings alone entirely. As you can see above, min_p works much better than top_p, and it works well at higher temperatures.
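
For illustration, a sampler setup along these lines in an Ollama Modelfile might look like the sketch below (the model name and exact values are just examples in the ranges mentioned above, and min_p requires an Ollama version that supports it):

```
FROM llama3:70b

# Hypothetical sampler configuration, not a recommendation
PARAMETER temperature 0.5
PARAMETER min_p 0.1
PARAMETER top_p 1.0
PARAMETER top_k 0
PARAMETER repeat_penalty 1.05
```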

FellowTraveler commented 2 months ago

BTW see: min_p added to Ollama

stevenlu137 commented 2 months ago

I understand your point now.

From the perspective of the Single Responsibility Principle, it's more reasonable to have AIlice avoid setting sampling parameters, which allows us to integrate APIs from more inference services. However, since we've retained AModelCausalLM.py based on hf transformers for the convenience of developers, keeping a temperature setting is unavoidable.

We can disable AIlice's temperature setting by setting the default value to null in config.json. I've just submitted a change to ensure that when the temperature is None, AIlice won't pass any sampling parameters to the inference service (previously, AIlice would pass a None value for temperature, which could potentially cause issues).
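
A minimal sketch of that idea (not the actual AIlice code; `client` stands in for whatever OpenAI-compatible client is in use):

```python
def build_completion_kwargs(model: str, messages: list, temperature: float | None = None) -> dict:
    """Only forward sampling parameters that were explicitly configured.

    When temperature is None it is omitted entirely, rather than sent as null,
    so the inference service (or the model's own configuration) keeps control.
    """
    kwargs = {"model": model, "messages": messages}
    if temperature is not None:
        kwargs["temperature"] = temperature
    return kwargs

# response = client.chat.completions.create(**build_completion_kwargs(model, messages, temperature))
```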

stevenlu137 commented 2 months ago

Well, it seems that this is only valid for open source models running locally. For remote commercial models that are beyond the control of users, we still need to provide sampling parameter settings.

FellowTraveler commented 2 months ago

Yeah, I also realized that whether or not we explicitly set the sampler settings in the config, there is no guarantee that some dependency isn't quietly applying its own defaults when we don't. So we probably have to set them anyway just to guarantee what the values will actually be.

Furthermore, it's likely that we will want the flexibility to set the sampler settings for each LLM in each different context where it is used. Your sampler settings for Llama3-70b as a thinker model might differ from your sampler settings for Llama3-70b as a coding model, and if you switched to Deepseek-coder as your coder model, it might work better with different sampler settings for coding than Llama3 does. As I said, some models need a different repeat penalty to prevent infinite loops, while others need it turned off for coding. Ultimately we need the flexibility to configure the sampler for every different use of an LLM, because the sampler is critical to the results you get out of it.
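
To make that concrete, per-model, per-role sampler settings could look something like the sketch below. This is only a hypothetical shape, not AIlice's current config.json schema; the model names, role labels, and parameter names are examples:

```json
{
  "samplers": {
    "ollama/llama3:70b@thinker": { "temperature": 0.6, "min_p": 0.1, "top_p": 1.0, "top_k": 0 },
    "ollama/llama3:70b@coder": { "temperature": 0.3, "min_p": 0.05, "repeat_penalty": 1.0 },
    "ollama/deepseek-coder:33b@coder": { "temperature": 0.3, "min_p": 0.05, "repeat_penalty": 1.05 }
  }
}
```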