I turned on chatmode by default for our models on HF to give KAI users a better experience, as per Henk's suggestion. However, our current inference code here needs it to be off, since our models aren't well-supported in the stable version of KAI yet (which is what we use as an inference server ATM).
This PR fixes that by forcefully disabling chatmode when running inference.
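The override can be sketched roughly like this (a minimal illustration, not KAI's actual API — `apply_inference_overrides` and the `chatmode` settings key are hypothetical names for this example):

```python
# Hypothetical sketch: force chatmode off for inference regardless of the
# default shipped with the model config on HF, since the stable KAI build
# we run as an inference server doesn't support our models in chatmode yet.
def apply_inference_overrides(settings: dict) -> dict:
    overridden = dict(settings)        # don't mutate the loaded config
    overridden["chatmode"] = False     # forcefully disable chatmode
    return overridden

# Example: even if the HF-side default enables chatmode...
settings = {"chatmode": True, "max_length": 512}
# ...inference always runs with it off.
print(apply_inference_overrides(settings)["chatmode"])  # → False
```

The point is just that the flag is overridden at inference time rather than changed on the HF side, so KAI users still get chatmode on by default.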