InconsolableCellist opened 1 year ago
I've faced the same situation, but I was able to get it working without using the model_backend and model path parameters. Instead, I opened the model manually through the web interface menu and, of course, selected the ExLlama backend there.
Nvm.. I figured out a way. The issue with `self.model_config.device_map.layers` (model_config being None) is that it is never initialized in the first place. The only place it gets initialized is the `is_valid()` function in the KoboldAI\modeling\inference_models\exllama\class.py file, and `is_valid()` is only called when a user opens the model through the web interface menu.
To fix this, I made a small change to the `get_requested_parameters()` function in class.py. I added the following at the very beginning:

```python
if not self.model_config:
    self.model_config = ExLlamaConfig(os.path.join(model_path, "config.json"))
```
However, it turns out there was one more thing that needed to be changed within the same function. I removed the square brackets from

```python
"default": [layer_count if i == 0 else 0]
```

and changed it to

```python
"default": layer_count if i == 0 else 0
```
This is needed because the `set_input_parameters()` function, also in class.py, contains the lines

```python
for i, l in enumerate(layers):
    if l > 0:
```

which would otherwise treat each default as a list instead of an integer.
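To see why the square brackets matter, here is a minimal, self-contained sketch. The loop mirrors the shape of the one in `set_input_parameters()`, but `count_active_gpus` and the surrounding names are illustrative stand-ins, not the actual KoboldAI code:

```python
# Illustrative sketch of the list-vs-int default bug; count_active_gpus is a
# hypothetical stand-in for the comparison done in set_input_parameters().
layer_count = 80

# Old default: a one-element list per GPU slot, from [layer_count if i == 0 else 0].
old_defaults = [[layer_count if i == 0 else 0] for i in range(2)]  # [[80], [0]]
# New default: a plain integer per GPU slot, from layer_count if i == 0 else 0.
new_defaults = [layer_count if i == 0 else 0 for i in range(2)]    # [80, 0]

def count_active_gpus(layers):
    active = 0
    for i, l in enumerate(layers):  # same loop shape as set_input_parameters()
        if l > 0:                   # TypeError when l is a list, fine when it is an int
            active += 1
    return active

print(count_active_gpus(new_defaults))  # integer defaults work
try:
    count_active_gpus(old_defaults)     # list > int raises TypeError in Python 3
except TypeError as e:
    print("old defaults fail:", e)
```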
Summary
It appears that self.model_config is None in ExLlama's class.py (https://github.com/0cc4m/KoboldAI/blob/exllama/modeling/inference_models/exllama/class.py#L423), and it is assumed to exist when you reach that code by passing in --model_parameters.
Additionally, play.sh has an issue parsing JSON: it uses space as an IFS, so it word-splits JSON written with spaces between the KV pairs, which is exactly how the `help` model param tells you to format your JSON.

Steps to reproduce:

```
/play.sh --host --model airoboros-l2-70b-gpt4-2.0 --model_backend ExLlama --model_parameters "{'0_Layers':35,'1_Layers':45,'model_ctx':4096}"
```
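A minimal shell sketch of the word-splitting problem (this is a hypothetical illustration, not the actual play.sh code): when a script re-expands its arguments unquoted, the shell splits a JSON value at every space, which is why space-separated KV pairs break.

```shell
# Hypothetical illustration: unquoted expansion word-splits a JSON argument.
json="{'0_Layers': 35, '1_Layers': 45}"
set -- --model_parameters "$json"

# Unquoted $@: the JSON is split into separate words at each space.
unquoted=0
for word in $@; do unquoted=$((unquoted + 1)); done

# Quoted "$@": each original argument stays intact.
quoted=0
for word in "$@"; do quoted=$((quoted + 1)); done

echo "unquoted words: $unquoted, quoted words: $quoted"
```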
Actual Behavior:

Output:

```
INFO | main:general_startup:1395 - Running on Repo: https://github.com/0cc4m/KoboldAI.git Branch: exllama
MESSAGE | Welcome to KoboldAI!
MESSAGE | You have selected the following Model: airoboros-l2-70b-gpt4-2.0
ERROR | main:general_startup:1627 - Please pass through the parameters as a json like "{'[ID]': '[Value]', '[ID2]': '[Value]'}" using --model_parameters (required parameters shown below)
ERROR | main:general_startup:1628 - Parameters (ID: Default Value (Help Text)):
    0_Layers: [None] (The number of layers to put on NVIDIA GeForce RTX 3090.)
    1_Layers: [0] (The number of layers to put on NVIDIA GeForce RTX 3090.)
    max_ctx: 2048 (The maximum context size the model supports)
    compress_emb: 1 (If the model requires compressed embeddings, set them here)
    ntk_alpha: 1 (NTK alpha value)
```

Expected Behavior:

The model parameters can be set at startup.

Environment:

Additionally, the following change should be made in play.sh, so that you can pass in JSON as the model params with spaces between the KV pairs, as the `help` parameter instructs you:
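The concrete play.sh diff is not shown above. As a hedged sketch of the usual fix for this class of bug: quote the argument expansion as "$@" wherever the script forwards its arguments, so the shell never word-splits the JSON (whether play.sh forwards its arguments exactly like this `run_fixed` stand-in does is an assumption):

```shell
# Hedged sketch; run_fixed is a hypothetical stand-in for the line in play.sh
# that forwards arguments to Python. The key point: use "$@", never an
# unquoted $* or $@, so JSON values keep their spaces.
run_fixed() {
    # "$@" re-emits each original argument as a single word.
    printf '%s\n' "$@"
}

# The JSON below stays a single argument, so this prints exactly two lines.
run_fixed --model_parameters "{'0_Layers': 35, '1_Layers': 45}"
```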