ValueError: Invalid `cache_implementation` (dynamic). Choose one of: ['static', 'offloaded_static', 'sliding_window', 'hybrid', 'mamba', 'quantized', 'static']

AzinY commented 1 month ago

After fine-tuning the "AnatoliiPotapov/T-lite-instruct-0.1" model, I tried to generate some text with:

from unsloth import FastLanguageModel
from transformers import TextStreamer

model = FastLanguageModel.for_inference(model)
messages = [
    {"role": "from", "from": "human", "value": "Is 9.11 larger than 9.9?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

text_streamer = TextStreamer(tokenizer)
_ = model.generate(input_ids=inputs, streamer=text_streamer, max_new_tokens=128)

But it raised:

ValueError: Invalid `cache_implementation` (dynamic). Choose one of: ['static', 'offloaded_static', 'sliding_window', 'hybrid', 'mamba', 'quantized', 'static']

Setting cache_implementation to any of the values from the list above doesn't change anything!

Please help)))

xjohnxjohn commented 1 month ago

Hi, please update the Unsloth library:

pip install -U unsloth

danielhanchen commented 1 month ago

@AzinY Yep, please reinstall and update Unsloth. Sorry about the issue!

pip uninstall unsloth -y
pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"

AzinY commented 1 month ago

Thanks guys! After reinstalling Unsloth and upgrading transformers to 4.45.0, it works!
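
For anyone hitting the same error, it may help to confirm which versions are actually installed in the environment before and after the reinstall. The snippet below is only a minimal sketch using the standard-library `importlib.metadata`; the helper names `installed_version` and `version_tuple` are made up for illustration, and 4.45.0 is simply the transformers version reported as working in this thread, not a documented minimum.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_tuple(text: str):
    """Turn '4.45.0' into (4, 45, 0); stops at the first non-numeric component such as 'dev0'."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


# Assumption: 4.45.0 is the transformers version reported as working in this thread.
REPORTED_WORKING = "4.45.0"

for name in ("unsloth", "transformers"):
    found = installed_version(name)
    print(f"{name}: {found if found is not None else 'not installed'}")

transformers_version = installed_version("transformers")
if transformers_version is not None:
    if version_tuple(transformers_version) < version_tuple(REPORTED_WORKING):
        print(f"transformers is older than {REPORTED_WORKING}; consider upgrading.")
```

In a notebook, restart the runtime after reinstalling so the new versions are the ones that get imported.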

unslothai / unsloth · Issue #1091