model = AutoModel.from_pretrained("Qwen/Qwen2-0.5B",  # hf_token='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',
                                  compression='4bit', layer_shards_saving_path=r"E:\airllm\shards")
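One side issue worth noting before the traceback: in a plain (non-raw) Python string literal, `\a` inside `"E:\airllm\shards"` is the ASCII bell escape, so the path passed to `layer_shards_saving_path` is not the one intended. A minimal demonstration, using only standard Python string semantics:

```python
# In a plain string literal, "\a" is the BEL control character (0x07),
# so "E:\airllm\shards" silently corrupts the drive path.
plain = "E:\airllm\shards"
raw = r"E:\airllm\shards"   # raw string keeps every backslash literal

assert "\a" in plain        # the BEL character crept into the plain literal
assert "\a" not in raw      # the raw string does not contain BEL
assert "\\a" in raw         # backslash + 'a' survive as two real characters
```

Using a raw string (or forward slashes) avoids this class of Windows-path bug, independent of the assertion error below.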
=====Error Msg==========
The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
bitsandbytes installed
cache_utils installed
unknown artichitecture: Qwen2ForCausalLM, try to use Llama2...
Fetching 9 files: 100%|████████████████████████████████████████████████████████████████| 9/9 [00:00<00:00, 1971.73it/s]
Traceback (most recent call last):
File "E:\airllm\run.py", line 6, in <module>
model = AutoModel.from_pretrained("Qwen/Qwen2-0.5B",#hf_token='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',
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\NINGMEI\AppData\Roaming\Python\Python311\site-packages\airllm\auto_model.py", line 54, in from_pretrained
return class_(pretrained_model_name_or_path, *inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\NINGMEI\AppData\Roaming\Python\Python311\site-packages\airllm\airllm.py", line 9, in __init__
super(AirLLMLlama2, self).__init__(*args, **kwargs)
File "C:\Users\NINGMEI\AppData\Roaming\Python\Python311\site-packages\airllm\airllm_base.py", line 108, in __init__
self.model_local_path, self.checkpoint_path = find_or_create_local_splitted_path(model_local_path_or_repo_id,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\NINGMEI\AppData\Roaming\Python\Python311\site-packages\airllm\utils.py", line 382, in find_or_create_local_splitted_path
return Path(hf_cache_path), split_and_save_layers(hf_cache_path, layer_shards_saving_path,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\NINGMEI\AppData\Roaming\Python\Python311\site-packages\airllm\utils.py", line 213, in split_and_save_layers
assert os.path.exists(checkpoint_path / 'model.safetensors.index.json'), f'model.safetensors.index.json should exist.'
AssertionError: model.safetensors.index.json should exist.
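The assertion fails because airllm's `split_and_save_layers` expects a sharded checkpoint with a `model.safetensors.index.json`, while this repo appears to ship a single `model.safetensors` with no index. As a possible workaround (this helper is not part of airllm, and `build_safetensors_index` is a hypothetical name), one can generate a minimal index from the safetensors header, which is a length-prefixed JSON block readable with the standard library alone:

```python
# Hypothetical workaround sketch: build model.safetensors.index.json for a
# single-file checkpoint. A .safetensors file starts with 8 bytes
# (little-endian u64 = JSON header length), followed by a JSON header that
# maps tensor names to dtype/shape/offsets, plus an optional "__metadata__".
import json
import struct
from pathlib import Path


def build_safetensors_index(checkpoint_dir):
    """Write model.safetensors.index.json next to a lone model.safetensors."""
    checkpoint_dir = Path(checkpoint_dir)
    st_file = checkpoint_dir / "model.safetensors"
    with open(st_file, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # header size prefix
        header = json.loads(f.read(header_len))         # tensor-name -> info
    # Every tensor lives in the single file; "__metadata__" is not a tensor.
    weight_map = {name: "model.safetensors"
                  for name in header if name != "__metadata__"}
    index = {
        "metadata": {"total_size": st_file.stat().st_size},
        "weight_map": weight_map,
    }
    index_path = checkpoint_dir / "model.safetensors.index.json"
    index_path.write_text(json.dumps(index, indent=2))
    return index_path
```

Running this against the Hugging Face cache snapshot directory for Qwen/Qwen2-0.5B before calling `AutoModel.from_pretrained` may let `split_and_save_layers` get past the assertion, though whether airllm's Llama2 fallback then handles Qwen2 layers correctly is a separate question.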
Full script for reference:

from airllm import AutoModel

MAX_LENGTH = 128

# could use hugging face model repo id:
model = AutoModel.from_pretrained("Qwen/Qwen2-0.5B",  # hf_token='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',
                                  compression='4bit', layer_shards_saving_path=r"E:\airllm\shards")