Closed remos-06 closed 1 year ago
You are either not using play.sh or something is hijacking your dependencies. Oh wait, I see this is Colab; on Colab we don't support Pygmalion since it's banned there, so I can't test or replicate this without getting my account banned. I will double-check with a different model.
Can confirm this on Tiefighter. I think one of Google's changes broke it, but I'll have to find out why.
Two days ago I was using Pygmalion without any issues.
I don't know what you did, but it fixed my issue.
This is indeed fixed: flash-attn silently replaced their package, and I have replaced it now. I still do not recommend PygmalionAI on Colab because of the ban risk; you can use Tiefighter instead.
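For anyone maintaining a similar setup, here is a minimal sketch of the kind of guard that keeps a broken flash-attn wheel from taking model loading down with it. This is illustrative only, not KoboldAI's actual loader code; the import names come from the traceback below, and `HAS_FLASH_ATTN` is a made-up flag.

```python
# Sketch: treat flash-attn as optional so a wheel built against a different
# CUDA runtime (e.g. one that needs a missing libcudart.so.12) cannot
# break model loading.
try:
    from flash_attn import flash_attn_func, flash_attn_varlen_func  # noqa: F401
    HAS_FLASH_ATTN = True
except ImportError as err:
    # A CUDA runtime mismatch surfaces here as an ImportError while the
    # compiled flash_attn_2_cuda extension is being loaded.
    HAS_FLASH_ATTN = False
    print(f"flash-attn unavailable, using the standard attention path: {err}")
```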
WARNING | modeling.inference_models.generic_hf_torch.class:_load:180 - Gave up on lazy loading due to Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
libcudart.so.12: cannot open shared object file: No such file or directory
ERROR | modeling.inference_models.hf_torch:_get_model:430 - Lazyloader failed, falling back to stock HF load. You may run out of RAM here. Details:
ERROR | modeling.inference_models.hf_torch:_get_model:431 - Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
libcudart.so.12: cannot open shared object file: No such file or directory
ERROR | modeling.inference_models.hf_torch:_get_model:432 - Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/import_utils.py", line 1282, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
File "", line 1027, in _find_and_load
File "", line 1006, in _find_and_load_unlocked
File "", line 688, in _load_unlocked
File "", line 883, in exec_module
File "", line 241, in _call_with_frames_removed
File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py", line 45, in
from flash_attn import flash_attn_func, flash_attn_varlen_func
File "/usr/local/lib/python3.10/dist-packages/flash_attn/init.py", line 3, in
from flash_attn.flash_attn_interface import (
File "/usr/local/lib/python3.10/dist-packages/flash_attn/flash_attn_interface.py", line 8, in
import flash_attn_2_cuda as flash_attn_cuda
ImportError: libcudart.so.12: cannot open shared object file: No such file or directory
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/content/KoboldAI-Client/modeling/inference_models/hf_torch.py", line 420, in _get_model
    model = AutoModelForCausalLM.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/hf_bleeding_edge/__init__.py", line 50, in from_pretrained
    return AM.from_pretrained(path, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    model_class = _get_model_class(config, cls._model_mapping)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 387, in _get_model_class
    supported_models = model_mapping[type(config)]
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 739, in __getitem__
    return self._load_attr_from_module(model_type, model_name)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 753, in _load_attr_from_module
    return getattribute_from_module(self._modules[module_name], attr)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 697, in getattribute_from_module
    if hasattr(module, attr):
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/import_utils.py", line 1272, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/import_utils.py", line 1284, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
libcudart.so.12: cannot open shared object file: No such file or directory
INFO | modeling.inference_models.hf_torch:_get_model:433 - Falling back to stock HF load...
WARNING | modeling.inference_models.hf_torch:_get_model:466 - Fell back to GPT2LMHeadModel due to Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
libcudart.so.12: cannot open shared object file: No such file or directory
Downloading (…)fetensors.index.json: 100% 29.9k/29.9k [00:00<00:00, 12.2MB/s]
[aria2] Downloading model:  82%|########2 | 21.4G/26.0G [02:02<00:44, 105MB/s]
Connection Attempt: 127.0.0.1
INFO | __main__:do_connect:2584 - Client connected! UI_2
[aria2] Downloading model: 100%|##########| 26.0G/26.0G [02:25<00:00, 179MB/s]
You are using a model of type llama to instantiate a model of type gpt2. This is not supported for all configurations of models and can yield errors.
Downloading shards:   0% 0/3 [00:00<?, ?it/s]
Downloading shards:  33% 1/3 [00:00<00:00, 5.09it/s]
Downloading shards:  67% 2/3 [00:00<00:00, 5.08it/s]
Downloading shards: 100% 3/3 [00:00<00:00, 4.92it/s]
I'm getting this error
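In case it helps narrow this down, here is a small diagnostic sketch (it assumes a Linux/Colab machine and is not part of KoboldAI) that checks which CUDA runtime library the environment can actually load:

```python
import ctypes

# Check which CUDA runtime the machine can dlopen; the flash-attn wheel in the
# traceback above wants libcudart.so.12, while the environment may only ship
# a CUDA 11.x runtime.
for name in ("libcudart.so.12", "libcudart.so.11.0"):
    try:
        ctypes.CDLL(name)
        print(f"{name}: OK")
    except OSError as err:
        print(f"{name}: not loadable ({err})")
```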