henk717 / KoboldAI

KoboldAI is generative AI software optimized for fictional use, but capable of much more!
http://koboldai.com
GNU Affero General Public License v3.0

Runpod problem with starting Pygmalion-7B #446

Open · skaba04 opened 1 year ago

skaba04 commented 1 year ago

I have been running the Pygmalion 13B model on Runpod fine, but whenever I try to load Pygmalion 7B or 6B, it fails. The model is located here: https://huggingface.co/TehVenom/Pygmalion-7b-4bit-GPTQ-Safetensors

This error message shows up:

```
File "/opt/koboldai/modeling/inference_models/gptq_hf_torch/class.py", line 238, in _load
    self.model = self._get_model(self.get_local_model_path())

File "/opt/koboldai/modeling/inference_models/gptq_hf_torch/class.py", line 392, in _get_model
    model = AutoGPTQForCausalLM.from_quantized(location, model_basename=Path(gptq_file).stem, use_safetensors=gptq_file.endswith(".safetensors"), device_map=device_map, inject_fused_attention=False)
        location   = 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors'
        gptq_file  = 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors/Pygmalion-7B-GPTQ-4bit.act-order.safetensors'
        device_map = {'model.layers.0': 0, 'model.layers.1': 0, 'model.layers.2': 0, 'model.layers.3': 0, 'model.layers.4': 0, 'model.layers.5': 0...

File "/opt/koboldai/runtime/envs/koboldai/lib/python3.8/site-packages/auto_gptq/modeling/auto.py", line 108, in from_quantized
    return quant_func(
        quant_func = <bound method BaseGPTQForCausalLM.from_quantized of <class 'auto_gptq.modeling.llama.LlamaGPTQForCausalLM'>>

File "/opt/koboldai/runtime/envs/koboldai/lib/python3.8/site-packages/auto_gptq/modeling/_base.py", line 757, in from_quantized
    quantize_config = BaseQuantizeConfig.from_pretrained(model_name_or_path, **cached_file_kwargs, **kwargs)
        model_name_or_path = 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors'
        cached_file_kwargs = {'cache_dir': None, 'force_download': False, 'proxies': None, 'resume_download': False, 'local_files_only': False, 'use_auth...
        kwargs             = {}

File "/opt/koboldai/runtime/envs/koboldai/lib/python3.8/site-packages/auto_gptq/modeling/_base.py", line 93, in from_pretrained
    with open(resolved_config_file, "r", encoding="utf-8") as f:
        resolved_config_file = 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors/quantize_config.json'

FileNotFoundError: [Errno 2] No such file or directory: 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors/quantize_config.json'
```
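The traceback shows AutoGPTQ failing because there is no quantize_config.json next to the downloaded weights; old GPTQ-for-LLaMa conversions never shipped that file. One workaround sometimes tried is to write the file by hand. A minimal sketch, where every value is an assumption inferred from the "4bit.act-order" filename and must match how the model was actually quantized:

```python
import json
from pathlib import Path

# Hypothetical workaround: create the quantize_config.json that AutoGPTQ
# expects next to the downloaded weights. ALL values below are assumptions
# inferred from the filename "Pygmalion-7B-GPTQ-4bit.act-order.safetensors";
# they must match the settings actually used during quantization.
model_dir = Path("models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors")

quantize_config = {
    "bits": 4,                # "4bit" in the filename
    "group_size": -1,         # assumed: no group size appears in the filename
    "desc_act": True,         # assumed from "act-order" in the filename
    "sym": True,              # assumed AutoGPTQ default
    "true_sequential": True,  # assumed AutoGPTQ default
}

(model_dir / "quantize_config.json").write_text(
    json.dumps(quantize_config, indent=2), encoding="utf-8"
)
print("wrote", model_dir / "quantize_config.json")
```

Even with the file in place, a conversion this old may still fail on newer stacks, which is the compatibility problem described in the reply below.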

henk717 commented 1 year ago

This is a known issue with this model; it is a very old conversion. If occam's GPTQ module gets updated to work on newer Huggingface versions, we can continue to support it. Otherwise I suggest finding a newer conversion of the model that is compatible with AutoGPTQ, or loading the 16-bit version so KoboldAI quantizes it for you.
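To verify that a candidate conversion is AutoGPTQ-compatible before pointing KoboldAI at it, you can try loading it directly with auto-gptq. A minimal sketch, assuming auto-gptq is installed and a CUDA GPU is available; the repo ID is a placeholder, not a real upload:

```python
# Compatibility check run outside KoboldAI. The repo ID below is a
# placeholder; substitute whichever newer GPTQ conversion you want to test.
from auto_gptq import AutoGPTQForCausalLM

model = AutoGPTQForCausalLM.from_quantized(
    "some-user/newer-pygmalion-7b-gptq",  # placeholder, not a real repo
    use_safetensors=True,
    device="cuda:0",
)
print("loaded OK:", type(model).__name__)
```

If this load succeeds, the conversion ships the metadata that KoboldAI's AutoGPTQ fallback needs; if it raises the same FileNotFoundError for quantize_config.json, the conversion has the same problem as the one above.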

skaba04 commented 1 year ago

OK, so if nothing works and I can't seem to find an AutoGPTQ version, the only thing left for me is to wait.

skaba04 commented 1 year ago

But it says that it is quantized with GPTQ-for-LLaMa. Is that any different?

henk717 commented 1 year ago

Yes, because it lacks the config file that our AutoGPTQ fallback needs. I can't legally tell you where to get reuploads, but uploads compatible with the current KoboldAI United versions do exist, so you don't have to wait.
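Since the distinguishing feature is that missing config file, a reupload can be vetted without downloading any weights by checking whether the repo contains quantize_config.json. A small sketch using huggingface_hub; the repo ID is a placeholder:

```python
# Vet a candidate reupload without downloading the weights: the AutoGPTQ
# fallback needs quantize_config.json to be present in the repo.
from huggingface_hub import list_repo_files

repo_id = "some-user/newer-pygmalion-7b-gptq"  # placeholder, not a real repo

files = list_repo_files(repo_id)
if "quantize_config.json" in files:
    print(repo_id, "looks AutoGPTQ-compatible")
else:
    print(repo_id, "is likely an old GPTQ-for-LLaMa conversion")
```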

skaba04 commented 1 year ago

OK, so where are those uploads that are compatible with the current KoboldAI United versions?