skaba04 opened 1 year ago
Known issue with this model since it's super old. If occam's GPTQ module gets updated to work on newer Huggingface versions we can continue to support it; otherwise I suggest you find a newer conversion of the model that is compatible with AutoGPTQ, or load the 16-bit version instead so KoboldAI quantizes it for you.
Ok, so if nothing works and I can't seem to find the AutoGPTQ version, the only thing left for me is to wait.
But it says that it is quantized with GPTQ-for-LLaMa. Is that any different?
Yes, because it lacks the quantize_config.json file that our AutoGPTQ fallback needs. I can't legally tell you where to get reuploads, but uploads compatible with the current KoboldAI United versions do exist, so you don't have to wait.
Ok, so where are those uploads that are compatible with the current KoboldAI United versions?
I have been running the Pygmalion 13B model fine on RunPod, but whenever I try to load Pygmalion 7B or 6B, the model being located here: https://huggingface.co/TehVenom/Pygmalion-7b-4bit-GPTQ-Safetensors
this error message shows up:

    File "/opt/koboldai/modeling/inference_models/gptq_hf_torch/class.py", line 238, in _load
        self.model = self._get_model(self.get_local_model_path())
    File "/opt/koboldai/modeling/inference_models/gptq_hf_torch/class.py", line 392, in _get_model
        model = AutoGPTQForCausalLM.from_quantized(location, model_basename=Path(gptq_file).stem, use_safetensors=gptq_file.endswith(".safetensors"), device_map=device_map, inject_fused_attention=False)
        # gptq_file = 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors/Pygmalion-7B-GPTQ-4bit.act-order.safetensors'
        # location  = 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors'
    File "/opt/koboldai/runtime/envs/koboldai/lib/python3.8/site-packages/auto_gptq/modeling/auto.py", line 108, in from_quantized
        return quant_func(
    File "/opt/koboldai/runtime/envs/koboldai/lib/python3.8/site-packages/auto_gptq/modeling/_base.py", line 757, in from_quantized
        quantize_config = BaseQuantizeConfig.from_pretrained(model_name_or_path, **cached_file_kwargs, **kwargs)
    File "/opt/koboldai/runtime/envs/koboldai/lib/python3.8/site-packages/auto_gptq/modeling/_base.py", line 93, in from_pretrained
        with open(resolved_config_file, "r", encoding="utf-8") as f:
        # resolved_config_file = 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors/quantize_config.json'

    FileNotFoundError: [Errno 2] No such file or directory: 'models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors/quantize_config.json'
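The FileNotFoundError shows AutoGPTQ's BaseQuantizeConfig.from_pretrained() looking for a quantize_config.json next to the safetensors file and not finding one. A minimal sketch of writing that file by hand, with every value an assumption inferred from the filename "Pygmalion-7B-GPTQ-4bit.act-order" (in particular group_size is a guess and should be checked against the model card before use):

```python
import json
import os

# Hypothetical workaround: create the quantize_config.json that AutoGPTQ's
# fallback loader expects. All values below are ASSUMPTIONS inferred from the
# filename "Pygmalion-7B-GPTQ-4bit.act-order" -- verify against the model card.
model_dir = "models/TehVenom_Pygmalion-7b-4bit-GPTQ-Safetensors"
quantize_config = {
    "bits": 4,          # "4bit" in the filename
    "group_size": -1,   # assumption: act-order conversions of this era often used no grouping
    "desc_act": True,   # "act-order" suggests activation-order (desc_act) quantization
}

os.makedirs(model_dir, exist_ok=True)
config_path = os.path.join(model_dir, "quantize_config.json")
with open(config_path, "w", encoding="utf-8") as f:
    json.dump(quantize_config, f, indent=2)
```

If the guessed values are wrong the model will load but produce garbage output, so this is only worth trying when no compatible reupload can be found.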