Unable to load quantized model from huggingface #207

plvckn closed this issue 1 year ago

plvckn commented 1 year ago

Steps to reproduce: run code sample from

from import IncQuantizedModelForSeq2SeqLM
int8_model = IncQuantizedModelForSeq2SeqLM.from_pretrained(
    'Intel/distilbart-cnn-12-6-int8-dynamic',
)

Error message:

AttributeError                            Traceback (most recent call last)
<ipython-input-1-4760d454e35e> in <module>
      1 from import IncQuantizedModelForSeq2SeqLM
----> 2 int8_model = IncQuantizedModelForSeq2SeqLM.from_pretrained(
      3     'Intel/distilbart-cnn-12-6-int8-dynamic',
      4 )

~/anaconda3/envs/nlp/lib/python3.9/site-packages/optimum/intel/neural_compressor/ in from_pretrained(cls, *args, **kwargs)
    599             f"`{cls.__name__.replace('IncQuantized', 'INC')}` instead."
    600         )
--> 601         return super().from_pretrained(*args, **kwargs)

~/anaconda3/envs/nlp/lib/python3.9/site-packages/optimum/intel/neural_compressor/ in from_pretrained(cls, model_name_or_path, q_model_name, **kwargs)
    540                     raise EnvironmentError(msg)
--> 542         if config.backend == "ipex":
    543             # NOTE: Will improve to use load function when Intel Neural Compressor next 2.1 release.
    544             # return load(state_dict_path)

~/anaconda3/envs/nlp/lib/python3.9/site-packages/transformers/ in __getattribute__(self, key)
    258         if key != "attribute_map" and key in super().__getattribute__("attribute_map"):
    259             key = super().__getattribute__("attribute_map")[key]
--> 260         return super().__getattribute__(key)
    262     def __init__(self, **kwargs):

AttributeError: 'BartConfig' object has no attribute 'backend'
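
The traceback shows `config.backend` being read directly, so a config saved without that field raises `AttributeError`. Below is a minimal sketch of a defensive read. `DummyConfig` and `read_backend` are hypothetical names used for illustration, not the real `BartConfig` or anything from optimum-intel, and this is not necessarily how the library resolved it.

```python
class DummyConfig:
    """Hypothetical stand-in for a model config that has no `backend` field."""
    model_type = "bart"


def read_backend(config, default=None):
    # getattr with a default returns `default` instead of raising
    # AttributeError when the attribute is absent.
    return getattr(config, "backend", default)


config = DummyConfig()
backend = read_backend(config)
if backend == "ipex":
    print("would take the ipex loading path")
else:
    print("no ipex backend recorded in the config")
```

With this pattern a missing field falls through to the default branch rather than aborting the load.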

My environment:

My CPU:

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   39 bits physical, 48 bits virtual
CPU(s):                          8
On-line CPU(s) list:             0-7
Thread(s) per core:              2
Core(s) per socket:              4
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           158
Model name:                      Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
Stepping:                        9
CPU MHz:                         3600.000
CPU max MHz:                     4200.0000
CPU min MHz:                     800.0000
BogoMIPS:                        7200.00
Virtualization:                  VT-x
L1d cache:                       128 KiB
L1i cache:                       128 KiB
L2 cache:                        1 MiB
L3 cache:                        8 MiB
NUMA node0 CPU(s):               0-7
Vulnerability Itlb multihit:     KVM: Mitigation: VMX disabled
Vulnerability L1tf:              Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Vulnerability Mds:               Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Mmio stale data:   Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Retbleed:          Mitigation; IBRS
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:             Mitigation; Microcode
Vulnerability Tsx async abort:   Mitigation; TSX disabled
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d arch_capabilities

plvckn commented 1 year ago

UPDATE: I fixed the previous error by switching to a different version of the package, optimum-intel-1.7.0.dev0. However, running the same code sample now raises a different error:

KeyError                                  Traceback (most recent call last)
<ipython-input-1-47896b3f0ea0> in <module>
      1 from import INCModelForSeq2SeqLM
----> 2 int8_model = INCModelForSeq2SeqLM.from_pretrained(
      3     'Intel/distilbart-cnn-12-6-int8-dynamic',
      4 )

~/anaconda3/envs/nlp/lib/python3.9/site-packages/optimum/intel/neural_compressor/ in from_pretrained(cls, model_name_or_path, q_model_name, **kwargs)
    552         if "best_configure" in state_dict and state_dict["best_configure"] is not None:
--> 553             model = load(state_dict_path, model)
    555         return model.eval()

~/anaconda3/envs/nlp/lib/python3.9/site-packages/neural_compressor/utils/ in load(checkpoint_dir, model, history_cfg, **kwargs)
    370         _set_activation_scale_zeropoint(q_model, history_cfg)
    371     else:
--> 372         q_model.load_state_dict(stat_dict)
    373     util.get_embedding_contiguous(q_model)
    374     return q_model

~/anaconda3/envs/nlp/lib/python3.9/site-packages/torch/nn/modules/ in load_state_dict(self, state_dict, strict)
   1655                 )
-> 1657         load(self, state_dict)
   1658         del load

~/anaconda3/envs/nlp/lib/python3.9/site-packages/torch/nn/modules/ in load(module, local_state_dict, prefix)
   1643                     child_prefix = prefix + name + '.'
   1644                     child_state_dict = {k: v for k, v in local_state_dict.items() if k.startswith(child_prefix)}
-> 1645                     load(child, child_state_dict, child_prefix)
   1647             # Note that the hook can modify missing_keys and unexpected_keys.

~/anaconda3/envs/nlp/lib/python3.9/site-packages/torch/nn/modules/ in load(module, local_state_dict, prefix)
   1643                     child_prefix = prefix + name + '.'
   1644                     child_state_dict = {k: v for k, v in local_state_dict.items() if k.startswith(child_prefix)}
-> 1645                     load(child, child_state_dict, child_prefix)
   1647             # Note that the hook can modify missing_keys and unexpected_keys.

~/anaconda3/envs/nlp/lib/python3.9/site-packages/torch/nn/modules/ in load(module, local_state_dict, prefix)
   1643                     child_prefix = prefix + name + '.'
   1644                     child_state_dict = {k: v for k, v in local_state_dict.items() if k.startswith(child_prefix)}
-> 1645                     load(child, child_state_dict, child_prefix)
   1647             # Note that the hook can modify missing_keys and unexpected_keys.

~/anaconda3/envs/nlp/lib/python3.9/site-packages/torch/nn/modules/ in load(module, local_state_dict, prefix)
   1637         def load(module, local_state_dict, prefix=''):
   1638             local_metadata = {} if metadata is None else metadata.get(prefix[:-1], {})
-> 1639             module._load_from_state_dict(
   1640                 local_state_dict, prefix, local_metadata, True, missing_keys, unexpected_keys, error_msgs)
   1641             for name, child in module._modules.items():

~/anaconda3/envs/nlp/lib/python3.9/site-packages/torch/ao/nn/quantized/modules/ in _load_from_state_dict(self, state_dict, prefix, local_metadata, strict, missing_keys, unexpected_keys, error_msgs)
     55     def _load_from_state_dict(self, state_dict, prefix, local_metadata, strict,
     56                               missing_keys, unexpected_keys, error_msgs):
---> 57         self.dtype = state_dict[prefix + 'dtype']
     58         state_dict.pop(prefix + 'dtype')

KeyError: 'model.shared._packed_params.dtype'
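
The `KeyError` means the quantized module looked up `prefix + 'dtype'` in the state dict and the key was not there. A rough way to check what a checkpoint holds under a given prefix is sketched below. The helper name `keys_under` and the example keys are hypothetical; with a real checkpoint the dictionary would come from loading the saved state dict.

```python
def keys_under(state_dict, prefix):
    """Return the sorted keys of `state_dict` that start with `prefix`."""
    return sorted(k for k in state_dict if k.startswith(prefix))


# Hypothetical example keys; a real state dict maps names to tensors.
example_state_dict = {
    "model.shared.weight": 0,
    "model.encoder.layers.0.fc1._packed_params.dtype": 1,
}

wanted = "model.shared._packed_params.dtype"
print("present:", wanted in example_state_dict)
print("keys under model.shared.:", keys_under(example_state_dict, "model.shared."))
```

If the wanted key is absent while the model instance expects it, the checkpoint and the model definition disagree about that module.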

xin3he commented 1 year ago

Hi @plvckn, I checked it, and the root cause is that transformers disabled embedding sharing. Please try transformers <= v4.23.0. Thanks, I will add a note to the model card later.
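
A small pre-flight check for the version bound mentioned above could look like the sketch below. It uses only the standard library. The helper names (`parse_release`, `is_supported`, `installed_transformers_version`) are hypothetical, and the parser deliberately ignores pre-release suffixes such as `.dev0`, so treat it as an approximation.

```python
from importlib.metadata import PackageNotFoundError, version

MAX_SUPPORTED = (4, 23, 0)  # upper bound suggested in this thread


def parse_release(version_string):
    """Turn '4.23.0' or '4.24.0.dev0' into a (major, minor, patch) tuple."""
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_supported(version_string, maximum=MAX_SUPPORTED):
    # Tuple comparison is lexicographic: major first, then minor, then patch.
    return parse_release(version_string) <= maximum


def installed_transformers_version():
    """Return the installed transformers version string, or None if absent."""
    try:
        return version("transformers")
    except PackageNotFoundError:
        return None


found = installed_transformers_version()
if found is None:
    print("transformers is not installed")
elif is_supported(found):
    print(f"transformers {found} is within the suggested bound")
else:
    print(f"transformers {found} is newer than 4.23.0; consider downgrading")
```

If the check reports a newer version, installing with a pinned requirement such as `pip install "transformers<=4.23.0"` is one way to apply the suggestion.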

plvckn commented 1 year ago

Downgrading to transformers v4.23 has solved this issue, thanks @xin3he