AniZpZ / AutoSmoothQuant

An easy-to-use package for implementing SmoothQuant for LLMs

Maybe some problems you need to take care of #17

Closed · DonliFly closed this 7 months ago

DonliFly commented 7 months ago

1) When running smoothquant_model.py, --quantize-model and --generate-scale are always True. They should use action="store_true", like this:


    parser.add_argument('--quantize-model',
                        action="store_true",
                        help='whether to quant model or not')

    parser.add_argument('--generate-scale',
                        action="store_true",
                        help='whether to generate scale or not')
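
With action="store_true", each flag defaults to False and becomes True only when passed on the command line. A quick self-contained check of that behaviour (a sketch, not code from the repo):

    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('--quantize-model', action="store_true",
                        help='whether to quant model or not')
    parser.add_argument('--generate-scale', action="store_true",
                        help='whether to generate scale or not')

    print(parser.parse_args([]))
    # Namespace(quantize_model=False, generate_scale=False)
    print(parser.parse_args(['--quantize-model']))
    # Namespace(quantize_model=True, generate_scale=False)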

2) TypeError: 'type' object is not subscriptable. Every file under the models folder that uses dict in type hints should be fixed, like this:

a) add from typing import Dict
b) replace dict[str, str] with Dict[str, str]
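
For example, on Python 3.8 and earlier the built-in hint fails at import time (a minimal sketch; load_scales is a hypothetical name, not from the repo):

    from typing import Dict

    # dict[str, str] raises TypeError on Python < 3.9, because the built-in
    # dict is not subscriptable there; typing.Dict works on all versions.
    def load_scales(scale_path: str) -> Dict[str, str]:
        ...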

Reference link: https://stackoverflow.com/questions/59101121/type-hint-for-a-dict-gives-typeerror-type-object-is-not-subscriptable

3) TypeError: cannot assign 'torch.cuda.HalfTensor' as parameter 'weight' (torch.nn.Parameter or None expected). Running llama.py under the models folder raises this TypeError; fix it like this:


    import torch.nn as nn
    from transformers.models.llama.modeling_llama import LlamaRMSNorm


    class Int8LlamaRMSNorm(LlamaRMSNorm):

        @staticmethod
        def from_float(module: LlamaRMSNorm,
                       output_scale: float):
            int8_module = Int8LlamaRMSNorm(module.weight.numel(), module.variance_epsilon)

            # int8_module.weight = module.weight / output_scale  # raises the TypeError:
            # dividing a Parameter by a float returns a plain Tensor, and nn.Module
            # only accepts nn.Parameter (or None) when assigning a registered parameter.
            # The new Parameter keeps module.weight's dtype, so no .to(...) call is needed.
            int8_module.weight = nn.Parameter(module.weight / output_scale)

            return int8_module

Reference link: https://discuss.pytorch.org/t/typeerror-cannot-assign-torch-floattensor-as-parameter-layer-weights-torch-nn-parameter-or-none-expected/94947
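
The same rule is easy to verify on any module (a minimal sketch using a plain nn.Linear, not repo code):

    import torch.nn as nn

    lin = nn.Linear(4, 4)
    try:
        lin.weight = lin.weight / 2.0  # plain Tensor -> TypeError
    except TypeError as e:
        print(e)  # cannot assign 'torch.FloatTensor' as parameter 'weight' ...
    lin.weight = nn.Parameter(lin.weight / 2.0)  # wrapping in nn.Parameter works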

Hope this helps everybody!

AniZpZ commented 7 months ago

Hi there! We really appreciate your advice. We will give it serious consideration and plan to address the issues later this week.