AniZpZ / AutoSmoothQuant

An easy-to-use package for implementing SmoothQuant for LLMs

Maybe some problems you need to take care of #17

Closed · DonliFly closed this 7 months ago

DonliFly commented 7 months ago

1) When running smoothquant_model.py, --quantize-model and --generate-scale are always True. They should use action="store_true", like this:


    parser.add_argument('--quantize-model',
                        action="store_true",
                        help='whether to quant model or not')

    parser.add_argument('--generate-scale',
                        action="store_true",
                        help='whether to generate scale or not')
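
With action="store_true", each flag defaults to False and becomes True only when passed on the command line. A quick self-contained check of that behaviour (a sketch, not code from the repo):

    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('--quantize-model', action="store_true",
                        help='whether to quant model or not')
    parser.add_argument('--generate-scale', action="store_true",
                        help='whether to generate scale or not')

    print(parser.parse_args([]))
    # Namespace(quantize_model=False, generate_scale=False)
    print(parser.parse_args(['--quantize-model']))
    # Namespace(quantize_model=True, generate_scale=False)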

2) TypeError: 'type' object is not subscriptable. Every file under the models folder that uses dict in type hints should be fixed, like this:

a) add from typing import Dict
b) replace dict[str, str] with Dict[str, str]
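
For example, on Python 3.8 and earlier the built-in hint fails at import time (a minimal sketch; load_scales is a hypothetical name, not from the repo):

    from typing import Dict

    # dict[str, str] raises TypeError on Python < 3.9, because the built-in
    # dict is not subscriptable there; typing.Dict works on all versions.
    def load_scales(scale_path: str) -> Dict[str, str]:
        ...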

Reference link: https://stackoverflow.com/questions/59101121/type-hint-for-a-dict-gives-typeerror-type-object-is-not-subscriptable

3) TypeError: cannot assign 'torch.cuda.HalfTensor' as parameter 'weight' (torch.nn.Parameter or None expected). Running llama.py under the models folder raises this TypeError; fix it like this:


    import torch.nn as nn
    from transformers.models.llama.modeling_llama import LlamaRMSNorm


    class Int8LlamaRMSNorm(LlamaRMSNorm):

        @staticmethod
        def from_float(module: LlamaRMSNorm,
                       output_scale: float):
            int8_module = Int8LlamaRMSNorm(module.weight.numel(), module.variance_epsilon)

            # int8_module.weight = module.weight / output_scale  # raises the TypeError:
            # dividing a Parameter by a float returns a plain Tensor, and nn.Module
            # only accepts nn.Parameter (or None) when assigning a registered parameter.
            # The new Parameter keeps module.weight's dtype, so no .to(...) call is needed.
            int8_module.weight = nn.Parameter(module.weight / output_scale)

            return int8_module

Reference link: https://discuss.pytorch.org/t/typeerror-cannot-assign-torch-floattensor-as-parameter-layer-weights-torch-nn-parameter-or-none-expected/94947
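
The same rule is easy to verify on any module (a minimal sketch using a plain nn.Linear, not repo code):

    import torch.nn as nn

    lin = nn.Linear(4, 4)
    try:
        lin.weight = lin.weight / 2.0  # plain Tensor -> TypeError
    except TypeError as e:
        print(e)  # cannot assign 'torch.FloatTensor' as parameter 'weight' ...
    lin.weight = nn.Parameter(lin.weight / 2.0)  # wrapping in nn.Parameter works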

Hope this helps everybody!

AniZpZ commented 7 months ago

Hi there! We really appreciate your advice. We will give it serious consideration and plan to address the issues later this week.