analogdevicesinc / ai8x-training

Model Training for ADI's MAX78000 and MAX78002 Edge AI Devices
Apache License 2.0

ReLU independent layer for the whole pipeline #332

Open Marie-TK opened 1 month ago

Marie-TK commented 1 month ago

It seems this was already discussed in #305; however,

class ReLU(nn.Module):
    def __init__(self,):
        super().__init__()

        if dev.simulate:
            self.quantize = Quantize(num_bits=dev.DATA_BITS)
            bits = dev.FC_ACTIVATION_BITS
            self.clamp = Clamp(min_val=-(2**(bits-1)), max_val=2**(bits-1)-1)
        else:
            self.quantize = Empty()
            self.clamp = Clamp(min_val=-1., max_val=127./128.)

        self.activate = nn.ReLU(inplace=True)

    def forward(self, x):  # pylint: disable=arguments-differ
        """Forward prop"""
        x = self.clamp(self.quantize(self.activate(x)))
        return x

When I evaluate the quantized model, I get all-zero inference results for the test samples. During evaluation, execution goes into the if dev.simulate: branch, but is the step below still needed when the model is already quantized?

self.quantize = Quantize(num_bits=dev.DATA_BITS)

This function ends up setting each value to either 0 or 1. Those values are then passed on to the later layers, even though those layers expect values between -128 and +127.
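
For context, here is a minimal, self-contained sketch of the effect I am describing. It is not the actual ai8x.py implementation (whose details may differ); it only assumes that the simulate-mode quantize step expects its input to already be scaled to the device's integer range and rounds it, so feeding it activations that are still in the [0, 1) range collapses nearly everything to 0 or 1:

import torch
import torch.nn as nn

class RoundToInteger(nn.Module):
    """Hypothetical stand-in for the simulate-mode quantize step: it
    assumes the activations are already scaled to integer codes
    (e.g. -128..127) and simply rounds them."""
    def forward(self, x):
        return torch.round(x)

# After ReLU in the non-simulate path, activations are fractions in [0, 127/128].
x = torch.rand(8) * (127.0 / 128.0)
print(RoundToInteger()(x))  # nearly every value collapses to 0. or 1.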

When I set self.quantize = Empty(), it seems to work correctly, but I am not sure whether that is the right solution.
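
To check where the values actually collapse, independent of which fix is correct, I print the range of every layer's output during one evaluation pass with forward hooks. This is just a generic PyTorch sketch; model and sample are placeholders for the evaluated network and one preprocessed input batch:

import torch

def report_activation_ranges(model, sample):
    """Print the min/max of every module's output for one forward pass."""
    handles = []

    def make_hook(name):
        def hook(_module, _inp, out):
            if isinstance(out, torch.Tensor):
                print(f"{name:40s} min={out.min().item():9.3f} max={out.max().item():9.3f}")
        return hook

    for name, module in model.named_modules():
        if name:  # skip the top-level container itself
            handles.append(module.register_forward_hook(make_hook(name)))

    model.eval()
    with torch.no_grad():
        model(sample)

    for handle in handles:
        handle.remove()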

ermanok commented 2 weeks ago

Hi,

Could you please share which model you are trying to evaluate and explain in more detail how you quantized it?
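
If it helps to narrow this down, one quick check is whether the stored weights are already integer-valued, which would indicate a post-training-quantized checkpoint. This is only a sketch; the checkpoint path is a placeholder and it assumes a standard PyTorch checkpoint dictionary:

import torch

checkpoint = torch.load("path/to/quantized_checkpoint.pth.tar", map_location="cpu")
state_dict = checkpoint.get("state_dict", checkpoint)

for name, tensor in state_dict.items():
    if tensor.is_floating_point():
        integer_valued = torch.equal(tensor, tensor.round())
        print(f"{name:50s} integer-valued={integer_valued} "
              f"min={tensor.min().item():.3f} max={tensor.max().item():.3f}")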