cpldcpu / BitNetMCU

Neural networks with low-bit weights on low-end 32-bit microcontrollers such as the CH32V003 RISC-V microcontroller and others
GNU General Public License v3.0

8bit quant #6

Closed: panjea closed this issue 1 month ago

panjea commented 1 month ago

The constructor in the base class supports 8bit; see line 45: elif self.QuantType == '8bit':

    def __init__(self, QuantType='Binary', WScale='PerTensor'):
        self.QuantType = QuantType
        self.WScale = WScale
        self.s = torch.nn.Parameter(torch.tensor(1.0))
        self.s.requires_grad = False  # no gradient for clipping scalar

        # Bits per weight (bpw) for each supported quantization type
        if self.QuantType in ['Binary', 'BinarySym']:
            self.bpw = 1
        elif self.QuantType in ['2bitsym']:
            self.bpw = 2
        elif self.QuantType in ['Ternary']:
            self.bpw = 1.6  # ternary needs ~log2(3) ≈ 1.58 bits per weight
        elif self.QuantType in ['4bit', '4bitsym', 'FP130']:
            self.bpw = 4
        elif self.QuantType == '5bitsym':
            self.bpw = 5
        elif self.QuantType == '8bit':
            self.bpw = 8
        else:
            raise AssertionError(f"Invalid QuantType: {self.QuantType}")

        if self.WScale not in ['PerOutput', 'PerTensor']:
            raise AssertionError(f"Invalid WScale: {self.WScale}. Expected one of: 'PerTensor', 'PerOutput'")

But when I pass my model to export_to_hfile, I get this error:

Layer: L1 Quantization type: <8bit>, Bits per weight: 8, Num. incoming: 12,  Num outgoing: 24
Skipping layer L1 with quantization type 8bit and 8 bits per weight. Quantization type not supported.
Traceback (most recent call last):
...
  File ".....py", line 85, in export_to_hfile
    reshaped_array = encoded_weights.reshape(-1, weight_per_word)
                     ^^^^^^^^^^^^^^^
UnboundLocalError: cannot access local variable 'encoded_weights' where it is not associated with a value
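
The traceback follows directly from the control flow: for an unsupported quantization type the code only prints the "Skipping layer" warning, so encoded_weights is never assigned, yet execution still falls through to the reshape call. A minimal sketch of the failure pattern and an early-exit guard, assuming a structure like the one below (the function and its encoding are illustrative reconstructions, not the repository's actual export_to_hfile):

    import numpy as np

    def encode_layer(weights, quant_type, weight_per_word=8):
        """Illustrative reconstruction of the failure mode, not the actual source."""
        if quant_type == '4bitsym':
            # hypothetical 4 bit symmetric encoding: offset codes into [0, 15]
            encoded_weights = np.clip(np.round(weights * 7.5 + 7.5), 0, 15).astype(np.uint32)
        else:
            # Before the fix, this branch only printed a warning and fell
            # through, leaving encoded_weights unbound -> UnboundLocalError.
            raise NotImplementedError(
                f"Quantization type {quant_type} not supported for export.")
        return encoded_weights.reshape(-1, weight_per_word)

    # Unsupported types now fail loudly instead of crashing later:
    # encode_layer(np.zeros(96), '8bit')  -> NotImplementedError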
cpldcpu commented 1 month ago

Yes, I had not implemented 8 bit support for export. It's easy to add, though; I just pushed the update.

The inference code does not yet support 8 bit, though.

Generally, 8 bit weights are not needed with quantization-aware training (QAT); 4 bit weights are typically the most efficient.
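
Lower-bit weights also pack more densely into the 32 bit words the exporter emits, which is what weight_per_word in the traceback above refers to: at 4 bits, eight weights fit in one word. A hedged sketch of such packing (hypothetical pack_weights helper, not the repository's actual packer):

    import numpy as np

    def pack_weights(quantized, bits_per_weight):
        """Pack unsigned weight codes into 32-bit words, MSB-first.

        `quantized` is a flat array of integer codes already offset into
        the unsigned range [0, 2**bits_per_weight - 1].
        Hypothetical illustration, not the repository's actual packer.
        """
        weight_per_word = 32 // bits_per_weight   # e.g. 8 weights per word at 4 bit
        assert len(quantized) % weight_per_word == 0, "pad the weight array first"
        reshaped = np.asarray(quantized, dtype=np.uint32).reshape(-1, weight_per_word)
        words = np.zeros(len(reshaped), dtype=np.uint32)
        for i in range(weight_per_word):
            words = (words << bits_per_weight) | reshaped[:, i]
        return words

    # Eight 4-bit codes -> one 32-bit word
    print(hex(pack_weights([1, 2, 3, 4, 5, 6, 7, 8], 4)[0]))  # 0x12345678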

cpldcpu commented 1 month ago

Closed.