**Closed** — choprahetarth closed this issue 3 months ago
Hi! Yes, SVD-LLM can compress the model (tensor) to a smaller size, but it does not change the shape of the input and output channels. For example, the algorithm replaces the original weight matrix W of shape 128x128 with the product of two smaller matrices, A: 128x16 and B: 16x128. In this way, the number of stored values is reduced from 128x128=16384 to 128x16x2=4096, while the product A·B still has the original 128x128 shape.
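The replacement above can be sketched with a truncated SVD. This is a minimal NumPy illustration of the general idea (not the SVD-LLM implementation itself, which uses a data-aware whitening step before the decomposition); the rank of 16 is chosen to match the example numbers:

```python
import numpy as np

# Toy stand-in for a 128x128 weight matrix W.
rng = np.random.default_rng(0)
W = rng.standard_normal((128, 128))

# Truncated SVD: keep only the top `rank` singular directions.
U, S, Vt = np.linalg.svd(W, full_matrices=False)
rank = 16
A = U[:, :rank] * S[:rank]   # 128x16
B = Vt[:rank, :]             # 16x128

W_approx = A @ B             # same 128x128 shape as the original W

print(A.shape, B.shape)              # (128, 16) (16, 128)
print(A.size + B.size, W.size)       # 4096 16384
print(W_approx.shape)                # (128, 128)
```

Note that at inference time you never materialize `W_approx`: you compute `x @ A.T` then `(...) @ B.T` (or the equivalent two linear layers), so both storage and compute shrink while the layer's input/output dimensions stay 128.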
Changing the shape of the input and output channels — e.g. from 128x128 to 64x64 — is a different kind of compression, achieved by structured pruning. You can refer to this paper for more detail: https://arxiv.org/abs/2305.11627
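For contrast, here is a minimal sketch of what structured pruning does to the shapes (a simple magnitude-based criterion is assumed here for illustration; the linked LLM-Pruner paper uses a more sophisticated, dependency-aware importance score):

```python
import numpy as np

# Toy 128x128 weight matrix; structured pruning removes whole
# rows (output channels) and columns (input channels).
rng = np.random.default_rng(0)
W = rng.standard_normal((128, 128))

keep = 64
# Keep the 64 rows and 64 columns with the largest L2 norm
# (an illustrative criterion, not the paper's actual score).
rows = np.sort(np.argsort(np.linalg.norm(W, axis=1))[-keep:])
cols = np.sort(np.argsort(np.linalg.norm(W, axis=0))[-keep:])
W_pruned = W[np.ix_(rows, cols)]

print(W_pruned.shape)   # (64, 64)
```

Unlike the SVD approach, this genuinely shrinks the layer's interface, so the pruned input channels must be kept consistent with the pruned output channels of the preceding layer — which is why structured pruning requires tracing dependencies across the whole network.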
Hello! I was wondering if this code can be adapted to transform a tensor — say (32×128×128) — into a smaller tensor (8×64×64), i.e. to reduce the size of the LLM layer by layer.