Deelvin / mlc-llm

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
https://mlc.ai/mlc-llm
Apache License 2.0

Theoretical analysis #3

Open vvchernov opened 8 months ago

vvchernov commented 8 months ago

Theoretical analysis of different cases of data distribution in activations and weights. Base parameters: the dispersion of the context values and of the outliers, the distance between the two groups, the size of the matrix, and the number of outliers. Some general considerations: the number of outliers is smaller than the number of context values; the context dispersion can be of the order of the distance or less; the outlier dispersion is much less than the distance.
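One possible way to write these considerations down, purely as notation. The symbols are assumptions introduced here, not taken from the comment: an $n \times m$ matrix with $k$ outliers, context dispersion $\sigma_c$, outlier dispersion $\sigma_o$, and distance $d$ between the centers of the two groups. Dispersion is read here as a width (standard deviation) so that it is directly comparable to $d$.

$$
k < nm - k, \qquad \sigma_c \lesssim d, \qquad \sigma_o \ll d
$$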

As a first step, the following can be assumed: the matrices are square; the weight data is distributed around zero and contains no outliers; the number of outliers is much smaller than the total number of values, but can be of the order of the matrix size.
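The first-step setup can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not a method taken from the issue: both groups are modeled as Gaussians, the context is centered at zero, the outliers sit on the positive side only, "matrix size" is read as the side length `n`, and all function and parameter names (`make_activations`, `make_weights`, `context_std`, and so on) are hypothetical.

```python
import numpy as np


def make_activations(n, num_outliers, context_std, outlier_std, distance, rng):
    """Build a square n x n activation matrix with a few outliers.

    Assumption: context values ~ N(0, context_std^2) and
    outliers ~ N(distance, outlier_std^2); the issue does not fix
    the shape of either distribution.
    """
    flat = rng.normal(0.0, context_std, size=n * n)
    # Choose distinct positions so exactly num_outliers entries are replaced.
    idx = rng.choice(n * n, size=num_outliers, replace=False)
    flat[idx] = rng.normal(distance, outlier_std, size=num_outliers)
    mask = np.zeros(n * n, dtype=bool)
    mask[idx] = True
    return flat.reshape(n, n), mask.reshape(n, n)


def make_weights(n, weight_std, rng):
    """Square weight matrix distributed around zero, with no outliers."""
    return rng.normal(0.0, weight_std, size=(n, n))


rng = np.random.default_rng(0)
n = 64

# Number of outliers of the order of the matrix side (n), far below n * n.
x, outlier_mask = make_activations(
    n,
    num_outliers=n,
    context_std=1.0,   # context dispersion, of the order of the distance or less
    outlier_std=0.1,   # outlier dispersion, much less than the distance
    distance=20.0,
    rng=rng,
)
w = make_weights(n, weight_std=0.02, rng=rng)

# Product whose output statistics the analysis would study.
y = x @ w
```

Varying `distance`, `context_std`, `outlier_std`, and `num_outliers` then gives a quick empirical check against whatever closed-form results the analysis produces.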