rsk2327 / DistAya

Apply pruning algorithm #2

Open yaya-sy opened 4 months ago

yaya-sy commented 4 months ago

The sensitivity scores in results/sensitivities/ppl_sentivities_ppl.csv represent the importance of each layer: how much perplexity degrades when that layer is dropped. For each layer, we have the perplexity of Aya-23 8B with that layer removed. Perplexity is computed on the WikiText dataset.
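As a sketch of how these scores can be ranked, here is a minimal stdlib example. The column names `layer` and `perplexity` are assumptions for illustration; the actual headers in ppl_sentivities_ppl.csv may differ, and the values below are made up.

```python
import csv
import io

# Hypothetical sample standing in for ppl_sentivities_ppl.csv:
# one row per layer, with the model's perplexity when that layer is removed.
sample_csv = """layer,perplexity
0,14.2
1,9.1
2,11.8
3,8.7
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# A higher perplexity after removal means the layer mattered more,
# so sorting by perplexity ascending ranks layers least-important first.
ranked = sorted(rows, key=lambda r: float(r["perplexity"]))
least_important = [int(r["layer"]) for r in ranked]
print(least_important)  # layers ordered from least to most important
```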

We want to create an algorithm that prunes the model based on these sensitivity scores. This algorithm can be as simple as:

```
llm <- the original LLM to prune
t <- compression rate: the number of layers to remove
layer_indices <- layers sorted by sensitivity in increasing order
                 (least important first: smallest perplexity increase when removed)
pruned_layers <- layer_indices[:t]
pruned_llm <- remove pruned_layers from llm
```

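The steps above can be made concrete with a small self-contained sketch. The layers here are plain labels for illustration; in the actual Aya-23 8B checkpoint the transformer blocks would live in something like `model.model.layers` (a `torch.nn.ModuleList`), which is an assumption about the HuggingFace layout rather than something verified here.

```python
def prune_layers(layers, sensitivities, t):
    """Return the layers kept after dropping the t least-sensitive ones.

    layers: list of layer objects (plain labels in this sketch)
    sensitivities: sensitivities[i] = perplexity with layer i removed
    t: number of layers to remove
    """
    # Least important first: smallest perplexity when the layer is removed.
    order = sorted(range(len(layers)), key=lambda i: sensitivities[i])
    to_drop = set(order[:t])
    return [layer for i, layer in enumerate(layers) if i not in to_drop]

# Toy example: layers 3 and 1 hurt perplexity the least when removed.
kept = prune_layers(["L0", "L1", "L2", "L3"], [14.2, 9.1, 11.8, 8.7], t=2)
print(kept)  # → ['L0', 'L2']
```

Note that the kept layers stay in their original order, which matters when rebuilding the pruned model.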
cataluna84 commented 3 months ago

@yaya-sy, I used a classic pruning algorithm, but couldn't come close to ShortGPT's results. The algorithm to be designed around each layer's sensitivity scores could be modeled on other open-source repos.

I have some free time now, so should I go ahead or have you completed this task?