rsk2327 / DistAya

Apply pruning algorithm #2

Open yaya-sy opened 4 months ago

yaya-sy commented 4 months ago

The sensitivity scores in results/sensitivities/ppl_sentivities_ppl.csv represent the importance of each layer: how much perplexity degrades when that layer is dropped. For each layer, we have the perplexity of Aya-23 8B with that layer removed. Perplexity is computed on the WikiText dataset.
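As a sketch of how these scores can be ranked, here is a minimal stdlib example. The column names `layer` and `perplexity` are assumptions for illustration; the actual headers in ppl_sentivities_ppl.csv may differ, and the values below are made up.

```python
import csv
import io

# Hypothetical sample standing in for ppl_sentivities_ppl.csv:
# one row per layer, with the model's perplexity when that layer is removed.
sample_csv = """layer,perplexity
0,14.2
1,9.1
2,11.8
3,8.7
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# A higher perplexity after removal means the layer mattered more,
# so sorting by perplexity ascending ranks layers least-important first.
ranked = sorted(rows, key=lambda r: float(r["perplexity"]))
least_important = [int(r["layer"]) for r in ranked]
print(least_important)  # layers ordered from least to most important
```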

We want to create an algorithm that prunes the model based on these sensitivity scores. This algorithm can be as simple as:

```
llm <- the original LLM to prune
t <- compression rate: the number of layers to remove
layer_indices <- layers sorted by sensitivity in increasing order
                 (least important first: smallest perplexity increase when removed)
pruned_layers <- layer_indices[:t]
pruned_llm <- remove pruned_layers from llm
```

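The steps above can be made concrete with a small self-contained sketch. The layers here are plain labels for illustration; in the actual Aya-23 8B checkpoint the transformer blocks would live in something like `model.model.layers` (a `torch.nn.ModuleList`), which is an assumption about the HuggingFace layout rather than something verified here.

```python
def prune_layers(layers, sensitivities, t):
    """Return the layers kept after dropping the t least-sensitive ones.

    layers: list of layer objects (plain labels in this sketch)
    sensitivities: sensitivities[i] = perplexity with layer i removed
    t: number of layers to remove
    """
    # Least important first: smallest perplexity when the layer is removed.
    order = sorted(range(len(layers)), key=lambda i: sensitivities[i])
    to_drop = set(order[:t])
    return [layer for i, layer in enumerate(layers) if i not in to_drop]

# Toy example: layers 3 and 1 hurt perplexity the least when removed.
kept = prune_layers(["L0", "L1", "L2", "L3"], [14.2, 9.1, 11.8, 8.7], t=2)
print(kept)  # → ['L0', 'L2']
```

Note that the kept layers stay in their original order, which matters when rebuilding the pruned model.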
cataluna84 commented 3 months ago

@yaya-sy, I used a classic pruning algorithm, but couldn't come close to ShortGPT's results. The algorithm to be designed around each layer's sensitivity scores could be modeled on other open-source repos.

I have some free time now, so should I go ahead or have you completed this task?