-
Hi,
I am trying to prune Mistral 7B (https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2). I was able to run the magnitude-pruning commands successfully, but I am facing issues with…
-
Hi! If I do not want to quantize and only want to perform structured pruning, is it okay to set quantize: false in the recipe as below and omit the QuantizationModifier?
SparseGPTModif…
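For reference, a hypothetical recipe fragment along these lines. The stage and field names here are an assumption modeled on SparseML-style YAML recipes and are not verified against this repo's schema, so treat it as a sketch of the idea (pruning modifier present, quantization disabled), not a working config:

```yaml
# hypothetical sketch: structured pruning only, quantization disabled
test_stage:
  pruning_modifiers:
    SparseGPTModifier:
      sparsity: 0.5
      mask_structure: "2:4"     # structured N:M sparsity (assumed field name)
      quantize: false           # the flag the question asks about
  # no QuantizationModifier listed anywhere in the recipe
```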
-
Hi, and thanks for the amazing repo.
I have a bit of a tall request. SparseGPT uses a per-layer optimal brain surgeon (OBS) approach to pruning. Here is the [pytorch code](https://github.com/IST-DASLab/spar…
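For readers unfamiliar with the per-layer OBS rule mentioned above, here is a minimal NumPy sketch of the idea on a single weight row: repeatedly zero the weight with the smallest saliency w_q² / [H⁻¹]_qq and update the surviving weights to compensate. The function name `obs_prune_row` and the dense inverse-Hessian input are assumptions for illustration; the real SparseGPT code linked in the request works column-blockwise on GPU for efficiency.

```python
import numpy as np

def obs_prune_row(w, H_inv, num_prune):
    """OBS-style pruning of one weight row.

    w:      1-D weight vector
    H_inv:  inverse of the (layer-wise) Hessian, same dimension as w
    Returns the updated weights and a keep-mask.
    """
    w = w.astype(float).copy()
    H_inv = H_inv.copy()
    mask = np.ones_like(w, dtype=bool)
    for _ in range(num_prune):
        # saliency of removing each still-alive weight
        saliency = np.where(mask, w**2 / np.diag(H_inv), np.inf)
        q = int(np.argmin(saliency))
        # compensate the remaining weights for removing w_q
        w -= w[q] / H_inv[q, q] * H_inv[:, q]
        w[q] = 0.0
        mask[q] = False
        # eliminate row/column q from H_inv so later steps ignore it
        H_inv -= np.outer(H_inv[:, q], H_inv[q, :]) / H_inv[q, q]
        H_inv[q, q] = 1.0  # placeholder to avoid division by zero
    return w, mask
```

With an identity Hessian the update term vanishes and this degenerates to plain magnitude pruning, which is a handy sanity check.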
-
Sparsity should reduce the model size and increase inference speed without hurting performance too much. This repo https://github.com/IST-DASLab/sparsegpt is Apache-licensed and may be useful (I hope).
-
When will the tool be ready for use?
-
-
```
(textgen) [root@pve-m7330 sparsegpt]# python llama.py ../text-generation-webui/models/TinyLlama-1.1B-Chat-v1.0/ wikitext2 --nsamples 10
Token indices sequence length is longer than the specified…
```
-
**Describe the bug**
A clear and concise description of what the bug is.
**Hardware details**
Information about CPU and GPU, such as RAM, number, etc.
**Software version**
Version of relevant…
-
-
Hello, I find running SparseGPT extremely slow with tensor parallelism tp=1 and pipeline parallelism pp=1. Would larger values help? Thank you!