ldery / Bonsai

Code for "Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes"
Apache License 2.0

May I ask if this tool is currently unable to perform pruning on GQA models? #4

Open yeliang2258 opened 2 months ago

yeliang2258 commented 2 months ago

May I ask whether this tool is currently unable to prune GQA models such as Llama2-70B or Llama3?

ldery commented 2 months ago

Hi, yes, we don't have the setup for GQA yet. It should be in the next release in a few weeks, when the repo is in a better state.

yeliang2258 commented 1 month ago

May I ask again: when will compression of GQA models (Llama3) be supported?

ldery commented 1 month ago

Hopefully by the end of this week. We'll keep this thread open.

Lhemamou commented 1 day ago

Any news?