DehuaTang opened 3 months ago
Thank you for your interest. It is feasible to reach 99.9% of the reference ROUGE scores without fine-tuning for the 2:4-sparsified LLaMA-70B-Chat model. We used the ModelOpt sparsity package to accomplish this. The key difference may lie in the calibration dataset: we randomly selected a subset of the Open-Orca dataset, making sure that the MLPerf test samples were excluded.
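For anyone unfamiliar with the pattern being discussed: 2:4 structured sparsity means that in every group of 4 consecutive weights, at most 2 are nonzero. Below is a minimal, hedged sketch of the simplest variant, magnitude-based 2:4 pruning of a weight row. This is only an illustration of the sparsity pattern itself, not the calibration-aware algorithm ModelOpt actually applies (which selects weights using calibration data, SparseGPT-style); the function name is hypothetical.

```python
def sparsify_2_4(weights):
    """Illustrative 2:4 pruning: in each group of 4 consecutive weights,
    zero out the 2 smallest-magnitude entries and keep the 2 largest."""
    assert len(weights) % 4 == 0, "row length must be a multiple of 4"
    out = []
    for i in range(0, len(weights), 4):
        group = weights[i:i + 4]
        # indices of the 2 largest-magnitude entries in this group
        keep = sorted(range(4), key=lambda j: abs(group[j]), reverse=True)[:2]
        out.extend(v if j in keep else 0.0 for j, v in enumerate(group))
    return out

row = [0.9, -0.1, 0.05, -1.2, 0.3, 0.2, -0.7, 0.01]
print(sparsify_2_4(row))  # [0.9, 0.0, 0.0, -1.2, 0.3, 0.0, -0.7, 0.0]
```

The appeal of 2:4 over unstructured sparsity is that the pattern maps directly onto NVIDIA sparse tensor core hardware, so the 50% sparsity translates into actual speedup rather than just smaller storage.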
That's amazing! None of the published 2:4 sparsity papers manage to reach 99.9% accuracy. Do you have any plans to release the full code? It would help reintroduce the industry to the impact of sparsity on LLMs.
Outstanding work! Thanks for the effort you put in! Is it really possible to achieve 99.9% accuracy without fine-tuning for LLaMA-70B-Chat on the MLPerf task with 2:4 sparsity? I reproduced this using MTO and found that it only achieves 98% accuracy in fp16. Could you give me some suggestions for reproducing this work? Which hyperparameters need to be adjusted? Is fp8 fine-tuning necessary to achieve 99.9% accuracy?