Closed: BDHU closed this issue 1 year ago
Hi @BDHU, it is entirely possible to see speedups on Llama 2 models similar to those on MPT, since the two share a very similar architecture! We don't have sparsified examples of it yet, but we are working on them and hope to share in the coming weeks.
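In the meantime, a framework-agnostic way to sanity-check a speedup claim is to time the dense and sparse forward passes side by side. The sketch below is only a timing harness: `run_dense` and `run_sparse` are placeholder workloads (assumptions, not real model calls), so you would swap in your actual dense and sparsified Llama 2 inference calls.

```python
# Minimal, framework-agnostic sketch for measuring inference speedup.
# run_dense / run_sparse are stand-ins: substitute your actual dense
# and sparsified Llama 2 forward passes.
import time


def benchmark(fn, warmup=2, iters=10):
    """Return mean latency in seconds over `iters` timed calls."""
    for _ in range(warmup):  # warm caches / JIT before timing
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters


# Placeholder workloads so the sketch runs on its own; replace with real models.
run_dense = lambda: sum(i * i for i in range(200_000))
run_sparse = lambda: sum(i * i for i in range(100_000))

dense_ms = benchmark(run_dense) * 1000
sparse_ms = benchmark(run_sparse) * 1000
print(f"dense:  {dense_ms:.2f} ms/iter")
print(f"sparse: {sparse_ms:.2f} ms/iter")
print(f"speedup: {dense_ms / sparse_ms:.2f}x")
```

For real measurements you would also want to fix the prompt length and batch size across runs, since generation latency depends heavily on both.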
@BDHU If you want to be proactively alerted when our Llama 2 developments land, I strongly encourage you and others reading this to sign up for our newsletter at the bottom right of https://neuralmagic.com/contact/ The traffic is low, I promise! We are very excited to bring these product developments to you! I will cross-reference your other similar inquiry just in case and close out this thread. Feel free to reopen if you have additional questions.
Best, Jeannie // Neural Magic
Is it possible to test the speedup gained by applying sparsity to Llama 2? If so, are there any tutorials? Thanks!