Open nandeeka opened 1 month ago
Hi @nandeeka, this LLaVA model has too many parameters, so it runs out of memory when compiled for a single NeuronCore. We recommend using neuronx-distributed, which supports tensor parallelism to shard the weights across multiple NeuronCores: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/neuronx-distributed/index.html
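To illustrate the idea behind sharding weights with tensor parallelism, here is a minimal sketch in plain Python. It is not the neuronx-distributed API; the function names (`shard_columns`, `column_parallel_matmul`, `weight_bytes_per_shard`) are hypothetical, and the figures of roughly 7e9 parameters at 2 bytes each are assumptions for illustration, not measurements of this model.

```python
# Illustrative sketch only: plain Python, NOT the neuronx-distributed API.
# All function names below are hypothetical.

def matmul(x, w):
    """Multiply a row vector x (length n) by a matrix w (n rows, m columns)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]

def shard_columns(w, tp_degree):
    """Split the columns of w into tp_degree equal, contiguous shards."""
    cols = len(w[0])
    assert cols % tp_degree == 0, "columns must divide evenly across shards"
    step = cols // tp_degree
    return [[row[k * step:(k + 1) * step] for row in w] for k in range(tp_degree)]

def column_parallel_matmul(x, w, tp_degree):
    """Each shard computes its own slice of the output; slices are concatenated.

    On real hardware each shard would live on a different core, so no single
    core has to hold the full weight matrix.
    """
    out = []
    for shard in shard_columns(w, tp_degree):
        out.extend(matmul(x, shard))
    return out

def weight_bytes_per_shard(num_params, bytes_per_param, tp_degree):
    """Rough weight memory per shard, ignoring activations and compiler overhead."""
    return num_params * bytes_per_param / tp_degree

x = [1.0, 2.0]
w = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]

full = matmul(x, w)
sharded = column_parallel_matmul(x, w, tp_degree=2)
print(full == sharded)  # the sharded result matches the unsharded one

# Assumed figures: ~7e9 parameters, 2 bytes per parameter, 8 shards.
print(weight_bytes_per_shard(7_000_000_000, 2, 8))
```

The point of the sketch is that each shard holds only a fraction of the weights while the concatenated output is unchanged. Actual per-core memory use also depends on activations and compiler overhead, which are not modeled here.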
Hi @aws-yishanm, thanks for getting back to me. I will take a look and let you know if I have more questions. Thanks!
I am trying to compile LLaVA 1.5 7B for Neuron. As far as I can tell, the way to do this is to select some specific inputs and then trace the model's execution with those inputs. However, when I try to trace the model, I get the error:
I have seen this error with both trn1.2xlarge and trn1.32xlarge on the most recent Neuron DLAMI.
The source code to reproduce my setup is: