aws-neuron / aws-neuron-samples

Example code for AWS Neuron SDK developers building inference and training applications

Running llama13b on inf2.24x #58

Closed (sayli-ds closed 7 months ago)

sayli-ds commented 7 months ago

The llama13b notebook runs fine on an inf2.48x instance. When running it on an inf2.24x instance, I reduced tp_degree from 24 to 12, but the code throws an error at the following step:

neuron_model = LlamaForSampling.from_pretrained('./Llama-2-13b-split', batch_size=1, tp_degree=12, amp='f16')
neuron_model.to_neuron()

Error: FileNotFoundError: [Errno 2] No such file or directory: 'neuronx-cc'

Is this notebook supported on a 24x instance? Or what else might be missing? The environment setup is the same in both cases.
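The error says that the `neuronx-cc` executable could not be launched, which usually means it is not on the `PATH` of the process running the notebook. A minimal sketch for checking this from the same kernel is below. The helper name `locate_tool` is hypothetical, and the fallback of looking next to the running interpreter is an assumption about how the environment might be laid out (for example a virtualenv whose `bin/` directory is not on `PATH`); it is not something the Neuron SDK itself does.

```python
import os
import shutil
import sys


def locate_tool(name="neuronx-cc"):
    """Return the path to `name` if it can be found, else None.

    Hypothetical helper for diagnosis only; not part of any Neuron package.
    """
    # Look on PATH first, which is where a subprocess launch would search.
    found = shutil.which(name)
    if found:
        return found
    # Assumption: the tool may sit next to the running interpreter
    # (e.g. a virtualenv's bin/ directory) without that directory being on PATH.
    candidate = os.path.join(os.path.dirname(sys.executable), name)
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return None


if __name__ == "__main__":
    path = locate_tool()
    if path is None:
        print("neuronx-cc not found on PATH or next to", sys.executable)
    else:
        print("neuronx-cc found at", path)
```

If this reports that the compiler is missing on the inf2.24x instance but finds it on the inf2.48x one, the two environments are likely not identical after all, or the notebook kernel is using a different Python environment than the one where the Neuron packages were installed.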