RuntimeError: Neuron runtime cannot be initialized; cannot determine the number of available NeuronCores

aws-neuron / aws-neuron-samples

RuntimeError: Neuron runtime cannot be initialized; cannot determine the number of available NeuronCores #44

Status: Closed (eric80116 closed this issue 9 months ago)

eric80116 commented 9 months ago

I tried to run hf_pretrained_sd2_512_inference.ipynb on an inf2.8xlarge instance with NeuronX Compiler version 2.10.0.34+6c8792c6f. Compilation finished successfully, but loading the model fails with this error:

RuntimeError: Neuron runtime cannot be initialized; cannot determine the number of available NeuronCores

The error is raised when I try to load the UNet onto the NeuronCores with the following line:

pipe.unet.unetwrap = torch_neuronx.DataParallel(torch.jit.load(unet_filename), device_ids, set_dynamic_batching=False)

Any idea? Thanks.

aws-qing commented 9 months ago

Hi @eric80116,

Thanks for reporting this issue. I believe this may be caused by other ipynb processes that did not terminate properly. Please run sudo pkill -9 python to kill all Python and ipynb processes, then restart the notebook kernel. When you run the model again, make sure the line ERROR TDRV:tdrv_get_dev_info No neuron device available does not appear in the output.

You can also use htop to check whether any unwanted Python processes are still running.
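
If you would rather inspect the leftover processes before killing everything, here is a minimal sketch that lists them from Python. It assumes a Linux host where the standard ps command is available. The helper names find_python_processes and list_stale_candidates are hypothetical, not part of the Neuron SDK. It matches on the process name containing "python", which only approximates what pkill python would select.

```python
import os
import subprocess


def find_python_processes(ps_output, own_pid):
    """Return (pid, command line) pairs for python processes other than own_pid.

    ps_output is the text produced by `ps -eo pid,comm,args`.
    """
    matches = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        pid = int(parts[0])
        comm = parts[1]
        args = parts[2] if len(parts) > 2 else comm
        # Match on the process name, roughly what `pkill python` would select.
        if pid != own_pid and "python" in comm.lower():
            matches.append((pid, args))
    return matches


def list_stale_candidates():
    """Run ps and return python processes other than the current one."""
    result = subprocess.run(
        ["ps", "-eo", "pid,comm,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_python_processes(result.stdout, os.getpid())


if __name__ == "__main__":
    for pid, command in list_stale_candidates():
        print(pid, command)
```

Run it from a terminal rather than from the notebook you are debugging. Any notebook kernels it lists (for example ipykernel_launcher entries) are candidates to kill before restarting the kernel.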

eric80116 commented 9 months ago

Hi Qing, thanks for the help. It works now.