aws-neuron / aws-neuron-samples

Example code for AWS Neuron SDK developers building inference and training applications

SD_1_5 Unet Compile #64

Closed petertran1811 closed 6 months ago

petertran1811 commented 6 months ago

Hello everybody. I got this error when trying to compile the UNet for SD 1.5. The issue persists even after reducing the image dimensions to 256. Do you have any suggestions?

2023-12-26T08:37:54Z ERROR 26199 [job.WalrusDriver.0]: Backend exited with code -9 and stderr:
2023-12-26T08:37:54Z INFO 26191 [root]: Subcommand returned with exitcode=-9
2023-12-26T08:37:54Z ERROR 26191 [neuronxcc.driver.CommandDriver]:
2023-12-26T08:37:54Z ERROR 26191 [neuronxcc.driver.CommandDriver]: An Internal Compiler Error has occurred
2023-12-26T08:37:54Z ERROR 26191 [neuronxcc.driver.CommandDriver]:
2023-12-26T08:37:54Z ERROR 26191 [neuronxcc.driver.CommandDriver]:
2023-12-26T08:37:54Z USER 26191 [neuronxcc.driver.CommandDriver]: [F137] neuronx-cc was forcibly killed - This most commonly occurs due to insufficient system memory. Using a smaller data type, dimensions, batch size, or a larger instance type may help.
2023-12-26T08:37:54Z ERROR 26191 [neuronxcc.driver.CommandDriver]:
2023-12-26T08:37:54Z ERROR 26191 [neuronxcc.driver.CommandDriver]: Internal details:
2023-12-26T08:37:54Z ERROR 26191 [neuronxcc.driver.CommandDriver]: Type: <class 'RuntimeError'>
2023-12-26T08:37:54Z ERROR 26191 [neuronxcc.driver.CommandDriver]: File "neuronxcc/driver/CommandDriver.py", line 329, in neuronxcc.driver.CommandDriver.CommandDriver.run
2023-12-26T08:37:54Z USER 26191 [neuronxcc.driver.CommandDriver]:
2023-12-26T08:37:54Z USER 26191 [neuronxcc.driver.CommandDriver]: Diagnostic information:
2023-12-26T08:37:54Z USER 26191 [neuronxcc.driver.CommandDriver]: NeuronX Compiler version 2.12.54.0+f631c2365
2023-12-26T08:37:54Z USER 26191 [neuronxcc.driver.CommandDriver]:
2023-12-26T08:37:54Z USER 26191 [neuronxcc.driver.CommandDriver]: Python version 3.8.10
2023-12-26T08:37:54Z USER 26191 [neuronxcc.driver.CommandDriver]: HWM version 2.12.0.0-422c9037c
2023-12-26T08:37:54Z USER 26191 [neuronxcc.driver.CommandDriver]: NumPy version 1.24.4
2023-12-26T08:37:54Z USER 26191 [neuronxcc.driver.CommandDriver]:
2023-12-26T08:37:54Z USER 26191 [neuronxcc.driver.CommandDriver]: Running on AMI ami-0fdb13d8e11515ea4
2023-12-26T08:37:54Z USER 26191 [neuronxcc.driver.CommandDriver]: Running in region use1-az4
2023-12-26T08:37:54Z USER 26191 [neuronxcc.driver.CommandDriver]:
2023-12-26T08:37:54Z USER 26191 [neuronxcc.driver.CommandDriver]: Diagnostic logs stored in /home/ubuntu/dungtt/AI-Art/log-neuron-cc.txt
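
The exit code -9 reported in the log is consistent with the convention Python's `subprocess` module uses on POSIX systems: a negative return code `-N` means the child process was terminated by signal `N`, and signal 9 is `SIGKILL`, the signal the Linux out-of-memory killer sends. The following is a minimal sketch of decoding such a code, assuming the compiler driver reports exit codes in that convention; the helper name `describe_exit_code` is hypothetical and not part of the Neuron SDK.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe an exit code that follows the subprocess convention."""
    if code < 0:
        # A negative code -N means the process was killed by signal N.
        try:
            name = signal.Signals(-code).name
        except ValueError:
            # The signal number is not defined on this platform.
            name = f"signal {-code}"
        return f"killed by {name}"
    if code == 0:
        return "exited successfully"
    return f"exited with status {code}"


# The value reported by the failing compile job above.
print(describe_exit_code(-9))
```

A `SIGKILL` by itself does not prove an out-of-memory condition, but together with the [F137] message it points in that direction.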

petertran1811 commented 6 months ago

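It was a memory error: the instance did not have enough system memory for the compiler, as the [F137] message suggests.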
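
Because the failure comes down to system memory on the compile host, it can help to check how much memory is available before starting a long compilation. The following is a minimal sketch, assuming a Linux host where `/proc/meminfo` is readable. The helper names and the 64 GiB threshold are illustrative assumptions, not values taken from the Neuron documentation.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; count fields such as
    HugePages_Total are returned as plain integers.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def available_gib(meminfo_text: str) -> float:
    """Return available memory in GiB, falling back to MemFree."""
    info = parse_meminfo(meminfo_text)
    kib = info.get("MemAvailable", info.get("MemFree", 0))
    return kib / (1024 ** 2)


if __name__ == "__main__":
    REQUIRED_GIB = 64.0  # hypothetical threshold; tune for your model
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        free = available_gib(meminfo.read_text())
        if free < REQUIRED_GIB:
            print(f"Only {free:.1f} GiB available; the compile may be killed.")
        else:
            print(f"{free:.1f} GiB available.")
```

If the available figure is well below what the compile needs, the options named in the [F137] message apply: a smaller data type, smaller dimensions or batch size, or a larger instance type.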