aws-neuron / aws-neuron-samples

Example code for AWS Neuron SDK developers building inference and training applications

Issue with compiling SD1.5 based model with neuronx #13

Closed Atik16209 closed 1 year ago

Atik16209 commented 1 year ago

Hi! I am trying to compile an SD1.5-based model with neuronx, following this example: https://github.com/aws-neuron/aws-neuron-samples/blob/master/torch-neuronx/inference/hf_pretrained_sd2_512_inference.ipynb

What I did:

1) Launched an AWS EC2 inf2.8xlarge instance.

2) Ran:

    sudo apt-get install linux-headers-$(uname -r) -y
    sudo apt-get install aws-neuronx-dkms --allow-change-held-packages -y
    source /opt/aws_neuron_venv_pytorch/bin/activate

3) Followed the guide for SD2, but commented out the cross_attention modification and changed the shape of encoder_hidden_states to match SD1.5.

All parts except unet compile fine, but unet fails with an error.

Here is the code that fails and attached are the error log and traceback log:

    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    pipe.unet = NeuronUNet(UNetWrap(pipe.unet))
    unet = copy.deepcopy(pipe.unet.unetwrap)
    del pipe

    sample_1b = torch.randn([1, 4, 64, 64]).bfloat16()
    timestep_1b = torch.tensor(999).bfloat16().expand((1,))
    encoder_hidden_states_1b = torch.randn([1, 77, 768]).bfloat16()
    example_inputs = sample_1b, timestep_1b, encoder_hidden_states_1b

    unet_neuron = torch_neuronx.trace(
        unet,
        example_inputs,
        compiler_workdir=os.path.join(COMPILER_WORKDIR_ROOT, 'unet'),
        compiler_args=["--model-type=unet-inference"]
    )
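For readers comparing this with the SD2 notebook, the shape change mentioned in step 3 can be sketched as follows. This is only an illustration: the helper `unet_example_shapes` and the two constants are hypothetical names, not part of the notebook or of any library. It relies on SD1.5 conditioning its UNet on 768-wide text embeddings and SD2.x on 1024-wide ones, with 4x64x64 latents for 512-pixel images.

```python
# Hypothetical helper (not from the notebook): shapes of the three example
# inputs passed to the UNet when tracing it for 512x512 generation.

SD15_CROSS_ATTENTION_DIM = 768   # SD1.5 text-embedding width
SD2_CROSS_ATTENTION_DIM = 1024   # SD2.x text-embedding width


def unet_example_shapes(cross_attention_dim, batch=1, latent_size=64, seq_len=77):
    """Return the tensor shapes for (sample, timestep, encoder_hidden_states)."""
    return {
        "sample": (batch, 4, latent_size, latent_size),    # 4 latent channels
        "timestep": (batch,),                              # one timestep per item
        "encoder_hidden_states": (batch, seq_len, cross_attention_dim),
    }


sd15 = unet_example_shapes(SD15_CROSS_ATTENTION_DIM)
sd2 = unet_example_shapes(SD2_CROSS_ATTENTION_DIM)

# Only the text-conditioning input differs between the two model families.
print("SD1.5:", sd15["encoder_hidden_states"])
print("SD2.x:", sd2["encoder_hidden_states"])
```

Rather than hard-coding the width, it is probably safer to read it from the loaded model (in diffusers, the UNet config should expose it as `cross_attention_dim`) so the example inputs always match the checkpoint being traced.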

traceback.log error.log

Is this a bug in neuronx, or am I doing something wrong? Thanks

jeffhataws commented 1 year ago

Hi @Atik16209, thanks for reporting the issue. We will take a look and get back to you.

aws-donkrets commented 1 year ago

Hi @Atik16209, the recent (2.11) release of the Neuron SDK should include a fix for your issue. Please upgrade to it and let us know whether you can successfully compile your model.
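Before retrying the compile after an upgrade, it can help to confirm which package versions are actually active in the virtual environment. The sketch below uses only the Python standard library. The helper name `installed_versions` is made up for illustration, and the default package list assumes the usual pip distribution names (`neuronx-cc`, `torch-neuronx`, `torch`). Note that individual packages carry their own version numbers, so the output will not literally read "2.11"; the SDK release notes list which package versions belong to a given release.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("neuronx-cc", "torch-neuronx", "torch")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the active environment.
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

Run it with the same interpreter that runs the notebook (for example, after activating `/opt/aws_neuron_venv_pytorch`), since a system-wide Python may report different packages.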

Atik16209 commented 1 year ago

Hi! Sorry for the late reply. Thank you, compilation works now.