hanyuxuan123 opened this issue 10 months ago
Have you solved it? I'm running into the same issue.
No, I haven't solved it yet. I'm now trying to replace the UNet and cross-attention parts with a single CNN block; I suspect the UNet part is where the problem is.
I solved the problems above today by changing a few things:

1. The diffusers version in requirements.txt is not 0.14.2. Download the 0.10.2 source into the model folder, enter the diffusers directory in the code, and run `pip install .` to install diffusers 0.10.2.
2. unet_2D_block.py has some problems, especially the CrossAttnUp and Upsample blocks. You can refer to CrossAttnDown and DownSample to fix the forward parts of those blocks.
3. The training code also needs several changes.

I will upload my changed code later. I also tested the final result, and the decoder performs well.
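After installing from source, it may help to confirm which diffusers version actually ended up in the environment before retrying training. A minimal check (this tolerates diffusers not being installed at all):

```python
# Check the installed diffusers version without crashing if it is absent.
try:
    import diffusers
    version = diffusers.__version__
except ImportError:
    version = None  # diffusers is not installed in this environment

print(version)  # expect "0.10.2" if the source install above succeeded
```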
Hi there, thank you for opening this issue; I encountered the same problem.
Would you mind sharing the code for fixing the problem? Thank you so much!
Hi guys,
Thank you for your assistance. I resolved the issue by referring to this comment.
The problem was due to a mismatch of the UNet blocks from diffusers. In newer versions, the "CrossAttention" module has been replaced with "Attention".
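For anyone who wants to stay on a newer diffusers instead of downgrading, a small import shim can paper over the rename. This is a sketch, not verified against every release: the exact module paths (`diffusers.models.attention` for old versions, `diffusers.models.attention_processor` for new ones) are assumptions based on the rename described above.

```python
# Hypothetical compatibility shim for the CrossAttention -> Attention rename.
# The module paths below are assumptions; adjust them to your diffusers version.
try:
    # Newer diffusers: the module was renamed to Attention
    from diffusers.models.attention_processor import Attention as CrossAttention
except ImportError:
    try:
        # Older diffusers (~0.10.x): CrossAttention still exists
        from diffusers.models.attention import CrossAttention
    except ImportError:
        CrossAttention = None  # diffusers not installed; handle elsewhere
```

The rest of the code can then keep referring to `CrossAttention` regardless of which diffusers version is installed.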
Thank you all again for your help!
When running `sh ./script/train_semantic_Cityscapes.sh` to train the semantic model, I get:
```
File "/root/project_yuxuan/DatasetDM/model/segment/transformer_decoder.py", line 813, in _prepare_features
    attention_maps_8s = aggregate_attention(attention_store, 8, ("up", "mid", "down"), True, select, prompts=prompts)
File "/root/project_yuxuan/DatasetDM/model/segment/transformer_decoder.py", line 532, in aggregate_attention
    for item in attention_maps[f"{location}_{'cross' if is_cross else 'self'}"]:
KeyError: 'up_cross'
```
I found that in the paper, this corresponds to the transformer decoder part of the perception decoder.
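To see why `KeyError: 'up_cross'` appears, it helps to look at how the attention store is keyed. The sketch below mirrors the key format used in the traceback (`f"{location}_{'cross' if is_cross else 'self'}"`); the `store_key` helper and the empty-store setup are illustrative, not the actual DatasetDM code. If the attention hooks were registered against the old `CrossAttention` class name, they never fire on newer diffusers, so keys like `"up_cross"` are never populated.

```python
# Illustrative sketch of how attention maps are keyed in the store.
# store_key is a hypothetical helper matching the f-string in the traceback.
def store_key(location, is_cross):
    return f"{location}_{'cross' if is_cross else 'self'}"

# An attention store with all expected keys pre-created, so lookups
# return an empty list instead of raising KeyError when no hook fired.
attention_store = {
    store_key(loc, c): []
    for loc in ("down", "mid", "up")
    for c in (True, False)
}

maps = attention_store[store_key("up", True)]  # the key that raised KeyError
```

In other words, the KeyError is a symptom of the CrossAttention/Attention rename described earlier in this thread: fix the hook registration (or downgrade diffusers) and the `up_cross` entries appear again.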