Closed VolodymyrAhafonov closed 11 months ago
Thank you for your attention, there is indeed a bug in the implementation here. I need to retrain the model to assess the performance. The current implementation seems to act as a strong regularization layer.
Thank you for your quick reply. As I understand from your answer, the currently shared model was trained with this bug? If so, could you please train a model without this bug and share it?
Sure, I will fix this bug, retrain, and release the model. However, you can still use the current model for now, and it also works well. BTW, you may use this branch https://github.com/QLYoo/AEMatter/tree/PT20.
The issue has been fixed and I have provided a ckpt trained with the fixed code. The new ckpt achieves better performance on Adobe Composition-1K.
Hello. Really good and interesting paper. Thank you for your work!
Recently I've experimented with your model and noticed an interesting detail in the code of the `AEALblock` class. It seems to me that `x1_` and `x2_` are prepared in batch-first manner, but they are then processed by an `nn.MultiheadAttention` module that is always initialized in the default `batch_first=False` mode, so `(b h1)` from `x1_` and `(b w1)` from `x2_` are treated as the sequence dimension. This seems wrong to me. Could you please clarify whether this is a bug? Thanks in advance.
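The mix-up described above can be reproduced in isolation. The sketch below (hypothetical shapes, not taken from the AEMatter code) shows that `nn.MultiheadAttention` with its default `batch_first=False` silently treats dimension 0 of a batch-first tensor as the sequence, so attention mixes information across batch elements; the output shape is unchanged, which is why the bug is easy to miss:

```python
import torch
import torch.nn as nn

embed_dim, num_heads = 8, 2

# Default construction: batch_first=False, expects (seq, batch, embed)
mha = nn.MultiheadAttention(embed_dim, num_heads)

# Hypothetical batch-first input: (batch=4, seq=16, embed=8)
x = torch.randn(4, 16, embed_dim)

# The module interprets dim 0 as the sequence, so the batch of 4 is
# treated as a sequence of length 4 and attention runs across batch
# elements instead of across the 16 positions. The returned tensor
# still has shape (4, 16, 8), so nothing fails loudly.
out, _ = mha(x, x, x)

# One possible fix: construct the module with batch_first=True
# (available since PyTorch 1.9), so dim 0 is correctly read as batch.
mha_bf = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
out_bf, _ = mha_bf(x, x, x)
```

Alternatively, the input can be transposed to `(seq, batch, embed)` before the call and transposed back afterwards, which keeps the module construction unchanged.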