keyu-tian / SparK

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; Pytorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
https://arxiv.org/abs/2301.03580
MIT License

Worse generated images compared to ConvNeXt V2 #90

Open powermano opened 1 month ago

powermano commented 1 month ago

This uses the SparK network, with the final projection layer of ConvNeXt replaced by a UNet decoder. The loss is roughly 0.16, and the visualization results are as follows: test_epoch_30

This uses ConvNeXt V2; the loss is roughly 0.17: test_epoch_60

I think using a UNet decoder lets the model learn shortcuts for generating the images, since the training loss is almost the same for both models.
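To make the loss comparison concrete, here is a minimal sketch of a masked reconstruction loss. It assumes that, as is usual in MAE-style pretraining, the loss is averaged over masked patches only. The function name `masked_mse` and the toy data are hypothetical and are not taken from the SparK or ConvNeXt V2 code. It only illustrates one way two reconstructions can have the same loss and still look different.

```python
def masked_mse(pred, target, mask):
    """Mean squared error averaged over masked patches only.

    pred, target: lists of patches, each patch a list of floats.
    mask: list of 0/1 flags, 1 = patch was masked (hidden from the encoder).
    """
    total, count = 0.0, 0
    for p, t, m in zip(pred, target, mask):
        if m:  # visible patches do not contribute to the loss
            total += sum((a - b) ** 2 for a, b in zip(p, t)) / len(p)
            count += 1
    return total / max(count, 1)


# Hypothetical toy data: 4 patches of 2 pixels each; patches 1 and 3 are masked.
target = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
mask = [0, 1, 0, 1]

# Two hypothetical reconstructions with the same error on masked patches
# but very different quality on the visible patches.
recon_a = [[0.0, 0.0], [0.6, 0.6], [0.0, 0.0], [0.6, 0.6]]
recon_b = [[0.9, 0.9], [0.6, 0.6], [0.9, 0.9], [0.6, 0.6]]

loss_a = masked_mse(recon_a, target, mask)
loss_b = masked_mse(recon_b, target, mask)
```

Under this assumption, both reconstructions score the same, so a similar loss value by itself does not say which model produces the better-looking full image.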