[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
This run uses the SparK network with the final projection layer of ConvNeXt replaced by a UNet decoder. The training loss is roughly 0.16, and the visualization results are as follows:
This run uses ConvNeXt V2; the loss is roughly 0.17.
I suspect the UNet decoder lets the model learn shortcuts for generating the images, since the training loss is almost the same in both settings.
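The decoder swap described above can be sketched roughly as follows. This is a minimal illustration, not the actual SparK code: the `UNetDecoder` class, the ConvNeXt-T channel dimensions (96/192/384/768 at strides 4/8/16/32), and the transposed-conv fusion scheme are all assumptions for the sketch.

```python
import torch
import torch.nn as nn

class UNetDecoder(nn.Module):
    """Hypothetical UNet-style decoder replacing a 1x1 projection head.

    Takes the hierarchical feature pyramid of a ConvNeXt-like encoder
    (shallow -> deep) and upsamples back to pixel space, fusing skip
    connections at each scale. Channel sizes assume ConvNeXt-T.
    """

    def __init__(self, enc_channels=(96, 192, 384, 768), out_channels=3):
        super().__init__()
        chans = list(reversed(enc_channels))  # deepest first: 768, 384, 192, 96
        self.ups = nn.ModuleList()
        self.fuses = nn.ModuleList()
        for i in range(len(chans) - 1):
            # upsample the deeper feature 2x, then fuse with the skip connection
            self.ups.append(nn.ConvTranspose2d(chans[i], chans[i + 1], 2, stride=2))
            self.fuses.append(nn.Sequential(
                nn.Conv2d(chans[i + 1] * 2, chans[i + 1], 3, padding=1),
                nn.GELU(),
            ))
        # final 4x upsample back to pixels (ConvNeXt's stem has stride 4)
        self.head = nn.Sequential(
            nn.ConvTranspose2d(chans[-1], chans[-1] // 2, 4, stride=4),
            nn.Conv2d(chans[-1] // 2, out_channels, 3, padding=1),
        )

    def forward(self, feats):
        # feats: list of encoder feature maps, ordered shallow -> deep
        feats = list(reversed(feats))
        x = feats[0]
        for up, fuse, skip in zip(self.ups, self.fuses, feats[1:]):
            x = up(x)
            x = fuse(torch.cat([x, skip], dim=1))  # UNet-style skip fusion
        return self.head(x)

if __name__ == "__main__":
    # ConvNeXt-T style pyramid for a 224x224 input
    feats = [torch.randn(1, 96, 56, 56), torch.randn(1, 192, 28, 28),
             torch.randn(1, 384, 14, 14), torch.randn(1, 768, 7, 7)]
    img = UNetDecoder()(feats)
    print(tuple(img.shape))  # reconstructed image, (1, 3, 224, 224)
```

The skip connections are exactly what could enable the "shortcut" behavior noted above: the decoder can copy high-resolution detail from shallow encoder features instead of forcing the deep representation to encode it, which would explain a similar training loss without a correspondingly stronger encoder.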