GluttonK closed 1 year ago
Hello,
Here is a link with the best checkpoints for the two resnet-dcls runs from the paper, along with the configs, summary and args used at that time. We used timm and a light training config for ResNet (the A3 config described in this paper).
Note that these checkpoints use the v0 version of DCLS, which relies on bilinear interpolation. We have since found that a better interpolation (Gaussian) can be used instead; please refer to this paper. If you want to investigate further, we encourage you to retrain using this newer interpolation technique. More details are provided in this short blog post, along with a script to replace all your convolutions with DCLS ones (see Medium story).
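To illustrate the difference between the two interpolations mentioned above, here is a minimal 1D sketch in plain Python. It is not the DCLS library code: the function names, the kernel size of 7, the position 2.25, and the fixed `sigma=0.5` are all assumptions chosen for illustration. It only shows how a single kernel element at a fractional position is spread over integer taps under each scheme.

```python
import math


def bilinear_weights(position, size):
    """Spread one kernel element at a fractional position onto its two nearest integer taps."""
    weights = [0.0] * size
    left = math.floor(position)
    frac = position - left
    if 0 <= left < size:
        weights[left] += 1.0 - frac
    if 0 <= left + 1 < size:
        weights[left + 1] += frac
    return weights


def gaussian_weights(position, size, sigma=0.5):
    """Spread the same element onto every tap with a normalized Gaussian centered at `position`.

    `sigma` is a hypothetical fixed value here; treat it as an assumption of this sketch.
    """
    raw = [math.exp(-0.5 * ((i - position) / sigma) ** 2) for i in range(size)]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical example: one element at position 2.25 inside a dilated kernel of size 7.
bilinear = bilinear_weights(2.25, size=7)
gaussian = gaussian_weights(2.25, size=7)
```

In this sketch the bilinear version touches only two taps, while the Gaussian version gives every tap a nonzero (though quickly decaying) weight, so the position receives a gradient signal from more than its two immediate neighbors.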
We abandoned the research on architectures that do not use depthwise separable convolutions, because implementing a dilated kernel of size 7 or higher is prohibitively slow unless we use depthwise separable convolutions together with the depthwise implicit GEMM method (link to the code of this method and original paper). Maybe you could investigate ResNeXt or ConvNeXt (as we did in our paper), as they use depthwise separable convs.
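As a rough back-of-the-envelope illustration of why the depthwise case is so much lighter, the sketch below counts the entries of the dilated kernel that has to be built for one layer. The channel width of 256 is a hypothetical value, and the count says nothing about actual runtime, which depends on the implementation.

```python
def dense_kernel_elements(in_channels, out_channels, kernel_size):
    """Entries in the dilated kernel of a standard (non-grouped) 2D convolution."""
    return in_channels * out_channels * kernel_size * kernel_size


def depthwise_kernel_elements(channels, kernel_size):
    """Entries in the dilated kernel of a depthwise 2D convolution (one filter per channel)."""
    return channels * kernel_size * kernel_size


channels = 256  # hypothetical layer width, chosen only for illustration
dense = dense_kernel_elements(channels, channels, kernel_size=7)
depthwise = depthwise_kernel_elements(channels, kernel_size=7)
ratio = dense // depthwise
```

With these assumed numbers, the dense kernel has as many times more entries as there are channels, which gives a sense of how the cost of building large dilated kernels grows outside the depthwise setting.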
Thank you very much, it helps a lot!
You're welcome,
I'm closing this issue, but feel free to reopen it if you have any questions or comments.
In your paper, I saw a result for ResNet50-dcls given alongside the ConvNeXt-dcls ones, but I couldn't find the trained model. If I need to train it myself, could you share the settings you used? Thank you.