I've adapted ConvNeXt V2 for time series signal analysis by setting the CNN input height to 1. After pre-training and visualizing the reconstructions, I found that the predicted values look largely random. Compared with ViT-MAE, the reconstruction quality is noticeably worse. However, during fine-tuning, both the training and validation accuracy and loss are better than ViT-MAE's, while performance on the test set falls slightly short. Should I use a larger model variant for pre-training? Do you have any other suggestions?
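For reference, this is roughly what I mean by "setting the CNN height to 1" (a minimal sketch with illustrative shapes and channel counts, not my exact code): the 1D signal is viewed as a height-1 "image" so the stock 2D convolutions in the ConvNeXt blocks only slide along the time axis.

```python
import torch
import torch.nn as nn

# Treat a 1D signal as a (B, C, 1, L) "image" so 2D convs act only along time.
signal = torch.randn(8, 1, 1024)   # (batch, channels, length)
x = signal.unsqueeze(2)            # (batch, channels, 1, length)

# A 2D conv with kernel height 1 is equivalent to a 1D conv over the time axis.
# Patch-embed-like stem: patch size 4 along time, 40 output dims (atto-sized).
stem = nn.Conv2d(1, 40, kernel_size=(1, 4), stride=(1, 4))
y = stem(x)
print(y.shape)                     # (8, 40, 1, 256)
```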
PS:
1. ViT-MAE uses 16 transformer layers with a patch size of 4; ConvNeXt atto: depths=[2, 2, 6, 2], dims=[40, 80, 160, 320].
2. I haven't used any built-in training utilities from timm; I've implemented a simple training loop with EMA, AdamW, and cosine decay.
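My training loop is essentially the following sketch (model, batch shapes, and hyperparameters here are placeholders, not my actual configuration): AdamW, cosine LR decay, and an exponential moving average (EMA) of the weights maintained alongside the trained model.

```python
import copy
import torch
import torch.nn as nn

model = nn.Linear(16, 2)                  # stand-in for the actual network
ema_model = copy.deepcopy(model)          # EMA copy, updated but never trained
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.05)
steps = 100
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=steps)
ema_decay = 0.999

for step in range(steps):
    x = torch.randn(32, 16)               # placeholder batch
    target = torch.randint(0, 2, (32,))
    loss = nn.functional.cross_entropy(model(x), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                      # cosine decay toward 0 over `steps`
    # EMA update: ema = decay * ema + (1 - decay) * current
    with torch.no_grad():
        for ema_p, p in zip(ema_model.parameters(), model.parameters()):
            ema_p.mul_(ema_decay).add_(p, alpha=1 - ema_decay)
```

(timm does ship an EMA helper, `timm.utils.ModelEmaV2`, which does the same update; I just wrote it out by hand.)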
@heng3366: Sorry, it's not related to the posted question, but could you please let me know which versions of Ubuntu, G++, and CUDA you used to install MinkowskiEngine?
Thanks anyway!