Closed by bollossom 8 months ago
Thank you! 200 epochs of pretraining can be a bit insufficient. You may need to adjust some hyperparameters, like doubling the learning rate or decreasing the drop path rate.
For hybrid CNN-Transformer backbones, SparK can be applied directly, because SparK does not change the model architecture or parameters. You can refer to our SparK code to use sparse layers in the CNN part (as we do in https://github.com/keyu-tian/SparK/blob/main/pretrain/encoder.py#L165) and use a multi-scale decoder to reconstruct images (as in https://github.com/keyu-tian/SparK/blob/main/pretrain/decoder.py).
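To make the idea concrete, here is a toy NumPy sketch of the "sparse layer" trick referenced above: run an ordinary dense convolution, then re-apply the binary patch mask so that masked positions stay zero and never leak information into the visible regions. This is a minimal illustration of the masking principle, not the actual SparK implementation (which wraps real `nn.Conv2d`/norm layers); all function and variable names here are made up for the example.

```python
import numpy as np

def sparse_conv3x3(x, w, mask):
    """Emulate a SparK-style 'sparse' conv: dense 3x3 convolution
    followed by re-masking, so masked positions remain exactly zero.
    x: (H, W) feature map, w: (3, 3) kernel, mask: (H, W) of {0, 1}."""
    H, W = x.shape
    padded = np.pad(x, 1)               # zero padding, 'same' output size
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * w)
    return out * mask                    # zero out the masked patches again

# Toy example: 4x4 feature map with the right half masked out.
x = np.arange(16, dtype=float).reshape(4, 4)
mask = np.zeros((4, 4))
mask[:, :2] = 1.0                        # only the left half is visible
y = sparse_conv3x3(x * mask, np.ones((3, 3)), mask)
```

Because the mask is re-applied after every layer, no architectural change is needed; the same pretrained weights can later be used densely at fine-tuning time.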
OK, thank you very much!
Hello, I found during pretraining that, using the sparse convolution you provide, it takes about 100 minutes to train one epoch. Is there anything I can do to speed this up? The training setup: a model of about 70M total size, trained on 8 A100s with 60G memory. For example, should I use the sparse convolution implementation in MinkowskiEngine, or write a custom CUDA operator?
@bollossom can you provide details on the model type, dataset size, input size, batch size, and GPU utilization? BTW, what does 60G video memory mean?
Generally, I believe using MinkowskiEngine won't speed things up much, because 1) the masked images are much denser than 3D point clouds, and 2) it lacks optimized kernels for sparse depthwise convolution, sparse group norm, etc.
OK, and first let me apologize for my earlier incorrect description. Our model is MC-MAE [base] with SparK, batch size 512, pretrained on ImageNet on 8 GPUs with 64 GB of memory each. Currently one pretraining epoch takes about 100 minutes. Is there a good way to speed this up?
This training speed suggests there is a bug; you could debug it, e.g., check GPU utilization and memory usage, or use the PyTorch profiler to log where it is slow. For reference: ConvNeXt-Base with bs=4096 on 32 A100s takes about 5 minutes per epoch.
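The debugging advice above boils down to profiling one training step to see where the time goes. As a generic, self-contained illustration, the sketch below uses Python's stdlib `cProfile` on a dummy loop with hypothetical `data_loading`/`forward_backward` stand-ins; in a real PyTorch run you would use `torch.profiler.profile` instead to get per-operator CPU/GPU timings, but the workflow (wrap a few steps, sort by cumulative time, read off the hot spot) is the same.

```python
import cProfile
import io
import pstats
import time

def data_loading():
    """Stand-in for a (possibly slow) dataloader step."""
    time.sleep(0.05)

def forward_backward():
    """Stand-in for the model's forward/backward compute."""
    time.sleep(0.01)

def train_step():
    data_loading()
    forward_backward()

# Profile a few steps and print the top entries by cumulative time.
prof = cProfile.Profile()
prof.enable()
for _ in range(3):
    train_step()
prof.disable()

buf = io.StringIO()
pstats.Stats(prof, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)
```

In a report like this, a `data_loading` entry dominating cumulative time would point at the input pipeline (workers, I/O) rather than the sparse convolutions themselves.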
OK, thank you for the careful guidance. For SparK-ResNet-101, bs=4096 on 32 A100s, roughly how many minutes does one pretraining epoch take?
About the same as ConvNeXt-Base.
OK, thank you.
Hello, what a nice job SparK is!