wangf3014 / Adventurer


Comparison with the ResNet (Any CNN) Architecture #1

Open sivaji123256 opened 1 month ago

sivaji123256 commented 1 month ago

Hi @wangf3014, thanks for the great work. I have gone through the paper, and the results seem very promising and exciting to implement. I am wondering whether you have had a chance to compare the performance metrics with a CNN architecture such as ResNet. If so, could you please share them as well? Thanks in advance.

wangf3014 commented 1 month ago

Thanks for your interest in our work. ResNet-50 and ResNet-101 reach 76.2% and 77.4% ImageNet top-1 accuracy, respectively. For the ADE20k semantic segmentation and the COCO object detection and instance segmentation tasks, we report the ResNet-50/101 results in the paper. Let me know if any other information would help.

sivaji123256 commented 1 month ago

@wangf3014, thanks for your prompt response.

1. What about a comparison of training time, inference time, throughput, and memory usage? In some of my experiments Vim performs the same as ResNet, even though its authors claim it is better than CNN architectures.
2. I couldn't follow the training at different resolutions and the corresponding stage names (pretraining, train, and finetune), as I haven't implemented the Mamba-Reg model before. I am also trying to train and evaluate on a custom dataset. Do I need to follow the same procedure? I am thinking of training on 1024*1024 images. Could you please guide me on this?

Thanks in advance.

wangf3014 commented 1 month ago

I just tested the training speed and memory of ResNet-50 and ResNet-152:

| Model | Throughput | Memory | Accuracy |
| --- | --- | --- | --- |
| ResNet-50 | 1630 | 6.6G | 76.2% |
| ResNet-152 | 784 | 12.5G | 78.3% |

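
For anyone who wants to collect comparable throughput numbers on their own hardware, here is a minimal sketch of a timing helper. It uses only the Python standard library; `measure_throughput`, `step_fn`, and `dummy_step` are hypothetical names that are not part of this repository, and this is not necessarily how the figures above were measured.

```python
import time


def measure_throughput(step_fn, batch_size, warmup=3, iters=10):
    """Return samples processed per second.

    step_fn is a hypothetical zero-argument callable that runs one batch
    (for example, one forward/backward pass of a model).
    """
    if iters <= 0:
        raise ValueError("iters must be positive")

    # Warm-up runs are excluded from timing so one-off setup costs
    # (allocation, caching, kernel compilation) do not skew the result.
    for _ in range(warmup):
        step_fn()

    start = time.perf_counter()
    for _ in range(iters):
        step_fn()
    elapsed = time.perf_counter() - start

    return batch_size * iters / elapsed


def dummy_step():
    # Stand-in workload (an assumption for illustration only).
    # On a GPU, a real step_fn should wait for the device to finish
    # before returning, otherwise only the kernel launch is timed.
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    rate = measure_throughput(dummy_step, batch_size=128)
    print(f"{rate:.1f} samples/s")
```

Running each model through the same helper with the same batch size and input resolution keeps the comparison fair, since throughput and memory both depend strongly on those two settings.
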
wangf3014 commented 1 month ago

You don't have to follow the multi-stage training scheme for your custom data. You can either train from scratch at 1024*1024 or finetune our pretrained models on your custom dataset.