Closed: Saafke closed this issue 3 years ago
Please follow our code; the epoch numbers in the paper may be wrong. We basically adopt the original Matterport code's training protocol without many customizations.
Best, He
On Thu, Nov 12, 2020 at 3:03 AM Xavier Weber notifications@github.com wrote:
Hi He,
I had a question about the length of the training process. In the paper you mention:
In the first stage of training, we freeze the ResNet50 weights and only train the layers in the heads, the RPN and FPN for 10K iterations. In the second stage, we freeze ResNet50 layers below level 4 and train for 3K iterations. In the final stage, we freeze ResNet50 layers below level 3 for another 70K iterations. When switching to each stage, we decrease the learning rate by a factor of 10.
From this it seems you only perform 70 + 10 + 3 = 83K iterations, which means that, at a batch size of 2, you only train on 83K × 2 = 166K images, once. This is a bit confusing to me, as your training dataset is larger than that, at 275K images.
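As a quick sanity check of that arithmetic (a back-of-the-envelope sketch; the 275K figure is the training-set size mentioned above):

```python
# Sketch: images seen under the paper's schedule vs. dataset size.
iterations = (10 + 3 + 70) * 1000  # 83K iterations across the three stages
batch_size = 2
dataset_size = 275_000             # training images, per the paper

images_seen = iterations * batch_size
print(images_seen)                 # 166000
print(images_seen / dataset_size)  # ~0.60 -> less than one full pass
```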
However, in the code I can see that you train for (100 + 130 + 400) × 1K steps = 630K iterations:
```python
GPU_COUNT = 1
IMAGES_PER_GPU = 2

# Use a small epoch since the data is simple
STEPS_PER_EPOCH = 1000
```

```python
print("Training network heads")
model.train(dataset_train, dataset_val, learning_rate=config.LEARNING_RATE,
            epochs=100,  # <========== 100K
            layers_name='heads')

# Training - Stage 2
# Finetune layers from ResNet stage 4 and up
print("Training Resnet layer 4+")
model.train(dataset_train, dataset_val, learning_rate=config.LEARNING_RATE/10,
            epochs=130,  # <========== 130K
            layers_name='4+')

# Training - Stage 3
# Finetune layers from ResNet stage 3 and up
print("Training Resnet layer 3+")
model.train(dataset_train, dataset_val, learning_rate=config.LEARNING_RATE/100,
            epochs=400,  # <========== 400K
            layers_name='all')
```
So I was wondering: how many iterations did you run for the experiments in the paper, and how many would you recommend for good performance?
What kind of equipment do you use for training, and how much GPU memory is needed at a minimum?
We use an Nvidia GeForce Titan Xp; 12 GB should be fine.
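For context, in Matterport-style configs the effective batch size follows from those two settings; a minimal standalone sketch (the arithmetic mirrors the upstream Config class, not the repo's exact code):

```python
# Sketch: effective batch size from the config values quoted earlier.
GPU_COUNT = 1        # one Titan Xp
IMAGES_PER_GPU = 2   # 2 images fit in ~12 GB

# Matterport-style configs derive the batch size as images_per_gpu * gpu_count:
BATCH_SIZE = IMAGES_PER_GPU * GPU_COUNT
print(BATCH_SIZE)    # 2
```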