Closed atonyo11 closed 7 months ago
By the way, I want to ask about baseline.yaml file.
```yaml
decode_mode: beam
model_args:
  num_classes: 1296
  c2d_type: resnet18  # resnet18, mobilenet_v2, squeezenet1_1, shufflenet_v2_x1_0, efficientnet_b1, mnasnet1_0, regnet_y_800mf, vgg16_bn, vgg11_bn, regnet_x_800mf, regnet_x_400mf, densenet121, regnet_y_1_6gf
  conv_type: 2
  use_bn: 1
```
Is `num_classes: 1296` correct as the default? I see in the paper that the phoenix2014 dataset has "a vocabulary of 1295 signs".
```yaml
feeder_args:
  mode: 'train'
  datatype: 'video'
  num_gloss: -1
  drop_ratio: 1.0
  frame_interval: 1
  image_scale: 1.0  # 0-1 represents a ratio, >1 represents an absolute value
  input_size: 224
```
`input_size` is 224, so why do we have to convert the images to 256×256? Thanks.
I encountered the same issue: there is about a 1% shortfall in the final training result. How can I resolve this? Here is the log: log.txt
`num_classes` is actually calculated adaptively at line 51 of main.py, so this hyperparameter in the YAML file has no effect. As for the input size, a 224×224 patch is randomly cropped from the 256×256 input during training.
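To illustrate the crop described above, here is a minimal sketch of a 224×224 random crop from a 256×256 frame. This is written in plain NumPy for clarity; the repo's actual augmentation code may differ, and `random_crop` is a hypothetical helper name.

```python
import numpy as np

def random_crop(frame: np.ndarray, size: int = 224, rng=None) -> np.ndarray:
    """Randomly crop a size×size patch from an H×W×C frame.

    Illustrative sketch only: the training pipeline first resizes frames
    to 256×256, then takes a random 224×224 crop like this one.
    """
    rng = rng or np.random.default_rng()
    h, w = frame.shape[:2]
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    return frame[top:top + size, left:left + size]

frame = np.zeros((256, 256, 3), dtype=np.uint8)  # a resized 256×256 frame
crop = random_crop(frame)                        # shape (224, 224, 3)
```

At test time, pipelines typically replace the random crop with a fixed center crop so evaluation is deterministic.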
As for the training discrepancy, there is no guarantee that you will get exactly the same results across different platforms; results are affected by hardware changes, software versions, and so on. Usually, though, the gap is within 1%. In fact, you can even get different results across two runs on the same machine. I haven't found an effective way to eliminate this.
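A common partial mitigation for run-to-run variance is to seed every random number generator at startup. This sketch uses only the standard `random` and NumPy RNGs (a hypothetical `seed_everything` helper, not code from this repo); a PyTorch run would additionally need `torch.manual_seed(seed)` and `torch.backends.cudnn.deterministic = True`, and even then some CUDA kernels remain nondeterministic across hardware.

```python
import random
import numpy as np

def seed_everything(seed: int = 0) -> None:
    # Hypothetical helper: fix the Python and NumPy global RNGs so two
    # runs on the same platform draw identical random numbers.
    random.seed(seed)
    np.random.seed(seed)

seed_everything(42)
a = np.random.rand(3)
seed_everything(42)
b = np.random.rand(3)
assert np.array_equal(a, b)  # same seed, same platform -> identical draws
```

This only makes runs reproducible on one platform; it does not close the gap between different GPUs or library versions, which is the variance described above.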
I got it. Thank you!
Hi. I tried running the default code to train on the phoenix2014 dataset. This is the log file: log.txt
However, I cannot get the same results as the paper (about 1% worse than reported).
What parameters should I change to reproduce? Thanks!