Closed by lunachy 3 years ago
Yes, we tried that too, but the performance did not improve significantly (in some cases it even decreased), probably due to overfitting. We simply set both to 2 from the start and found the overall results to be good :joy:, so we didn't carefully tune the number of transformer layers.
According to the paper, I guess you tested this on the TuSimple dataset.
Yes, the TuSimple validation dataset.
Did you run a grid search over the enc_layers and dec_layers parameters? According to DETR, more encoder and decoder layers are beneficial. Maybe enc_layers and dec_layers should be set to at least 3.
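
For anyone who wants to check this themselves, here is a minimal sketch of such a grid search. It is not taken from this repository: `grid_search_layers`, `placeholder_evaluate`, and the candidate values are hypothetical names and numbers chosen for illustration. The scoring function is a stand-in that would need to be replaced with code that trains the model with the given `enc_layers` and `dec_layers` and returns accuracy on the TuSimple validation set.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Tuple


def grid_search_layers(
    evaluate: Callable[[int, int], float],
    enc_options: Sequence[int] = (2, 3, 4, 6),
    dec_options: Sequence[int] = (2, 3, 4, 6),
) -> Tuple[Tuple[int, int], Dict[Tuple[int, int], float]]:
    """Score every (enc_layers, dec_layers) pair and return the best pair
    together with the full score table (higher score is better)."""
    scores: Dict[Tuple[int, int], float] = {}
    for enc_layers, dec_layers in product(enc_options, dec_options):
        scores[(enc_layers, dec_layers)] = evaluate(enc_layers, dec_layers)
    best = max(scores, key=scores.get)
    return best, scores


def placeholder_evaluate(enc_layers: int, dec_layers: int) -> float:
    # ASSUMPTION: dummy score so the sketch runs end to end. It is NOT a
    # measured result. Replace with real training + validation accuracy.
    return -abs(enc_layers - 2) - abs(dec_layers - 2)


if __name__ == "__main__":
    best, scores = grid_search_layers(placeholder_evaluate)
    for (enc_layers, dec_layers), score in sorted(scores.items()):
        print(f"enc_layers={enc_layers} dec_layers={dec_layers} score={score}")
    print("best setting:", best)
```

Each grid point is a full training run, so in practice it may be worth starting from a small grid (for example 2 and 3 for both parameters) and averaging over a few seeds, since overfitting on a small dataset can make single-run differences noisy.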