Pretrained model does not perform as expected

Greetings!

Recently I have become interested in the Lite-DETR work. Unfortunately, several experiments with the provided pre-trained model on my computer did not perform as expected. Specifically, the AP values I get are far below those reported in the paper. I'm wondering whether there is a mistake in my experimental procedure or in the environment I am using, or whether the model I downloaded from the README is not the correct one.

I used the model named Lite-DINO-H3L1-(6+1)x1. Because I am reproducing the experiment on a single PC, the command I use for evaluation is:

python main.py --eval --config_file ./config/DINO/DINO_4scale.py --coco_path D:/coco2017 --options num_expansion=1 enc_scale=3 --resume ckpt/r50_s3ex1_50.2.pth --output_dir output/

However, the performance I get on val2017 is shown below:

IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.027
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.064
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.019
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.004
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.039
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.064
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.070
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.142
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.208
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.050
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.228
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.356

From the paper, I think a reasonable AP at IoU 0.50:0.95 should be around 45%. Table 6 shows that the AP of the Lite-DETR model can reach 46.2%. Even allowing for differences in environment, I would expect the AP from my experiment to be greater than 45%. An AP of only about 2.7% cannot be an acceptable result. This result was reproduced several times on my computer, and I don't know what could be wrong.
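One check that might narrow this down is whether the parameter names in the checkpoint match the names the model expects when it is built from the config. This is only a guess on my side. A mismatch between the config and the checkpoint can leave weights unloaded, which is one possible cause of near-zero AP, though not the only one. The sketch below is not taken from the repository. The helper `diff_state_dict_keys` and the parameter names in the demo are hypothetical and only illustrate the comparison.

```python
from typing import Iterable, NamedTuple


class KeyDiff(NamedTuple):
    missing: list      # expected by the model, absent from the checkpoint
    unexpected: list   # present in the checkpoint, unknown to the model


def diff_state_dict_keys(model_keys: Iterable[str], ckpt_keys: Iterable[str]) -> KeyDiff:
    """Compare parameter names from a model against those in a checkpoint."""
    model_set, ckpt_set = set(model_keys), set(ckpt_keys)
    return KeyDiff(
        missing=sorted(model_set - ckpt_set),
        unexpected=sorted(ckpt_set - model_set),
    )


if __name__ == "__main__":
    # Made-up parameter names, purely for illustration (not from Lite-DETR).
    model_keys = ["backbone.conv1.weight", "encoder.layers.0.weight", "class_embed.bias"]
    ckpt_keys = ["backbone.conv1.weight", "encoder.layer.0.weight", "class_embed.bias"]
    diff = diff_state_dict_keys(model_keys, ckpt_keys)
    print("missing from checkpoint:", diff.missing)
    print("unexpected in checkpoint:", diff.unexpected)
```

With PyTorch, the two inputs would presumably come from `model.state_dict().keys()` and from the keys of the dictionary returned by `torch.load(path, map_location="cpu")`. The weights may be nested under an entry such as `"model"`, which I have not verified for this checkpoint. If both lists come back empty, the problem is more likely elsewhere, for example in the dataset path or the evaluation options.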

The computer I use has an AMD Ryzen 9 7950X3D CPU and an RTX 4090 GPU.

Looking forward to your reply!

Best regards.
