sanmin0312 / P2D

[ICCV 2023] Predict to Detect: Prediction-guided 3D Object Detection using Sequential Images

assert set(self.pred_boxes.sample_tokens) == set(self.gt_boxes.sample_tokens), \ AssertionError: Samples in split doesn't match samples in predictions. #3

Open lsy19103118 opened 7 months ago

sanmin0312 commented 6 months ago

Could you provide additional context or details?

lsy19103118 commented 6 months ago

Thank you very much for your reply!

I ran the following four commands:

1: python ./exps/nuscenes/p2d/p2d_deform_lss_r50_256x704_128x128_24e_3key_ema.py --amp_backend native -b 8 --gpus 2 (nuscenes_infos_trainval.pkl)

2: python ./exps/nuscenes/p2d/p2d_deform_lss_r50_256x704_128x128_24e_3key.py --ckpt_path weights/P2D_res50.pth -b 8 --gpus 2 (nuscenes_infos_trainval.pkl)

3: python ./exps/nuscenes/p2d/p2d_deform_lss_r50_256x704_128x128_24e_3key.py --ckpt_path weights/P2D_res50.pth -e -b 8 --gpus 2 (nuscenes_infos_trainval.pkl)

4: python ./exps/nuscenes/p2d/p2d_deform_lss_r50_256x704_128x128_24e_3key.py --ckpt_path weights/P2D_res50.pth -p -b 8 --gpus 2 (nuscenes_infos_trainval.pkl)

Only command 4 works properly; commands 1, 2, and 3 all report the following error: File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/nuscenes/eval/detection/evaluate.py", line 85, in __init__ assert set(self.pred_boxes.sample_tokens) == set(self.gt_boxes.sample_tokens), \ AssertionError: Samples in split doesn't match samples in predictions.

For command 1, I found that print(len(self.pred_boxes.sample_tokens)) = 3008, while print(len(self.gt_boxes.sample_tokens)) = 6019.

At the same time I modified:

info_path in scripts/gen_depth_gt.py to 'data/nuScenes/nuscenes_infos_trainval.pkl' (previously 'data/nuScenes/nuscenes_infos_val.pkl'), in connection with FileNotFoundError: [Errno 2] No such file or directory: 'data/nuScenes/depth_gt/n015-2018-09-27-15-33-17+0800CAM_FRONT_LEFT1538033983154844.jpg.bin' (see Issue #4).
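Written out, that edit is just the following one-line change (file name, variable name, and paths exactly as reported above; the rest of the script is unchanged):

# scripts/gen_depth_gt.py -- only the changed line is shown
# info_path = 'data/nuScenes/nuscenes_infos_val.pkl'       # original value
info_path = 'data/nuScenes/nuscenes_infos_trainval.pkl'    # value used now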

How can I fix this so that training and validation run normally? Did you modify the nuscenes or pytorch_lightning source code, or is one of my parameters wrong? I suspect the cause might be that in self.gt_boxes = load_gt(self.nusc, self.eval_set, DetectionBox, verbose=verbose), self.eval_set is "val", whereas we need "train" and "test".
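For reference, the comparison that the failing assert makes can be reproduced standalone with a rough debugging sketch like the one below (this is not code from P2D; the dataroot, results path, and max_boxes_per_sample value are assumptions that need adjusting):

from nuscenes import NuScenes
from nuscenes.eval.common.loaders import load_gt, load_prediction
from nuscenes.eval.detection.data_classes import DetectionBox

# Load predictions and GT the same way NuScenesEval does, then diff the
# sample-token sets that the assert on line 85 of evaluate.py compares.
nusc = NuScenes(version='v1.0-trainval', dataroot='data/nuScenes', verbose=False)
pred_boxes, _ = load_prediction(
    './outputs/p2d_deform_lss_r50_256x704_128x128_24e_3key_ema/results_nusc.json',
    500, DetectionBox, verbose=True)   # 500 = assumed max_boxes_per_sample
gt_boxes = load_gt(nusc, 'val', DetectionBox, verbose=True)

missing = set(gt_boxes.sample_tokens) - set(pred_boxes.sample_tokens)
print(len(pred_boxes.sample_tokens), 'predicted samples')    # 3008 in the run above
print(len(gt_boxes.sample_tokens), 'ground-truth samples')   # 6019 in the run above
print(len(missing), 'val samples have no predictions at all')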

I wish you all the best with your research!

The complete error message is as follows: /home/m840-02/anaconda3/envs/p2d/bin/python /home/m840-02/lsy/P2D-master/p2d/exps/nuscenes/p2d/p2d_deform_lss_r50_256x704_128x128_24e_3key_ema.py --amp_backend native -b 8 --gpus 2 Global seed set to 0 2024-04-07 20:39:02,698 - mmcv - INFO - initialize SECONDFPN with init_cfg [{'type': 'Kaiming', 'layer': 'ConvTranspose2d'}, {'type': 'Constant', 'layer': 'NaiveSyncBatchNorm2d', 'val': 1.0}] 2024-04-07 20:39:02,709 - mmcv - INFO - deblocks.0.0.weight - torch.Size([128, 256, 4, 4]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:02,709 - mmcv - INFO - deblocks.0.1.weight - torch.Size([128]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:02,710 - mmcv - INFO - deblocks.0.1.bias - torch.Size([128]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:02,710 - mmcv - INFO - deblocks.1.0.weight - torch.Size([128, 512, 2, 2]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:02,710 - mmcv - INFO - deblocks.1.1.weight - torch.Size([128]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:02,710 - mmcv - INFO - deblocks.1.1.bias - torch.Size([128]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:02,710 - mmcv - INFO - deblocks.2.0.weight - torch.Size([1024, 128, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:02,710 - mmcv - INFO - deblocks.2.1.weight - torch.Size([128]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:02,710 - mmcv - INFO - deblocks.2.1.bias - torch.Size([128]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:02,710 - mmcv - INFO - deblocks.3.0.weight - torch.Size([2048, 128, 2, 2]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:02,710 - mmcv - INFO - deblocks.3.1.weight - torch.Size([128]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:02,710 - mmcv - INFO - deblocks.3.1.bias - torch.Size([128]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:02,718 - mmcv - INFO - initialize ResNet with init_cfg {'type': 'Pretrained', 'checkpoint': 'torchvision://resnet50'} 2024-04-07 20:39:02,718 - mmcv - INFO - load model from: torchvision://resnet50 2024-04-07 20:39:02,719 - mmcv - INFO - load checkpoint from torchvision path: torchvision://resnet50 2024-04-07 20:39:02,822 - mmcv - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: fc.weight, fc.bias

2024-04-07 20:39:03,008 - mmcv - INFO - initialize ResNet with init_cfg [{'type': 'Kaiming', 'layer': 'Conv2d'}, {'type': 'Constant', 'val': 1, 'layer': ['_BatchNorm', 'GroupNorm']}] 2024-04-07 20:39:03,135 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:03,136 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:03,137 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:03,138 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:03,140 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:03,142 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:03,146 - mmcv - INFO - conv1.weight - torch.Size([160, 160, 7, 7]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,147 - mmcv - INFO - bn1.weight - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,147 - mmcv - INFO - bn1.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,147 - mmcv - INFO - layer1.0.conv1.weight - torch.Size([160, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,147 - mmcv - INFO - layer1.0.bn1.weight - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,147 - mmcv - INFO - layer1.0.bn1.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,147 - mmcv - INFO - layer1.0.conv2.weight - torch.Size([160, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,147 - mmcv - INFO - layer1.0.bn2.weight - torch.Size([160]): ConstantInit: val=0, bias=0   2024-04-07 20:39:03,147 - mmcv - INFO - layer1.0.bn2.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,147 - mmcv - INFO - layer1.1.conv1.weight - torch.Size([160, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,147 - mmcv - INFO - layer1.1.bn1.weight - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,147 - mmcv - INFO - layer1.1.bn1.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,147 - mmcv - INFO - layer1.1.conv2.weight - torch.Size([160, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,147 - mmcv - INFO - layer1.1.bn2.weight - torch.Size([160]): ConstantInit: val=0, bias=0   2024-04-07 20:39:03,147 - mmcv - INFO - layer1.1.bn2.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,147 - mmcv - INFO - layer2.0.conv1.weight - torch.Size([320, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,147 - mmcv - INFO - layer2.0.bn1.weight - 
torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,147 - mmcv - INFO - layer2.0.bn1.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,147 - mmcv - INFO - layer2.0.conv2.weight - torch.Size([320, 320, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,147 - mmcv - INFO - layer2.0.bn2.weight - torch.Size([320]): ConstantInit: val=0, bias=0   2024-04-07 20:39:03,147 - mmcv - INFO - layer2.0.bn2.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,147 - mmcv - INFO - layer2.0.downsample.0.weight - torch.Size([320, 160, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,147 - mmcv - INFO - layer2.0.downsample.1.weight - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,147 - mmcv - INFO - layer2.0.downsample.1.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,147 - mmcv - INFO - layer2.1.conv1.weight - torch.Size([320, 320, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,148 - mmcv - INFO - layer2.1.bn1.weight - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,148 - mmcv - INFO - layer2.1.bn1.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,148 - mmcv - INFO - layer2.1.conv2.weight - torch.Size([320, 320, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,148 - mmcv - INFO - layer2.1.bn2.weight - torch.Size([320]): ConstantInit: val=0, bias=0   2024-04-07 20:39:03,148 - mmcv - INFO - layer2.1.bn2.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,148 - mmcv - INFO - layer3.0.conv1.weight - torch.Size([640, 320, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,148 - mmcv - INFO - layer3.0.bn1.weight - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,148 - mmcv - INFO - layer3.0.bn1.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,148 - mmcv - INFO - layer3.0.conv2.weight - torch.Size([640, 640, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,148 - mmcv - INFO - layer3.0.bn2.weight - torch.Size([640]): ConstantInit: val=0, bias=0   2024-04-07 20:39:03,148 - mmcv - INFO - layer3.0.bn2.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,148 - mmcv - INFO - layer3.0.downsample.0.weight - torch.Size([640, 320, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,148 - mmcv - INFO - layer3.0.downsample.1.weight - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,148 - mmcv - INFO - layer3.0.downsample.1.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     
2024-04-07 20:39:03,148 - mmcv - INFO - layer3.1.conv1.weight - torch.Size([640, 640, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,148 - mmcv - INFO - layer3.1.bn1.weight - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,148 - mmcv - INFO - layer3.1.bn1.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,148 - mmcv - INFO - layer3.1.conv2.weight - torch.Size([640, 640, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,148 - mmcv - INFO - layer3.1.bn2.weight - torch.Size([640]): ConstantInit: val=0, bias=0   2024-04-07 20:39:03,148 - mmcv - INFO - layer3.1.bn2.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,173 - mmcv - INFO - initialize SECONDFPN with init_cfg [{'type': 'Kaiming', 'layer': 'ConvTranspose2d'}, {'type': 'Constant', 'layer': 'NaiveSyncBatchNorm2d', 'val': 1.0}] 2024-04-07 20:39:03,195 - mmcv - INFO - deblocks.0.0.weight - torch.Size([160, 64, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,195 - mmcv - INFO - deblocks.0.1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:03,195 - mmcv - INFO - deblocks.0.1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:03,195 - mmcv - INFO - deblocks.1.0.weight - torch.Size([160, 64, 2, 2]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,195 - mmcv - INFO - deblocks.1.1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:03,195 - mmcv - INFO - deblocks.1.1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:03,195 - mmcv - INFO - deblocks.2.0.weight - torch.Size([320, 64, 4, 4]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,195 - mmcv - INFO - deblocks.2.1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:03,195 - mmcv - INFO - deblocks.2.1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:03,195 - mmcv - INFO - deblocks.3.0.weight - torch.Size([640, 64, 8, 8]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,196 - mmcv - INFO - deblocks.3.1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:03,196 - mmcv - INFO - deblocks.3.1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:03,377 - mmcv - INFO - initialize ResNet with init_cfg [{'type': 'Kaiming', 'layer': 'Conv2d'}, {'type': 'Constant', 'val': 1, 'layer': ['_BatchNorm', 'GroupNorm']}] 2024-04-07 20:39:03,506 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:03,507 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:03,508 - mmcv - INFO - 
initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:03,509 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:03,511 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:03,513 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:03,518 - mmcv - INFO - conv1.weight - torch.Size([160, 160, 7, 7]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,518 - mmcv - INFO - bn1.weight - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,518 - mmcv - INFO - bn1.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,518 - mmcv - INFO - layer1.0.conv1.weight - torch.Size([160, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,518 - mmcv - INFO - layer1.0.bn1.weight - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,518 - mmcv - INFO - layer1.0.bn1.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,518 - mmcv - INFO - layer1.0.conv2.weight - torch.Size([160, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,518 - mmcv - INFO - layer1.0.bn2.weight - torch.Size([160]): ConstantInit: val=0, bias=0   2024-04-07 20:39:03,518 - mmcv - INFO - layer1.0.bn2.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,518 - mmcv - INFO - layer1.1.conv1.weight - torch.Size([160, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,518 - mmcv - INFO - layer1.1.bn1.weight - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,518 - mmcv - INFO - layer1.1.bn1.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,519 - mmcv - INFO - layer1.1.conv2.weight - torch.Size([160, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,519 - mmcv - INFO - layer1.1.bn2.weight - torch.Size([160]): ConstantInit: val=0, bias=0   2024-04-07 20:39:03,519 - mmcv - INFO - layer1.1.bn2.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,519 - mmcv - INFO - layer2.0.conv1.weight - torch.Size([320, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,519 - mmcv - INFO - layer2.0.bn1.weight - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,519 - mmcv - INFO - layer2.0.bn1.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,519 - mmcv - INFO - layer2.0.conv2.weight - torch.Size([320, 320, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,519 - mmcv - INFO - layer2.0.bn2.weight - 
torch.Size([320]): ConstantInit: val=0, bias=0   2024-04-07 20:39:03,519 - mmcv - INFO - layer2.0.bn2.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,519 - mmcv - INFO - layer2.0.downsample.0.weight - torch.Size([320, 160, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,519 - mmcv - INFO - layer2.0.downsample.1.weight - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,519 - mmcv - INFO - layer2.0.downsample.1.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,519 - mmcv - INFO - layer2.1.conv1.weight - torch.Size([320, 320, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,519 - mmcv - INFO - layer2.1.bn1.weight - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,519 - mmcv - INFO - layer2.1.bn1.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,519 - mmcv - INFO - layer2.1.conv2.weight - torch.Size([320, 320, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,519 - mmcv - INFO - layer2.1.bn2.weight - torch.Size([320]): ConstantInit: val=0, bias=0   2024-04-07 20:39:03,519 - mmcv - INFO - layer2.1.bn2.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,519 - mmcv - INFO - layer3.0.conv1.weight - torch.Size([640, 320, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,519 - mmcv - INFO - layer3.0.bn1.weight - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,519 - mmcv - INFO - layer3.0.bn1.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,519 - mmcv - INFO - layer3.0.conv2.weight - torch.Size([640, 640, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,519 - mmcv - INFO - layer3.0.bn2.weight - torch.Size([640]): ConstantInit: val=0, bias=0   2024-04-07 20:39:03,519 - mmcv - INFO - layer3.0.bn2.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,519 - mmcv - INFO - layer3.0.downsample.0.weight - torch.Size([640, 320, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,519 - mmcv - INFO - layer3.0.downsample.1.weight - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,519 - mmcv - INFO - layer3.0.downsample.1.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,519 - mmcv - INFO - layer3.1.conv1.weight - torch.Size([640, 640, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,520 - mmcv - INFO - layer3.1.bn1.weight - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,520 - mmcv - INFO - layer3.1.bn1.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     
2024-04-07 20:39:03,520 - mmcv - INFO - layer3.1.conv2.weight - torch.Size([640, 640, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,520 - mmcv - INFO - layer3.1.bn2.weight - torch.Size([640]): ConstantInit: val=0, bias=0   2024-04-07 20:39:03,520 - mmcv - INFO - layer3.1.bn2.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:03,545 - mmcv - INFO - initialize SECONDFPN with init_cfg [{'type': 'Kaiming', 'layer': 'ConvTranspose2d'}, {'type': 'Constant', 'layer': 'NaiveSyncBatchNorm2d', 'val': 1.0}] 2024-04-07 20:39:03,567 - mmcv - INFO - deblocks.0.0.weight - torch.Size([160, 64, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,567 - mmcv - INFO - deblocks.0.1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:03,567 - mmcv - INFO - deblocks.0.1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:03,567 - mmcv - INFO - deblocks.1.0.weight - torch.Size([160, 64, 2, 2]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,567 - mmcv - INFO - deblocks.1.1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:03,567 - mmcv - INFO - deblocks.1.1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:03,567 - mmcv - INFO - deblocks.2.0.weight - torch.Size([320, 64, 4, 4]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,567 - mmcv - INFO - deblocks.2.1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:03,567 - mmcv - INFO - deblocks.2.1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:03,567 - mmcv - INFO - deblocks.3.0.weight - torch.Size([640, 64, 8, 8]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:03,567 - mmcv - INFO - deblocks.3.1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:03,567 - mmcv - INFO - deblocks.3.1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     /home/m840-02/lsy/P2D-master/p2d/layers/modules/temporal_self_attention.py:111: UserWarning: You'd better set embed_dims in MultiScaleDeformAttention to make the dimension of each attention head a power of 2 which is more efficient in our CUDA implementation.   
warnings.warn( Using 16bit native Automatic Mixed Precision (AMP) GPU available: True, used: True TPU available: False, using: 0 TPU cores IPU available: False, using: 0 IPUs HPU available: False, using: 0 HPUs Global seed set to 0 Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/2 Global seed set to 0 2024-04-07 20:39:36,826 - mmcv - INFO - initialize SECONDFPN with init_cfg [{'type': 'Kaiming', 'layer': 'ConvTranspose2d'}, {'type': 'Constant', 'layer': 'NaiveSyncBatchNorm2d', 'val': 1.0}] 2024-04-07 20:39:36,837 - mmcv - INFO - deblocks.0.0.weight - torch.Size([128, 256, 4, 4]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:36,837 - mmcv - INFO - deblocks.0.1.weight - torch.Size([128]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:36,837 - mmcv - INFO - deblocks.0.1.bias - torch.Size([128]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:36,837 - mmcv - INFO - deblocks.1.0.weight - torch.Size([128, 512, 2, 2]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:36,837 - mmcv - INFO - deblocks.1.1.weight - torch.Size([128]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:36,837 - mmcv - INFO - deblocks.1.1.bias - torch.Size([128]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:36,837 - mmcv - INFO - deblocks.2.0.weight - torch.Size([1024, 128, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:36,837 - mmcv - INFO - deblocks.2.1.weight - torch.Size([128]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:36,837 - mmcv - INFO - deblocks.2.1.bias - torch.Size([128]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:36,837 - mmcv - INFO - deblocks.3.0.weight - torch.Size([2048, 128, 2, 2]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:36,838 - mmcv - INFO - deblocks.3.1.weight - torch.Size([128]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:36,838 - mmcv - INFO - deblocks.3.1.bias - torch.Size([128]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:36,851 - mmcv - INFO - initialize ResNet with init_cfg {'type': 'Pretrained', 'checkpoint': 'torchvision://resnet50'} 2024-04-07 20:39:36,851 - mmcv - INFO - load model from: torchvision://resnet50 2024-04-07 20:39:36,852 - mmcv - INFO - load checkpoint from torchvision path: torchvision://resnet50 2024-04-07 20:39:36,948 - mmcv - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: fc.weight, fc.bias

2024-04-07 20:39:37,131 - mmcv - INFO - initialize ResNet with init_cfg [{'type': 'Kaiming', 'layer': 'Conv2d'}, {'type': 'Constant', 'val': 1, 'layer': ['_BatchNorm', 'GroupNorm']}] 2024-04-07 20:39:37,254 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:37,255 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:37,257 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:37,258 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:37,259 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:37,262 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:37,266 - mmcv - INFO - conv1.weight - torch.Size([160, 160, 7, 7]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,266 - mmcv - INFO - bn1.weight - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,266 - mmcv - INFO - bn1.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,266 - mmcv - INFO - layer1.0.conv1.weight - torch.Size([160, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,266 - mmcv - INFO - layer1.0.bn1.weight - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,266 - mmcv - INFO - layer1.0.bn1.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,266 - mmcv - INFO - layer1.0.conv2.weight - torch.Size([160, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,266 - mmcv - INFO - layer1.0.bn2.weight - torch.Size([160]): ConstantInit: val=0, bias=0   2024-04-07 20:39:37,266 - mmcv - INFO - layer1.0.bn2.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,266 - mmcv - INFO - layer1.1.conv1.weight - torch.Size([160, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,266 - mmcv - INFO - layer1.1.bn1.weight - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,266 - mmcv - INFO - layer1.1.bn1.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,267 - mmcv - INFO - layer1.1.conv2.weight - torch.Size([160, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,267 - mmcv - INFO - layer1.1.bn2.weight - torch.Size([160]): ConstantInit: val=0, bias=0   2024-04-07 20:39:37,267 - mmcv - INFO - layer1.1.bn2.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,267 - mmcv - INFO - layer2.0.conv1.weight - torch.Size([320, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,267 - mmcv - INFO - layer2.0.bn1.weight - 
torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,267 - mmcv - INFO - layer2.0.bn1.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,267 - mmcv - INFO - layer2.0.conv2.weight - torch.Size([320, 320, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,267 - mmcv - INFO - layer2.0.bn2.weight - torch.Size([320]): ConstantInit: val=0, bias=0   2024-04-07 20:39:37,267 - mmcv - INFO - layer2.0.bn2.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,267 - mmcv - INFO - layer2.0.downsample.0.weight - torch.Size([320, 160, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,267 - mmcv - INFO - layer2.0.downsample.1.weight - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,267 - mmcv - INFO - layer2.0.downsample.1.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,267 - mmcv - INFO - layer2.1.conv1.weight - torch.Size([320, 320, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,267 - mmcv - INFO - layer2.1.bn1.weight - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,267 - mmcv - INFO - layer2.1.bn1.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,267 - mmcv - INFO - layer2.1.conv2.weight - torch.Size([320, 320, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,267 - mmcv - INFO - layer2.1.bn2.weight - torch.Size([320]): ConstantInit: val=0, bias=0   2024-04-07 20:39:37,267 - mmcv - INFO - layer2.1.bn2.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,267 - mmcv - INFO - layer3.0.conv1.weight - torch.Size([640, 320, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,267 - mmcv - INFO - layer3.0.bn1.weight - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,267 - mmcv - INFO - layer3.0.bn1.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,267 - mmcv - INFO - layer3.0.conv2.weight - torch.Size([640, 640, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,267 - mmcv - INFO - layer3.0.bn2.weight - torch.Size([640]): ConstantInit: val=0, bias=0   2024-04-07 20:39:37,267 - mmcv - INFO - layer3.0.bn2.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,267 - mmcv - INFO - layer3.0.downsample.0.weight - torch.Size([640, 320, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,267 - mmcv - INFO - layer3.0.downsample.1.weight - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,267 - mmcv - INFO - layer3.0.downsample.1.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     
2024-04-07 20:39:37,267 - mmcv - INFO - layer3.1.conv1.weight - torch.Size([640, 640, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,268 - mmcv - INFO - layer3.1.bn1.weight - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,268 - mmcv - INFO - layer3.1.bn1.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,268 - mmcv - INFO - layer3.1.conv2.weight - torch.Size([640, 640, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,268 - mmcv - INFO - layer3.1.bn2.weight - torch.Size([640]): ConstantInit: val=0, bias=0   2024-04-07 20:39:37,268 - mmcv - INFO - layer3.1.bn2.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,292 - mmcv - INFO - initialize SECONDFPN with init_cfg [{'type': 'Kaiming', 'layer': 'ConvTranspose2d'}, {'type': 'Constant', 'layer': 'NaiveSyncBatchNorm2d', 'val': 1.0}] 2024-04-07 20:39:37,314 - mmcv - INFO - deblocks.0.0.weight - torch.Size([160, 64, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,314 - mmcv - INFO - deblocks.0.1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:37,314 - mmcv - INFO - deblocks.0.1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:37,314 - mmcv - INFO - deblocks.1.0.weight - torch.Size([160, 64, 2, 2]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,314 - mmcv - INFO - deblocks.1.1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:37,314 - mmcv - INFO - deblocks.1.1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:37,314 - mmcv - INFO - deblocks.2.0.weight - torch.Size([320, 64, 4, 4]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,314 - mmcv - INFO - deblocks.2.1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:37,314 - mmcv - INFO - deblocks.2.1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:37,314 - mmcv - INFO - deblocks.3.0.weight - torch.Size([640, 64, 8, 8]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,314 - mmcv - INFO - deblocks.3.1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:37,314 - mmcv - INFO - deblocks.3.1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:37,496 - mmcv - INFO - initialize ResNet with init_cfg [{'type': 'Kaiming', 'layer': 'Conv2d'}, {'type': 'Constant', 'val': 1, 'layer': ['_BatchNorm', 'GroupNorm']}] 2024-04-07 20:39:37,627 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:37,628 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:37,629 - mmcv - INFO - 
initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:37,630 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:37,632 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:37,634 - mmcv - INFO - initialize BasicBlock with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm2'}} 2024-04-07 20:39:37,639 - mmcv - INFO - conv1.weight - torch.Size([160, 160, 7, 7]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,639 - mmcv - INFO - bn1.weight - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,639 - mmcv - INFO - bn1.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,639 - mmcv - INFO - layer1.0.conv1.weight - torch.Size([160, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,639 - mmcv - INFO - layer1.0.bn1.weight - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,639 - mmcv - INFO - layer1.0.bn1.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,639 - mmcv - INFO - layer1.0.conv2.weight - torch.Size([160, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,639 - mmcv - INFO - layer1.0.bn2.weight - torch.Size([160]): ConstantInit: val=0, bias=0   2024-04-07 20:39:37,639 - mmcv - INFO - layer1.0.bn2.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,639 - mmcv - INFO - layer1.1.conv1.weight - torch.Size([160, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,639 - mmcv - INFO - layer1.1.bn1.weight - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,639 - mmcv - INFO - layer1.1.bn1.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,639 - mmcv - INFO - layer1.1.conv2.weight - torch.Size([160, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,639 - mmcv - INFO - layer1.1.bn2.weight - torch.Size([160]): ConstantInit: val=0, bias=0   2024-04-07 20:39:37,639 - mmcv - INFO - layer1.1.bn2.bias - torch.Size([160]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,639 - mmcv - INFO - layer2.0.conv1.weight - torch.Size([320, 160, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,640 - mmcv - INFO - layer2.0.bn1.weight - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,640 - mmcv - INFO - layer2.0.bn1.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,640 - mmcv - INFO - layer2.0.conv2.weight - torch.Size([320, 320, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,640 - mmcv - INFO - layer2.0.bn2.weight - 
torch.Size([320]): ConstantInit: val=0, bias=0   2024-04-07 20:39:37,640 - mmcv - INFO - layer2.0.bn2.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,640 - mmcv - INFO - layer2.0.downsample.0.weight - torch.Size([320, 160, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,640 - mmcv - INFO - layer2.0.downsample.1.weight - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,640 - mmcv - INFO - layer2.0.downsample.1.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,640 - mmcv - INFO - layer2.1.conv1.weight - torch.Size([320, 320, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,640 - mmcv - INFO - layer2.1.bn1.weight - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,640 - mmcv - INFO - layer2.1.bn1.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,640 - mmcv - INFO - layer2.1.conv2.weight - torch.Size([320, 320, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,640 - mmcv - INFO - layer2.1.bn2.weight - torch.Size([320]): ConstantInit: val=0, bias=0   2024-04-07 20:39:37,640 - mmcv - INFO - layer2.1.bn2.bias - torch.Size([320]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,640 - mmcv - INFO - layer3.0.conv1.weight - torch.Size([640, 320, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,640 - mmcv - INFO - layer3.0.bn1.weight - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,640 - mmcv - INFO - layer3.0.bn1.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,640 - mmcv - INFO - layer3.0.conv2.weight - torch.Size([640, 640, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,640 - mmcv - INFO - layer3.0.bn2.weight - torch.Size([640]): ConstantInit: val=0, bias=0   2024-04-07 20:39:37,640 - mmcv - INFO - layer3.0.bn2.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,640 - mmcv - INFO - layer3.0.downsample.0.weight - torch.Size([640, 320, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,640 - mmcv - INFO - layer3.0.downsample.1.weight - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,640 - mmcv - INFO - layer3.0.downsample.1.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,640 - mmcv - INFO - layer3.1.conv1.weight - torch.Size([640, 640, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,640 - mmcv - INFO - layer3.1.bn1.weight - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,640 - mmcv - INFO - layer3.1.bn1.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     
2024-04-07 20:39:37,640 - mmcv - INFO - layer3.1.conv2.weight - torch.Size([640, 640, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,640 - mmcv - INFO - layer3.1.bn2.weight - torch.Size([640]): ConstantInit: val=0, bias=0   2024-04-07 20:39:37,640 - mmcv - INFO - layer3.1.bn2.bias - torch.Size([640]): The value is the same before and after calling init_weights of ResNet     2024-04-07 20:39:37,666 - mmcv - INFO - initialize SECONDFPN with init_cfg [{'type': 'Kaiming', 'layer': 'ConvTranspose2d'}, {'type': 'Constant', 'layer': 'NaiveSyncBatchNorm2d', 'val': 1.0}] 2024-04-07 20:39:37,688 - mmcv - INFO - deblocks.0.0.weight - torch.Size([160, 64, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,688 - mmcv - INFO - deblocks.0.1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:37,688 - mmcv - INFO - deblocks.0.1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:37,688 - mmcv - INFO - deblocks.1.0.weight - torch.Size([160, 64, 2, 2]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,688 - mmcv - INFO - deblocks.1.1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:37,688 - mmcv - INFO - deblocks.1.1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:37,688 - mmcv - INFO - deblocks.2.0.weight - torch.Size([320, 64, 4, 4]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,688 - mmcv - INFO - deblocks.2.1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:37,688 - mmcv - INFO - deblocks.2.1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:37,688 - mmcv - INFO - deblocks.3.0.weight - torch.Size([640, 64, 8, 8]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0   2024-04-07 20:39:37,688 - mmcv - INFO - deblocks.3.1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     2024-04-07 20:39:37,688 - mmcv - INFO - deblocks.3.1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SECONDFPN     /home/m840-02/lsy/P2D-master/p2d/layers/modules/temporal_self_attention.py:111: UserWarning: You'd better set embed_dims in MultiScaleDeformAttention to make the dimension of each attention head a power of 2 which is more efficient in our CUDA implementation.   warnings.warn( Global seed set to 0 Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/2

distributed_backend=nccl All distributed processes registered. Starting with 2 processes

Missing logger folder: ./outputs/p2d_deform_lss_r50_256x704_128x128_24e_3key_ema/lightning_logs Missing logger folder: ./outputs/p2d_deform_lss_r50_256x704_128x128_24e_3key_ema/lightning_logs LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1] LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1]

  | Name  | Type | Params

0 | model | P2D  | 100.0 M

100.0 M   Trainable params 9.5 K     Non-trainable params 100.0 M   Total params 199.943   Total estimated model params size (MB) Epoch 0:   0%|          | 0/1946 [00:00<?, ?it/s]/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /pytorch/c10/core/TensorImpl.h:1156.)   return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode) /home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /pytorch/c10/core/TensorImpl.h:1156.)   return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode) /home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  /pytorch/aten/src/ATen/native/BinaryOps.cpp:467.)   return torch.floor_divide(self, other) /home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  /pytorch/aten/src/ATen/native/BinaryOps.cpp:467.)   return torch.floor_divide(self, other) /home/m840-02/lsy/P2D-master/p2d/layers/heads/bev_depth_head.py:367: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requiresgrad(True), rather than tensor.new_tensor(sourceTensor).   num = torch.clamp(reduce_mean(target_box.new_tensor(num)), /home/m840-02/lsy/P2D-master/p2d/layers/heads/bev_depth_head.py:367: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requiresgrad(True), rather than tensor.new_tensor(sourceTensor).   num = torch.clamp(reduce_mean(target_box.new_tensor(num)), /home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py:212: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_gradnorm; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.   torch.nn.utils.clip_gradnorm(parameters, clip_val) /home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py:212: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_gradnorm; continuing anyway. 
Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.   torch.nn.utils.clip_gradnorm(parameters, clip_val) Epoch 0:  90%|█████████ | 1758/1946 [54:48<05:51,  1.87s/it, loss=638, v_num=0] Validation: 0it [00:00, ?it/s] Validation DataLoader 0:   0%|          | 0/188 [00:06<?, ?it/s]/home/m840-02/lsy/P2D-master/mmdetection3d/mmdet3d/core/bbox/coders/centerpoint_bbox_coders.py:201: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requiresgrad(True), rather than torch.tensor(sourceTensor).   self.post_center_range = torch.tensor( /home/m840-02/lsy/P2D-master/mmdetection3d/mmdet3d/core/bbox/coders/centerpoint_bbox_coders.py:201: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requiresgrad(True), rather than torch.tensor(sourceTensor).   self.post_center_range = torch.tensor( /home/m840-02/lsy/P2D-master/mmdetection3d/mmdet3d/core/bbox/coders/centerpoint_bbox_coders.py:201: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requiresgrad(True), rather than torch.tensor(sourceTensor).   self.post_center_range = torch.tensor( /home/m840-02/lsy/P2D-master/mmdetection3d/mmdet3d/core/bbox/coders/centerpoint_bbox_coders.py:201: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requiresgrad(True), rather than torch.tensor(sourceTensor).   self.post_center_range = torch.tensor(

Epoch 0:  90%|█████████ | 1759/1946 [55:11<05:52,  1.88s/it, loss=638, v_num=0] Epoch 0:  90%|█████████ | 1760/1946 [55:12<05:50,  1.88s/it, loss=638, v_num=0]

Epoch 0: 100%|██████████| 1946/1946 [59:06<00:00,  1.82s/it, loss=638, v_num=0] Formating bboxes of img_bbox Start to convert detection format... [>>>>>>>>>>>>>>>>>>>>>>>>>>>] 3008/3008, 317.6 task/s, elapsed: 9s, ETA:     0s Results writes to ./outputs/p2d_deform_lss_r50_256x704_128x128_24e_3key_ema/results_nusc.json Evaluating bboxes of img_bbox   0%|          | 0/6019 [00:00<?, ?it/s]  97%|█████████▋| 5854/6019 [00:09<00:00, 1292.88it/s]

                                                     3008 6019 Traceback (most recent call last):   File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 722, in _call_and_handle_interrupt     return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, kwargs)   File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 93, in launch     return function(*args, *kwargs)   File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 812, in _fit_impl     results = self._run(model, ckpt_path=self.ckpt_path)   File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1237, in _run     results = self._run_stage()   File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1327, in _run_stage     return self._run_train()   File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1357, in _run_train     self.fit_loop.run()   File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 204, in run     self.advance(args, kwargs)   File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 269, in advance     self._outputs = self.epoch_loop.run(self._data_fetcher)   File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 205, in run     self.on_advance_end()   File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 255, in on_advance_end     self._run_validation()   File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 309, in _run_validation     self.val_loop.run()   File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 211, in run     output = self.on_run_end()   File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 187, in on_run_end     self._evaluation_epoch_end(self._outputs)   File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 309, in _evaluation_epoch_end     self.trainer._call_lightning_module_hook("validation_epoch_end", output_or_outputs)   File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1599, in _call_lightning_module_hook     output = fn(*args, **kwargs)   File "/home/m840-02/lsy/P2D-master/p2d/exps/nuscenes/base_exp.py", line 372, in validation_epoch_end     self.evaluator.evaluate(all_pred_results, all_img_metas)   File "/home/m840-02/lsy/P2D-master/p2d/evaluators/det_evaluators.py", line 212, in evaluate     self._evaluate_single(result_files[name])   File "/home/m840-02/lsy/P2D-master/p2d/evaluators/det_evaluators.py", line 90, in _evaluate_single     nusc_eval = NuScenesEval(nusc,   File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/nuscenes/eval/detection/evaluate.py", line 85, in init     assert set(self.pred_boxes.sample_tokens) == set(self.gt_boxes.sample_tokens), \ AssertionError: Samples in split doesn't match samples in predictions.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/m840-02/lsy/P2D-master/p2d/exps/nuscenes/p2d/p2d_deform_lss_r50_256x704_128x128_24e_3key_ema.py", line 10, in <module>
    run_cli(P2DLightningModel,
  File "/home/m840-02/lsy/P2D-master/p2d/exps/base_cli.py", line 89, in run_cli
    trainer.fit(model)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 771, in fit
    self._call_and_handle_interrupt(
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 737, in _call_and_handle_interrupt
    self.strategy.reconciliate_processes(traceback.format_exc())
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/strategies/ddp.py", line 446, in reconciliate_processes
    raise DeadlockDetectedExcep
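For anyone debugging the same mismatch, here is a minimal diagnostic sketch (not part of the repo) that reproduces the failing comparison from nuscenes/eval/detection/evaluate.py and counts the val samples that have no predictions. The dataroot, results path, and max_boxes_per_sample=500 below are assumptions to adjust for your own setup:

from nuscenes import NuScenes
from nuscenes.eval.common.loaders import load_prediction, load_gt
from nuscenes.eval.detection.data_classes import DetectionBox

# Assumed paths -- adjust to your environment.
RESULT_JSON = './outputs/p2d_deform_lss_r50_256x704_128x128_24e_3key_ema/results_nusc.json'
DATAROOT = 'data/nuScenes'

nusc = NuScenes(version='v1.0-trainval', dataroot=DATAROOT, verbose=False)

# Load predictions and ground truth the same way the devkit does before its assert.
pred_boxes, _ = load_prediction(RESULT_JSON, 500, DetectionBox, verbose=False)
gt_boxes = load_gt(nusc, 'val', DetectionBox, verbose=False)

missing = set(gt_boxes.sample_tokens) - set(pred_boxes.sample_tokens)
print(f'predictions for {len(set(pred_boxes.sample_tokens))} samples, '
      f'ground truth for {len(set(gt_boxes.sample_tokens))} samples, '
      f'{len(missing)} val samples without predictions')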

sanmin0312 commented 6 months ago

Thank you for providing the details. To address this, please change the value of 'limit_val_batches' from 0.5 to 1.0 on line 42 of the ./p2d/exps/base_cli.py file. This ensures the model generates predictions for the entire validation set, so the predicted sample tokens match the ground-truth ones. Alternatively, you can disable the per-epoch evaluation by changing the value of 'check_val_every_n_epoch' to 0 on line 39 of the same file.
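A side note for readers: in pytorch_lightning, a float value for limit_val_batches is treated as a fraction of the validation dataloader, which appears to be why only 3008 of the 6019 val samples received predictions above. A minimal sketch of the first option (illustrative only; the remaining defaults in base_cli.py stay unchanged):

from argparse import ArgumentParser

parser = ArgumentParser()
# Run validation every epoch over the full val split so the predicted sample
# tokens match the ground-truth sample tokens expected by NuScenesEval.
parser.set_defaults(check_val_every_n_epoch=1,
                    limit_val_batches=1.0)  # float => fraction of val batches; 1.0 = all of them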

lsy19103118 commented 6 months ago

Thank you for your valuable suggestions! I modified the code as you suggested, but new problems appeared, so I have to consult you again.

I modified base_cli.py:

parser.set_defaults(profiler='simple',
                    deterministic=False,
                    max_epochs=24,
                    strategy='ddp_find_unused_parameters_false',
                    check_val_every_n_epoch=1,  # modified
                    num_sanity_val_steps=0,
                    gradient_clip_val=5,
                    limit_val_batches=1,  # modified
                    enable_checkpointing=True,
                    precision=16,
                    default_root_dir=os.path.join('./outputs/', exp_name))

I ran the following four commands:

1: python ./exps/nuscenes/p2d/p2d_deform_lss_r50_256x704_128x128_24e_3key_ema.py --amp_backend native -b 8 --gpus 2 (nuscenes_infos_trainval.pkl)

The following error message appeared:

Epoch 0: 100%|██████████| 1759/1759 [55:11<00:00,  1.88s/it, loss=35.4, v_num=2]
Formating bboxes of img_bbox
Start to convert detection format...
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 16/16, 46.8 task/s, elapsed: 0s, ETA:     0s
Results writes to ./outputs/p2d_deform_lss_r50_256x704_128x128_24e_3key_ema/results_nusc.json
Evaluating bboxes of img_bbox

 98%|█████████▊| 5903/6019 [00:21<00:00, 1364.51it/s]
16 6019
Traceback (most recent call last):
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 722, in _call_and_handle_interrupt
    return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 93, in launch
    return function(*args, **kwargs)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 812, in _fit_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1237, in _run
    results = self._run_stage()
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1327, in _run_stage
    return self._run_train()
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1357, in _run_train
    self.fit_loop.run()
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 204, in run
    self.advance(*args, **kwargs)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 269, in advance
    self._outputs = self.epoch_loop.run(self._data_fetcher)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 205, in run
    self.on_advance_end()
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 255, in on_advance_end
    self._run_validation()
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 309, in _run_validation
    self.val_loop.run()
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 211, in run
    output = self.on_run_end()
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 187, in on_run_end
    self._evaluation_epoch_end(self._outputs)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 309, in _evaluation_epoch_end
    self.trainer._call_lightning_module_hook("validation_epoch_end", output_or_outputs)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1599, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/home/m840-02/lsy/P2D-master/p2d/exps/nuscenes/base_exp.py", line 372, in validation_epoch_end
    self.evaluator.evaluate(all_pred_results, all_img_metas)
  File "/home/m840-02/lsy/P2D-master/p2d/evaluators/det_evaluators.py", line 212, in evaluate
    self._evaluate_single(result_files[name])
  File "/home/m840-02/lsy/P2D-master/p2d/evaluators/det_evaluators.py", line 90, in _evaluate_single
    nusc_eval = NuScenesEval(nusc,
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/nuscenes/eval/detection/evaluate.py", line 85, in __init__
    assert set(self.pred_boxes.sample_tokens) == set(self.gt_boxes.sample_tokens), \
AssertionError: Samples in split doesn't match samples in predictions.

2: python ./exps/nuscenes/p2d/p2d_deform_lss_r50_256x704_128x128_24e_3key.py --ckpt_path weights/P2D_res50.pth -b 8 --gpus 2 (nuscenes_infos_trainval.pkl)

The following error message appeared:

Traceback (most recent call last):
  File "/home/m840-02/lsy/P2D-master/p2d/exps/nuscenes/p2d/p2d_deform_lss_r50_256x704_128x128_24e_3key.py", line 84, in <module>
    run_cli(P2DLightningModel,
  File "/home/m840-02/lsy/P2D-master/p2d/exps/base_cli.py", line 89, in run_cli
    trainer.fit(model)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 771, in fit
    self._call_and_handle_interrupt(
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 724, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 812, in _fit_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1237, in _run
    results = self._run_stage()
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1327, in _run_stage
    return self._run_train()
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1357, in _run_train
    self.fit_loop.run()
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 203, in run
    self.on_advance_start(*args, **kwargs)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 254, in on_advance_start
    self.trainer._call_callback_hooks("on_train_epoch_start")
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1640, in _call_callback_hooks
    fn(self, self.lightning_module, *args, **kwargs)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/callbacks/progress/tqdm_progress.py", line 264, in on_train_epoch_start
    total_val_batches = self.total_val_batches
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/callbacks/progress/base.py", line 168, in total_val_batches
    return sum(self.trainer.num_val_batches) if self._trainer.fit_loop.epoch_loop._should_check_val_epoch() else 0
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 506, in _should_check_val_epoch
    and (self.trainer.current_epoch + 1) % self.trainer.check_val_every_n_epoch == 0
ZeroDivisionError: integer division or modulo by zero
Training: 0it [00:10, ?it/s]

I also modified base_cli.py as follows:

parser.set_defaults(profiler='simple',
                    deterministic=False,
                    max_epochs=24,
                    strategy='ddp_find_unused_parameters_false',
                    check_val_every_n_epoch=0,  # modified
                    num_sanity_val_steps=0,
                    gradient_clip_val=5,
                    limit_val_batches=0.5,  # modified
                    enable_checkpointing=True,
                    precision=16,
                    default_root_dir=os.path.join('./outputs/', exp_name))

I ran the following four commands:

1: python ./exps/nuscenes/p2d/p2d_deform_lss_r50_256x704_128x128_24e_3key_ema.py --amp_backend native -b 8 --gpus 2 (nuscenes_infos_trainval.pkl)

Training: 0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/home/m840-02/lsy/P2D-master/p2d/exps/nuscenes/p2d/p2d_deform_lss_r50_256x704_128x128_24e_3key_ema.py", line 10, in <module>
    run_cli(P2DLightningModel,
  File "/home/m840-02/lsy/P2D-master/p2d/exps/base_cli.py", line 89, in run_cli
    trainer.fit(model)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 771, in fit
    self._call_and_handle_interrupt(
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 722, in _call_and_handle_interrupt
    return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 93, in launch
    return function(*args, **kwargs)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 812, in _fit_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1237, in _run
    results = self._run_stage()
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1327, in _run_stage
    return self._run_train()
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1357, in _run_train
    self.fit_loop.run()
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 203, in run
    self.on_advance_start(*args, **kwargs)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 254, in on_advance_start
    self.trainer._call_callback_hooks("on_train_epoch_start")
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1640, in _call_callback_hooks
    fn(self, self.lightning_module, *args, **kwargs)
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/callbacks/progress/tqdm_progress.py", line 264, in on_train_epoch_start
    total_val_batches = self.total_val_batches
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/callbacks/progress/base.py", line 168, in total_val_batches
    return sum(self.trainer.num_val_batches) if self._trainer.fit_loop.epoch_loop._should_check_val_epoch() else 0
  File "/home/m840-02/anaconda3/envs/p2d/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 506, in _should_check_val_epoch
    and (self.trainer.current_epoch + 1) % self.trainer.check_val_every_n_epoch == 0
ZeroDivisionError: integer division or modulo by zero

2: python ./exps/nuscenes/p2d/p2d_deform_lss_r50_256x704_128x128_24e_3key.py --ckpt_path weights/P2D_res50.pth -b 8 --gpus 2 (nuscenes_infos_trainval.pkl)


sanmin0312 commented 6 months ago

Sorry for the confusion.

Setting 'check_val_every_n_epoch' to 0 triggers the ZeroDivisionError you saw, because Lightning computes (current_epoch + 1) % check_val_every_n_epoch. To turn off the per-epoch evaluation, please instead comment out 'check_val_every_n_epoch=1' on line 39 and set 'limit_val_batches' to 0 on line 42 of the ./p2d/exps/base_cli.py file.
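For completeness, a minimal sketch of this corrected configuration (illustrative only, mirroring the set_defaults block quoted earlier in the thread; default_root_dir is left out because exp_name is not defined in this standalone sketch):

from argparse import ArgumentParser

parser = ArgumentParser()
parser.set_defaults(profiler='simple',
                    deterministic=False,
                    max_epochs=24,
                    strategy='ddp_find_unused_parameters_false',
                    # check_val_every_n_epoch=1,  # commented out (do not set it to 0)
                    num_sanity_val_steps=0,
                    gradient_clip_val=5,
                    limit_val_batches=0,  # no validation batches during training
                    enable_checkpointing=True,
                    precision=16)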

lsy19103118 commented 6 months ago

Thank you for your valuable advice! The model now trains normally. But for a trained model, how can I run validation to obtain its evaluation metrics on the nuScenes validation/test set (mAP, NDS, mASE, mAOE, mAVE, mAAE)?
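One way to compute these numbers offline (a sketch, not the repo's built-in workflow) is to run the nuScenes devkit evaluation directly on the results_nusc.json written during validation. The paths and output directory below are assumptions; note that the test split has no public ground truth, so test-set numbers come from the official evaluation server rather than local evaluation:

from nuscenes import NuScenes
from nuscenes.eval.detection.config import config_factory
from nuscenes.eval.detection.evaluate import DetectionEval

# Assumed paths -- adjust to your environment.
DATAROOT = 'data/nuScenes'
RESULT_JSON = './outputs/p2d_deform_lss_r50_256x704_128x128_24e_3key_ema/results_nusc.json'

nusc = NuScenes(version='v1.0-trainval', dataroot=DATAROOT, verbose=True)
cfg = config_factory('detection_cvpr_2019')  # standard nuScenes detection config

# The result json must contain predictions for every val sample,
# otherwise the same sample-token assertion as above is raised.
nusc_eval = DetectionEval(nusc,
                          config=cfg,
                          result_path=RESULT_JSON,
                          eval_set='val',
                          output_dir='./outputs/nusc_eval',
                          verbose=True)

# Writes metrics_summary.json (mAP, NDS and the TP errors mATE/mASE/mAOE/mAVE/mAAE)
# to output_dir and returns the same summary as a dict.
metrics = nusc_eval.main(plot_examples=0, render_curves=False)
print(metrics['mean_ap'], metrics['nd_score'])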
