Error loading COCO train2017 and val2017

Dear author,

I have a question about my training run. My code does not seem to load the COCO train2017 and val2017 datasets correctly. Even during inference it appears to load train2017, which causes errors. The log also reports 107,761 images for both the training and the evaluation annotation files, and I cannot find where this number is configured. Could you please tell me which Python file I should modify to resolve this?

Best regards,

This is the output of the run: “
D:\anconda\envs\zhl310\python.exe D:/devit-main/devit-main/train.py task=ovd, vit=l, dataset=coco, shot=10, split=1, num_gpus=1
Running command: python D:/devit-main/devit-main/tools/train_net.py --num-gpus 1 --config-file D:/devit-main/devit-main/configs/open-vocabulary/coco/vitl.yaml MODEL.WEIGHTS D:/devit-main/devit-main/weights/initial/open-vocabulary/vitl+rpn.pth DE.OFFLINE_RPN_CONFIG D:/devit-main/devit-main/configs/RPN/mask_rcnn_R_50_C4_1x_ovd_FSD.yaml OUTPUT_DIR D:/devit-main/devit-main/output/train/open-vocabulary/coco/vitl/
xFormers not available
Command Line Args: Namespace(config_file='D:/devit-main/devit-main/configs/open-vocabulary/coco/vitl.yaml', resume=False, eval_only=False, num_gpus=1, num_machines=1, machine_rank=0, dist_url='auto', opts=['MODEL.WEIGHTS', 'D:/devit-main/devit-main/weights/initial/open-vocabulary/vitl+rpn.pth', 'DE.OFFLINE_RPN_CONFIG', 'D:/devit-main/devit-main/configs/RPN/mask_rcnn_R_50_C4_1x_ovd_FSD.yaml', 'OUTPUT_DIR', 'D:/devit-main/devit-main/output/train/open-vocabulary/coco/vitl/'])
[07/28 11:38:28 detectron2]: Rank of current process: 0. World size: 1
[07/28 11:38:28 detectron2]: Environment info:

sys.platform win32
Python 3.10.14 | packaged by Anaconda, Inc. | (main, May 6 2024, 19:44:50) [MSC v.1916 64 bit (AMD64)]
numpy 1.24.4
detectron2 RegionCLIP @D:\devit-main\devit-main\tools..\detectron2
Compiler MSVC 194033812
CUDA compiler not available
DETECTRON2_ENV_MODULE
PyTorch 1.13.1+cu117 @D:\anconda\envs\zhl310\lib\site-packages\torch
PyTorch debug build False
GPU available True
GPU 0 NVIDIA GeForce RTX 4060 Ti (arch=8.9)
CUDA_HOME None - invalid!
Pillow 9.5.0
torchvision 0.14.1+cu117 @D:\anconda\envs\zhl310\lib\site-packages\torchvision
torchvision arch flags D:\anconda\envs\zhl310\lib\site-packages\torchvision_C.pyd
fvcore 0.1.5.post20221221
iopath 0.1.10
cv2 Not found

PyTorch built with:

C++ Version: 199711
MSVC 192829337
Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
OpenMP 2019
LAPACK is enabled (usually provided by MKL)
CPU capability usage: AVX2
CUDA Runtime 11.7
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
CuDNN 8.5
Magma 2.5.4
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/actions-runner/_work/pytorch/pytorch/builder/windows/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF,

[07/28 11:38:28 detectron2]: Command line arguments: Namespace(config_file='D:/devit-main/devit-main/configs/open-vocabulary/coco/vitl.yaml', resume=False, eval_only=False, num_gpus=1, num_machines=1, machine_rank=0, dist_url='auto', opts=['MODEL.WEIGHTS', 'D:/devit-main/devit-main/weights/initial/open-vocabulary/vitl+rpn.pth', 'DE.OFFLINE_RPN_CONFIG', 'D:/devit-main/devit-main/configs/RPN/mask_rcnn_R_50_C4_1x_ovd_FSD.yaml', 'OUTPUT_DIR', 'D:/devit-main/devit-main/output/train/open-vocabulary/coco/vitl/'])
[07/28 11:38:28 detectron2]: Contents of args.config_file=D:/devit-main/devit-main/configs/open-vocabulary/coco/vitl.yaml:
_BASE_: "../../Base-RCNN-C4.yaml"
DE:
  CLASS_PROTOTYPES: "D:/devit-main/devit-main/weights/initial/open-vocabulary/prototypes/coco/class_prototypes_base.vitl14.pth,D:/devit-main/devit-main/weights/initial/open-vocabulary/prototypes/coco/class_prototypes_novel.vitl14.pth"
  BG_PROTOTYPES: "D:/devit-main/devit-main/weights/initial/background/background_prototypes.vitl14.pth"
  BG_CLS_LOSS_WEIGHT: 0.2
  TOPK: 3

MODEL:
  META_ARCHITECTURE: "OpenSetDetectorWithExamples"
  BACKBONE:
    NAME: "build_dino_v2_vit"
    TYPE: "large"
  WEIGHTS: ""
  MASK_ON: False
  RPN:
    HEAD_NAME: StandardRPNHead
    IN_FEATURES: ["res4"]
  ROI_HEADS:
    SCORE_THRESH_TEST: 0.001
  ROI_BOX_HEAD:
    NAME: ""
    NUM_FC: 0
    POOLER_RESOLUTION: 7
    CLS_AGNOSTIC_BBOX_REG: True
  PIXEL_MEAN: [0.48145466, 0.4578275, 0.40821073]
  PIXEL_STD: [0.26862954, 0.26130258, 0.27577711]

DATASETS:
  TRAIN: ("coco_2017_ovd_b_train",)
  TEST: ("coco_2017_ovd_all_test",)
TEST:
  EVAL_PERIOD: 20
SOLVER:
  IMS_PER_BATCH: 1
  BASE_LR: 0.002
  STEPS: (60000, 80000)
  MAX_ITER: 90000
  WARMUP_ITERS: 5000
  CHECKPOINT_PERIOD: 20

INPUT:
  MIN_SIZE_TRAIN_SAMPLING: choice
  MIN_SIZE_TRAIN: (640, 672, 704, 736, 768, 800)
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 1333
  FORMAT: "RGB"
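The timing seen later in the log may simply follow from `EVAL_PERIOD: 20` and `CHECKPOINT_PERIOD: 20` in the config above. Assuming this RegionCLIP-based fork keeps upstream detectron2's hook behaviour (an evaluation pass after every `EVAL_PERIOD` completed iterations, and periodic checkpoints named after the zero-based iteration index), the sketch below reproduces that schedule. The helper names `eval_points` and `checkpoint_names` are made up for illustration and are not part of detectron2 or devit.

```python
def eval_points(eval_period, max_iter, limit=5):
    """Completed-iteration counts after which an evaluation pass would start.

    Assumption: evaluation runs after every `eval_period` iterations and
    once more when training finishes at `max_iter`.
    """
    if eval_period <= 0:
        return [max_iter]  # periodic evaluation disabled, only the final pass
    points = list(range(eval_period, max_iter, eval_period))
    points.append(max_iter)
    return points[:limit]


def checkpoint_names(checkpoint_period, max_iter, limit=3):
    """File names of the first periodic checkpoints.

    Assumption: a checkpoint is written when (iteration + 1) is a multiple
    of the period, and the name uses the zero-based iteration index.
    """
    names = []
    for iteration in range(max_iter):
        if (iteration + 1) % checkpoint_period == 0:
            names.append(f"model_{iteration:07d}.pth")
            if len(names) == limit:
                break
    return names


if __name__ == "__main__":
    # Values taken from the config printed above.
    print(eval_points(20, 90000))       # [20, 40, 60, 80, 100]
    print(checkpoint_names(20, 90000))  # first name is model_0000019.pth
```

With the values from the config, the first checkpoint would be `model_0000019.pth` and the first evaluation would start right after iteration 20, so inference beginning a few seconds into training is probably expected rather than a loading error. Raising both periods should keep training from being interrupted this early.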

[07/28 11:38:28 detectron2]: Full config saved to D:/devit-main/devit-main/output/train/open-vocabulary/coco/vitl/config.yaml
[07/28 11:38:29 d2.utils.env]: Using a generated random seed 29077868
('coco_2017_ovd_all_test',)
[07/28 11:38:40 d2.data.datasets.coco]: Loading D:\devit-main\devit-main\datasets\coco\annotations\ovd_ins_train2017_b.json takes 8.64 seconds.
[07/28 11:38:40 d2.data.datasets.coco]: Loaded 107761 images in COCO format from D:\devit-main\devit-main\datasets\coco\annotations\ovd_ins_train2017_b.json
[07/28 11:38:43 d2.data.build]: Removed 0 images with no usable annotations. 107761 images left.
[07/28 11:38:43 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in training: [ResizeShortestEdge(short_edge_length=(640, 672, 704, 736, 768, 800), max_size=1333, sample_style='choice'), RandomFlip()]
[07/28 11:38:43 d2.data.build]: Using training sampler TrainingSampler
[07/28 11:38:43 d2.data.common]: Serializing 107761 elements to byte tensors and concatenating them all ...
[07/28 11:38:45 d2.data.common]: Serialized dataset takes 361.37 MiB
[07/28 11:38:45 fvcore.common.checkpoint]: [Checkpointer] Loading from D:/devit-main/devit-main/weights/initial/open-vocabulary/vitl+rpn.pth ...
WARNING [07/28 11:38:46 fvcore.common.checkpoint]: Some model parameters or buffers are not found in the checkpoint:
bg_cnn.class_proj.{bias, weight}
bg_cnn.main_layers.0.0.{bias, weight}
bg_cnn.main_layers.0.1.{bias, running_mean, running_var, weight}
bg_cnn.main_layers.1.0.{bias, weight}
bg_cnn.main_layers.1.1.{bias, running_mean, running_var, weight}
bg_cnn.main_layers.2.0.{bias, weight}
bg_cnn.main_layers.2.1.{bias, running_mean, running_var, weight}
bg_cnn.mask_layers.0.{bias, weight}
bg_cnn.mask_layers.1.{bias, weight}
bg_cnn.mask_layers.2.{bias, weight}
bg_tokens
fc_back_class.{bias, weight}
fc_bg_class.{bias, weight}
fc_intra_class.{bias, weight}
fc_other_class.{bias, weight}
per_cls_cnn.class_proj.{bias, weight}
per_cls_cnn.main_layers.0.0.{bias, weight}
per_cls_cnn.main_layers.0.1.{bias, running_mean, running_var, weight}
per_cls_cnn.main_layers.1.0.{bias, weight}
per_cls_cnn.main_layers.1.1.{bias, running_mean, running_var, weight}
per_cls_cnn.main_layers.2.0.{bias, weight}
per_cls_cnn.main_layers.2.1.{bias, running_mean, running_var, weight}
per_cls_cnn.mask_layers.0.{bias, weight}
per_cls_cnn.mask_layers.1.{bias, weight}
per_cls_cnn.mask_layers.2.{bias, weight}
r2c.{pool_h, pool_w, pos_x, pos_y}
reg_bg_dist_emb.{bias, weight}
reg_intra_dist_emb.{bias, weight}
rp1.0.{bias, weight}
rp1.1.{bias, running_mean, running_var, weight}
rp1_out.{bias, weight}
rp2.0.{bias, weight}
rp2.1.{bias, running_mean, running_var, weight}
rp2_out.{bias, weight}
rp3.0.{bias, weight}
rp3.1.{bias, running_mean, running_var, weight}
rp3_out.{bias, weight}
rp4.0.{bias, weight}
rp4.1.{bias, running_mean, running_var, weight}
rp4_out.{bias, weight}
rp5.0.{bias, weight}
rp5.1.{bias, running_mean, running_var, weight}
rp5_out.{bias, weight}
test_class_weight
train_class_weight
[07/28 11:38:46 d2.engine.train_loop]: Starting training from iteration 0
xFormers not available
xFormers not available
xFormers not available
xFormers not available
D:\devit-main\devit-main\tools..\detectron2\structures\boxes.py:158: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ..\torch\csrc\utils\tensor_new.cpp:233.)
  tensor = torch.as_tensor(tensor, dtype=torch.float32, device=device)
D:\devit-main\devit-main\tools..\detectron2\structures\boxes.py:158: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ..\torch\csrc\utils\tensor_new.cpp:233.)
  tensor = torch.as_tensor(tensor, dtype=torch.float32, device=device)
D:\devit-main\devit-main\tools..\detectron2\structures\boxes.py:158: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ..\torch\csrc\utils\tensor_new.cpp:233.)
  tensor = torch.as_tensor(tensor, dtype=torch.float32, device=device)
D:\devit-main\devit-main\tools..\detectron2\structures\boxes.py:158: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ..\torch\csrc\utils\tensor_new.cpp:233.)
  tensor = torch.as_tensor(tensor, dtype=torch.float32, device=device)
D:\anconda\envs\zhl310\lib\site-packages\torch\functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ..\aten\src\ATen\native\TensorShape.cpp:3191.)
  return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
[07/28 11:39:03 fvcore.common.checkpoint]: Saving checkpoint to D:/devit-main/devit-main/output/train/open-vocabulary/coco/vitl/model_0000019.pth
[07/28 11:39:12 d2.data.datasets.coco]: Loading D:\devit-main\devit-main\datasets\coco\annotations\ovd_ins_val2017_all.json takes 8.30 seconds.
[07/28 11:39:13 d2.data.datasets.coco]: Loaded 107761 images in COCO format from D:\devit-main\devit-main\datasets\coco\annotations\ovd_ins_val2017_all.json
[07/28 11:39:16 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1333, sample_style='choice')]
[07/28 11:39:16 d2.data.common]: Serializing 107761 elements to byte tensors and concatenating them all ...
[07/28 11:39:17 d2.data.common]: Serialized dataset takes 361.16 MiB
[07/28 11:39:26 d2.evaluation.evaluator]: Start inference on 107761 batches
xFormers not available
xFormers not available
xFormers not available
xFormers not available
D:\devit-main\devit-main\tools..\detectron2\structures\boxes.py:158: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ..\torch\csrc\utils\tensor_new.cpp:233.)
  tensor = torch.as_tensor(tensor, dtype=torch.float32, device=device)
D:\devit-main\devit-main\tools..\detectron2\structures\boxes.py:158: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ..\torch\csrc\utils\tensor_new.cpp:233.)
  tensor = torch.as_tensor(tensor, dtype=torch.float32, device=device)
D:\devit-main\devit-main\tools..\detectron2\structures\boxes.py:158: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ..\torch\csrc\utils\tensor_new.cpp:233.)
  tensor = torch.as_tensor(tensor, dtype=torch.float32, device=device)
D:\devit-main\devit-main\tools..\detectron2\structures\boxes.py:158: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ..\torch\csrc\utils\tensor_new.cpp:233.)
  tensor = torch.as_tensor(tensor, dtype=torch.float32, device=device)
[07/28 11:39:48 d2.evaluation.evaluator]: Inference done 11/107761. Dataloading: 0.0004 s / iter. Inference: 1.2682 s / iter. Eval: 0.0028 s / iter. Total: 1.2714 s / iter. ETA=1 day, 14:03:13
[07/28 11:39:53 d2.evaluation.evaluator]: Inference done 16/107761. Dataloading: 0.0004 s / iter. Inference: 1.2074 s / iter. Eval: 0.0016 s / iter. Total: 1.2095 s / iter. ETA=1 day, 12:11:55
[07/28 11:39:59 d2.evaluation.evaluator]: Inference done 22/107761. Dataloading: 0.0005 s / iter. Inference: 1.1208 s / iter. Eval: 0.0011 s / iter. Total: 1.1225 s / iter. ETA=1 day, 9:35:36
[07/28 11:40:06 d2.evaluation.evaluator]: Inference done 29/107761. Dataloading: 0.0005 s / iter. Inference: 1.0598 s / iter. Eval: 0.0009 s / iter. Total: 1.0612 s / iter. ETA=1 day, 7:45:21
[07/28 11:40:11 d2.evaluation.evaluator]: Inference done 35/107761. Dataloading: 0.0005 s / iter. Inference: 1.0298 s / iter. Eval: 0.0007 s / iter. Total: 1.0311 s / iter. ETA=1 day, 6:51:16
[07/28 11:40:17 d2.evaluation.evaluator]: Inference done 41/107761. Dataloading: 0.0005 s / iter. Inference: 1.0333 s / iter. Eval: 0.0006 s / iter. Total: 1.0345 s / iter. ETA=1 day, 6:57:17
[07/28 11:40:23 d2.evaluation.evaluator]: Inference done 47/107761. Dataloading: 0.0005 s / iter. Inference: 1.0315 s / iter. Eval: 0.0006 s / iter. Total: 1.0327 s / iter. ETA=1 day, 6:53:52
[07/28 11:40:28 d2.evaluation.evaluator]: Inference done 53/107761. Dataloading: 0.0005 s / iter. Inference: 1.0068 s / iter. Eval: 0.0005 s / iter. Total: 1.0079 s / iter. ETA=1 day, 6:09:19
[07/28 11:40:34 d2.evaluation.evaluator]: Inference done 58/107761. Dataloading: 0.0005 s / iter. Inference: 1.0218 s / iter. Eval: 0.0005 s / iter. Total: 1.0229 s / iter. ETA=1 day, 6:36:08
”
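Since the log reports 107761 images for both annotation files, one thing worth checking is whether `ovd_ins_val2017_all.json` really contains validation images. The sketch below uses only the standard library and assumes both files follow the usual COCO layout, with a top-level `images` list whose entries carry `id` and `file_name`. The function names are illustrative and not part of devit or detectron2.

```python
import json
import os


def load_images(path):
    """Read a COCO-style annotation file and return its `images` list."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f).get("images", [])


def compare_image_lists(train_images, val_images):
    """Summarise how much two COCO `images` lists overlap."""
    train_ids = {img["id"] for img in train_images}
    val_ids = {img["id"] for img in val_images}
    return {
        "train_count": len(train_images),
        "val_count": len(val_images),
        "shared_ids": len(train_ids & val_ids),
        # A few file names help to see which image folder the entries refer to.
        "val_sample_files": [img["file_name"] for img in val_images[:3]],
    }


if __name__ == "__main__":
    # Paths as they appear in the log above; adjust if the layout differs.
    root = r"D:\devit-main\devit-main\datasets\coco\annotations"
    train_path = os.path.join(root, "ovd_ins_train2017_b.json")
    val_path = os.path.join(root, "ovd_ins_val2017_all.json")
    if os.path.exists(train_path) and os.path.exists(val_path):
        print(compare_image_lists(load_images(train_path), load_images(val_path)))
    else:
        print("Annotation files not found; adjust the paths first.")
```

COCO val2017 contains 5,000 images, so a validation count of 107761 or a large `shared_ids` value would suggest that the validation annotation file is a copy of the training annotations, rather than pointing to a problem in the loading code itself.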

mlzxy / devit

error in loading coco train2017 and val2017 #58