Can you provide the code for converting your dataset to COCO format?

2000YWQ commented 5 months ago

I am having some problems converting the SAR-AIRcraft 1.0 dataset format to COCO format. Thank you very much.

JoyeZLearning commented 5 months ago

You can share the problems you ran into when converting to COCO format, and I am willing to help. As for the code, there are many resources available online that can help you with this.

2000YWQ commented 5 months ago

The processed data has the same format for the COCO dataset and the Pascal VOC dataset in detectron2/data/datasets/coco.py and detectron2/data/datasets/pascal_voc.py, so I created a symbolic link (soft link) to VOC2007. My YAML is:

DATASETS:
  TRAIN: ("voc_2007_train",)
  TEST: ("voc_2007_val",)

I also modified detectron2/data/datasets/pascal_voc.py:

CLASS_NAMES = ( "A220", "A320/321", "A330", "ARJ21", "Boeing737", "Boeing787", "other" )

Finally, it automatically generates the voc_2007_val_coco_format.json file, but my results are bad.
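For readers who want a standalone conversion script, the following is a minimal sketch, not the repository's own code. It assumes the SAR-AIRcraft 1.0 annotations are Pascal VOC style XML files (with `filename`, `size`, and `object/bndbox` elements) and reuses the class list from the post above. The function names `voc_xml_to_coco` and `convert_dir` are hypothetical, and box coordinates are copied without any 0-based or 1-based offset correction.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path

# Class list copied from the post above; its order defines the category ids.
CLASS_NAMES = ("A220", "A320/321", "A330", "ARJ21", "Boeing737", "Boeing787", "other")


def voc_xml_to_coco(xml_texts, class_names=CLASS_NAMES):
    """Build a COCO-style dict from an iterable of Pascal VOC XML strings."""
    # Category ids start at 1 here, a common COCO convention (an assumption).
    cat_ids = {name: i + 1 for i, name in enumerate(class_names)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": i, "name": n} for n, i in cat_ids.items()],
    }
    ann_id = 1
    for image_id, text in enumerate(xml_texts, start=1):
        root = ET.fromstring(text)
        size = root.find("size")
        coco["images"].append({
            "id": image_id,
            "file_name": root.findtext("filename"),
            "width": int(size.findtext("width")),
            "height": int(size.findtext("height")),
        })
        for obj in root.iter("object"):
            name = obj.findtext("name").strip()
            if name not in cat_ids:
                raise ValueError(f"unknown class {name!r}")
            box = obj.find("bndbox")
            xmin, ymin, xmax, ymax = (
                float(box.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax")
            )
            w, h = xmax - xmin, ymax - ymin
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": cat_ids[name],
                "bbox": [xmin, ymin, w, h],  # COCO boxes are [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco


def convert_dir(xml_dir, out_json):
    """Convert every *.xml file in xml_dir and write one COCO JSON file."""
    paths = sorted(Path(xml_dir).glob("*.xml"))
    texts = [p.read_text(encoding="utf-8") for p in paths]
    Path(out_json).write_text(json.dumps(voc_xml_to_coco(texts)), encoding="utf-8")
```

Calling `convert_dir("VOC2007/Annotations", "val_coco.json")` (both paths are placeholders) would write a single JSON file. Note that this sketch converts every XML file in the directory and does not read the VOC train/val split lists.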

Here is my log:

`[04/10 06:18:32] detectron2 INFO: Full config saved to ./output/config.yaml
[04/10 06:18:36] d2.checkpoint.detection_checkpoint INFO: [DetectionCheckpointer] Loading from detectron2://ImageNetPretrained/torchvision/R-50.pkl ...
[04/10 06:18:36] fvcore.common.checkpoint INFO: [Checkpointer] Loading from /root/.torch/iopath_cache/detectron2/ImageNetPretrained/torchvision/R-50.pkl ...
[04/10 06:18:36] fvcore.common.checkpoint INFO: Reading a file from 'torchvision'
[04/10 06:18:36] d2.checkpoint.c2_model_loading INFO: Following weights matched with submodule backbone.bottom_up - Total num: 53
[04/10 06:18:36] fvcore.common.checkpoint WARNING: Some model parameters or buffers are not found in the checkpoint:
alphas_cumprod
alphas_cumprod_prev
backbone.fpn_lateral2.{bias, weight}
backbone.fpn_lateral3.{bias, weight}
backbone.fpn_lateral4.{bias, weight}
backbone.fpn_lateral5.{bias, weight}
backbone.fpn_output2.{bias, weight}
backbone.fpn_output3.{bias, weight}
backbone.fpn_output4.{bias, weight}
backbone.fpn_output5.{bias, weight}
betas
diff_conv5.weight
head.head_series.0.bboxes_delta.{bias, weight}
head.head_series.0.block_time_mlp.1.{bias, weight}
head.head_series.0.class_logits.{bias, weight}
head.head_series.0.cls_module.0.weight
head.head_series.0.cls_module.1.{bias, weight}
head.head_series.0.inst_interact.dynamic_layer.{bias, weight}
head.head_series.0.inst_interact.norm1.{bias, weight}
head.head_series.0.inst_interact.norm2.{bias, weight}
head.head_series.0.inst_interact.norm3.{bias, weight}
head.head_series.0.inst_interact.out_layer.{bias, weight}
head.head_series.0.linear1.{bias, weight}
head.head_series.0.linear2.{bias, weight}
head.head_series.0.norm1.{bias, weight}
head.head_series.0.norm2.{bias, weight}
head.head_series.0.norm3.{bias, weight}
head.head_series.0.reg_module.0.weight
head.head_series.0.reg_module.1.{bias, weight}
head.head_series.0.reg_module.3.weight
head.head_series.0.reg_module.4.{bias, weight}
head.head_series.0.reg_module.6.weight
head.head_series.0.reg_module.7.{bias, weight}
head.head_series.0.self_attn.out_proj.{bias, weight}
head.head_series.0.self_attn.{in_proj_bias, in_proj_weight}
head.head_series.1.bboxes_delta.{bias, weight}
head.head_series.1.block_time_mlp.1.{bias, weight}
head.head_series.1.class_logits.{bias, weight}
head.head_series.1.cls_module.0.weight
head.head_series.1.cls_module.1.{bias, weight}
head.head_series.1.inst_interact.dynamic_layer.{bias, weight}
head.head_series.1.inst_interact.norm1.{bias, weight}
head.head_series.1.inst_interact.norm2.{bias, weight}
head.head_series.1.inst_interact.norm3.{bias, weight}
head.head_series.1.inst_interact.out_layer.{bias, weight}
head.head_series.1.linear1.{bias, weight}
head.head_series.1.linear2.{bias, weight}
head.head_series.1.norm1.{bias, weight}
head.head_series.1.norm2.{bias, weight}
head.head_series.1.norm3.{bias, weight}
head.head_series.1.reg_module.0.weight
head.head_series.1.reg_module.1.{bias, weight}
head.head_series.1.reg_module.3.weight
head.head_series.1.reg_module.4.{bias, weight}
head.head_series.1.reg_module.6.weight
head.head_series.1.reg_module.7.{bias, weight}
head.head_series.1.self_attn.out_proj.{bias, weight}
head.head_series.1.self_attn.{in_proj_bias, in_proj_weight}
head.head_series.2.bboxes_delta.{bias, weight}
head.head_series.2.block_time_mlp.1.{bias, weight}
head.head_series.2.class_logits.{bias, weight}
head.head_series.2.cls_module.0.weight
head.head_series.2.cls_module.1.{bias, weight}
head.head_series.2.inst_interact.dynamic_layer.{bias, weight}
head.head_series.2.inst_interact.norm1.{bias, weight}
head.head_series.2.inst_interact.norm2.{bias, weight}
head.head_series.2.inst_interact.norm3.{bias, weight}
head.head_series.2.inst_interact.out_layer.{bias, weight}
head.head_series.2.linear1.{bias, weight}
head.head_series.2.linear2.{bias, weight}
head.head_series.2.norm1.{bias, weight}
head.head_series.2.norm2.{bias, weight}
head.head_series.2.norm3.{bias, weight}
head.head_series.2.reg_module.0.weight
head.head_series.2.reg_module.1.{bias, weight}
head.head_series.2.reg_module.3.weight
head.head_series.2.reg_module.4.{bias, weight}
head.head_series.2.reg_module.6.weight
head.head_series.2.reg_module.7.{bias, weight}
head.head_series.2.self_attn.out_proj.{bias, weight}
head.head_series.2.self_attn.{in_proj_bias, in_proj_weight}
head.head_series.3.bboxes_delta.{bias, weight}
head.head_series.3.block_time_mlp.1.{bias, weight}
head.head_series.3.class_logits.{bias, weight}
head.head_series.3.cls_module.0.weight
head.head_series.3.cls_module.1.{bias, weight}
head.head_series.3.inst_interact.dynamic_layer.{bias, weight}
head.head_series.3.inst_interact.norm1.{bias, weight}
head.head_series.3.inst_interact.norm2.{bias, weight}
head.head_series.3.inst_interact.norm3.{bias, weight}
head.head_series.3.inst_interact.out_layer.{bias, weight}
head.head_series.3.linear1.{bias, weight}
head.head_series.3.linear2.{bias, weight}
head.head_series.3.norm1.{bias, weight}
head.head_series.3.norm2.{bias, weight}
head.head_series.3.norm3.{bias, weight}
head.head_series.3.reg_module.0.weight
head.head_series.3.reg_module.1.{bias, weight}
head.head_series.3.reg_module.3.weight
head.head_series.3.reg_module.4.{bias, weight}
head.head_series.3.reg_module.6.weight
head.head_series.3.reg_module.7.{bias, weight}
head.head_series.3.self_attn.out_proj.{bias, weight}
head.head_series.3.self_attn.{in_proj_bias, in_proj_weight}
head.head_series.4.bboxes_delta.{bias, weight}
head.head_series.4.block_time_mlp.1.{bias, weight}
head.head_series.4.class_logits.{bias, weight}
head.head_series.4.cls_module.0.weight
head.head_series.4.cls_module.1.{bias, weight}
head.head_series.4.inst_interact.dynamic_layer.{bias, weight}
head.head_series.4.inst_interact.norm1.{bias, weight}
head.head_series.4.inst_interact.norm2.{bias, weight}
head.head_series.4.inst_interact.norm3.{bias, weight}
head.head_series.4.inst_interact.out_layer.{bias, weight}
head.head_series.4.linear1.{bias, weight}
head.head_series.4.linear2.{bias, weight}
head.head_series.4.norm1.{bias, weight}
head.head_series.4.norm2.{bias, weight}
head.head_series.4.norm3.{bias, weight}
head.head_series.4.reg_module.0.weight
head.head_series.4.reg_module.1.{bias, weight}
head.head_series.4.reg_module.3.weight
head.head_series.4.reg_module.4.{bias, weight}
head.head_series.4.reg_module.6.weight
head.head_series.4.reg_module.7.{bias, weight}
head.head_series.4.self_attn.out_proj.{bias, weight}
head.head_series.4.self_attn.{in_proj_bias, in_proj_weight}
head.head_series.5.bboxes_delta.{bias, weight}
head.head_series.5.block_time_mlp.1.{bias, weight}
head.head_series.5.class_logits.{bias, weight}
head.head_series.5.cls_module.0.weight
head.head_series.5.cls_module.1.{bias, weight}
head.head_series.5.inst_interact.dynamic_layer.{bias, weight}
head.head_series.5.inst_interact.norm1.{bias, weight}
head.head_series.5.inst_interact.norm2.{bias, weight}
head.head_series.5.inst_interact.norm3.{bias, weight}
head.head_series.5.inst_interact.out_layer.{bias, weight}
head.head_series.5.linear1.{bias, weight}
head.head_series.5.linear2.{bias, weight}
head.head_series.5.norm1.{bias, weight}
head.head_series.5.norm2.{bias, weight}
head.head_series.5.norm3.{bias, weight}
head.head_series.5.reg_module.0.weight
head.head_series.5.reg_module.1.{bias, weight}
head.head_series.5.reg_module.3.weight
head.head_series.5.reg_module.4.{bias, weight}
head.head_series.5.reg_module.6.weight
head.head_series.5.reg_module.7.{bias, weight}
head.head_series.5.self_attn.out_proj.{bias, weight}
head.head_series.5.self_attn.{in_proj_bias, in_proj_weight}
head.time_mlp.1.{bias, weight}
head.time_mlp.3.{bias, weight}
log_one_minus_alphas_cumprod
posterior_log_variance_clipped
posterior_mean_coef1
posterior_mean_coef2
posterior_variance
sqrt_alphas_cumprod
sqrt_one_minus_alphas_cumprod
sqrt_recip_alphas_cumprod
sqrt_recipm1_alphas_cumprod
[04/10 06:18:36] fvcore.common.checkpoint WARNING: The checkpoint state_dict contains keys that are not used by the model:
stem.fc.{bias, weight}
[04/10 06:18:36] d2.data.build INFO: Distribution of instances among all 7 categories:

| category | #instances | category | #instances | category | #instances |
| --- | --- | --- | --- | --- | --- |
| A220 | 247 | A320/321 | 82 | A330 | 27 |
| ARJ21 | 378 | Boeing737 | 252 | Boeing787 | 261 |
| other | 715 | | | | |
| total | 1962 | | | | |

[04/10 06:18:36] d2.data.dataset_mapper INFO: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1800, sample_style='choice')]
[04/10 06:18:36] d2.data.common INFO: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[04/10 06:18:36] d2.data.common INFO: Serializing 442 elements to byte tensors and concatenating them all ...
[04/10 06:18:36] d2.data.common INFO: Serialized dataset takes 0.23 MiB
[04/10 06:18:36] d2.evaluation.coco_evaluation INFO: Fast COCO eval is not built. Falling back to official COCO eval.
[04/10 06:18:36] d2.evaluation.coco_evaluation WARNING: COCO Evaluator instantiated using config, this is deprecated behavior. Please pass in explicit arguments instead.
[04/10 06:18:36] d2.evaluation.coco_evaluation INFO: Trying to convert 'voc_2007_val' to COCO format ...
[04/10 06:18:36] d2.data.datasets.coco WARNING: Using previously cached COCO format annotations at './output/inference/voc_2007_val_coco_format.json'. You need to clear the cache file if your dataset has been modified.
[04/10 06:18:36] d2.evaluation.evaluator INFO: Start inference on 442 batches
[04/10 06:18:37] d2.evaluation.evaluator INFO: Inference done 11/442. Dataloading: 0.0007 s/iter. Inference: 0.0795 s/iter. Eval: 0.0003 s/iter. Total: 0.0805 s/iter. ETA=0:00:34
[04/10 06:18:42] d2.evaluation.evaluator INFO: Inference done 73/442. Dataloading: 0.0011 s/iter. Inference: 0.0792 s/iter. Eval: 0.0003 s/iter. Total: 0.0807 s/iter. ETA=0:00:29
[04/10 06:18:47] d2.evaluation.evaluator INFO: Inference done 133/442. Dataloading: 0.0011 s/iter. Inference: 0.0804 s/iter. Eval: 0.0003 s/iter. Total: 0.0820 s/iter. ETA=0:00:25
[04/10 06:18:53] d2.evaluation.evaluator INFO: Inference done 196/442. Dataloading: 0.0012 s/iter. Inference: 0.0799 s/iter. Eval: 0.0003 s/iter. Total: 0.0815 s/iter. ETA=0:00:20
[04/10 06:18:58] d2.evaluation.evaluator INFO: Inference done 258/442. Dataloading: 0.0012 s/iter. Inference: 0.0798 s/iter. Eval: 0.0003 s/iter. Total: 0.0814 s/iter. ETA=0:00:14
[04/10 06:19:03] d2.evaluation.evaluator INFO: Inference done 321/442. Dataloading: 0.0012 s/iter. Inference: 0.0796 s/iter. Eval: 0.0003 s/iter. Total: 0.0812 s/iter. ETA=0:00:09
[04/10 06:19:08] d2.evaluation.evaluator INFO: Inference done 384/442. Dataloading: 0.0012 s/iter. Inference: 0.0795 s/iter. Eval: 0.0003 s/iter. Total: 0.0811 s/iter. ETA=0:00:04
[04/10 06:19:12] d2.evaluation.evaluator INFO: Total inference time: 0:00:35.438219 (0.081094 s / iter per device, on 1 devices)
[04/10 06:19:12] d2.evaluation.evaluator INFO: Total inference pure compute time: 0:00:34 (0.079390 s / iter per device, on 1 devices)
[04/10 06:19:13] d2.evaluation.coco_evaluation INFO: Preparing results for COCO format ...
[04/10 06:19:13] d2.evaluation.coco_evaluation INFO: Saving results to ./output/inference/coco_eval_instances_results.json
[04/10 06:19:13] d2.evaluation.coco_evaluation INFO: Evaluating predictions with official COCO API...
[04/10 06:19:17] d2.evaluation.coco_evaluation INFO: Evaluation results for bbox:

| AP | AP50 | AP75 | APs | APm | APl |
| --- | --- | --- | --- | --- | --- |
| 0.000 | 0.000 | 0.000 | nan | 0.000 | 0.000 |

[04/10 06:19:17] d2.evaluation.coco_evaluation INFO: Some metrics cannot be computed and is shown as NaN.
[04/10 06:19:17] d2.evaluation.coco_evaluation INFO: Per-category bbox AP:

| category | AP | category | AP | category | AP |
| --- | --- | --- | --- | --- | --- |
| A220 | | A320/321 | 0.000 | A330 | 0.000 |
| ARJ21 | | Boeing737 | 0.000 | Boeing787 | 0.000 |
| other | | | | | |

[04/10 06:19:17] d2.engine.defaults INFO: Evaluation results for voc_2007_val in csv format:
[04/10 06:19:17] d2.evaluation.testing INFO: copypaste: Task: bbox
[04/10 06:19:17] d2.evaluation.testing INFO: copypaste: AP,AP50,AP75,APs,APm,APl
[04/10 06:19:17] d2.evaluation.testing INFO: copypaste: 0.0000,0.0000,0.0000,nan,0.0000,0.0000
[04/10 06:22:43] detectron2 INFO: Rank of current process: 0. World size: 1
[04/10 06:22:44] detectron2 INFO: Environment info:

sys.platform                     linux
Python                           3.9.19 (main, Mar 21 2024, 17:11:28) [GCC 11.2.0]
numpy                            1.26.4
detectron2                       0.6 @/workspace/DiffDet4SAR/detectron2
detectron2._C                    not built correctly: No module named 'detectron2._C'
Compiler ($CXX)                  c++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
DETECTRON2_ENV_MODULE
PyTorch                          1.11.0 @/opt/conda/envs/diffusion/lib/python3.9/site-packages/torch
PyTorch debug build              False
torch._C._GLIBCXX_USE_CXX11_ABI  False
GPU available                    Yes
GPU 0                            NVIDIA TITAN Xp (arch=6.1)
Driver version                   470.182.03
CUDA_HOME                        None - invalid!
Pillow                           10.2.0
torchvision                      0.12.0 @/opt/conda/envs/diffusion/lib/python3.9/site-packages/torchvision
torchvision arch flags           /opt/conda/envs/diffusion/lib/python3.9/site-packages/torchvision/_C.so
fvcore                           0.1.6
iopath                           0.1.9
cv2                              4.9.0

PyTorch built with:

GCC 7.3
C++ Version: 201402
Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX2
CUDA Runtime 11.3
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
CuDNN 8.2
Magma 2.5.2
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

[04/10 06:22:44] detectron2 INFO: Command line arguments: Namespace(config_file='configs/diffdet.voc.res50.yaml', resume=False, eval_only=True, num_gpus=1, num_machines=1, machine_rank=0, dist_url='tcp://127.0.0.1:49152', opts=[])
[04/10 06:22:44] detectron2 INFO: Contents of args.config_file=configs/diffdet.voc.res50.yaml:
_BASE_: "Base-DiffusionDet.yaml"
MODEL:
  WEIGHTS: "detectron2://ImageNetPretrained/torchvision/R-50.pkl"
  RESNETS:
    DEPTH: 50
    STRIDE_IN_1X1: False
  DiffusionDet:
    NUM_PROPOSALS: 500
    NUM_CLASSES: 7
DATASETS:
  TRAIN: ("voc_2007_train",)
  TEST: ("voc_2007_val",)
SOLVER:
  STEPS: (350000, 420000)
  MAX_ITER: 450000
INPUT:
  CROP:
    ENABLED: True
  FORMAT: "RGB"

[04/10 06:22:44] detectron2 INFO: Running with full config:
CUDNN_BENCHMARK: false
DATALOADER:
  ASPECT_RATIO_GROUPING: true
  FILTER_EMPTY_ANNOTATIONS: false
  NUM_WORKERS: 4
  REPEAT_THRESHOLD: 0.0
  SAMPLER_TRAIN: TrainingSampler
DATASETS:
  PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
  PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
  PROPOSAL_FILES_TEST: []
  PROPOSAL_FILES_TRAIN: []
  TEST:
  - voc_2007_val
  TRAIN:
  - voc_2007_train
GLOBAL:
  HACK: 1.0
INPUT:
  CROP:
    ENABLED: true
    SIZE:
    - 0.9
    - 0.9
    TYPE: absolute_range
  FORMAT: RGB
  MASK_FORMAT: polygon
  MAX_SIZE_TEST: 1800
  MAX_SIZE_TRAIN: 1800
  MIN_SIZE_TEST: 800
  MIN_SIZE_TRAIN:
  - 800
  - 1000
  - 1200
  - 1500
  MIN_SIZE_TRAIN_SAMPLING: choice
  RANDOM_FLIP: horizontal
MODEL:
  ANCHOR_GENERATOR:
    ANGLES:
    - - -90
      - 0
      - 90
    ASPECT_RATIOS:
    - - 0.5
      - 1.0
      - 2.0
    NAME: DefaultAnchorGenerator
    OFFSET: 0.0
    SIZES:
    - - 32
      - 64
      - 128
      - 256
      - 512
  BACKBONE:
    FREEZE_AT: 2
    NAME: build_resnet_fpn_backbone
  DEVICE: cuda
  DiffusionDet:
    ACTIVATION: relu
    ALPHA: 0.25
    CLASS_WEIGHT: 2.0
    DEEP_SUPERVISION: true
    DIM_DYNAMIC: 64
    DIM_FEEDFORWARD: 2048
    DROPOUT: 0.0
    GAMMA: 2.0
    GIOU_WEIGHT: 2.0
    HIDDEN_DIM: 256
    L1_WEIGHT: 5.0
    NHEADS: 8
    NO_OBJECT_WEIGHT: 0.1
    NUM_CLASSES: 7
    NUM_CLS: 1
    NUM_DYNAMIC: 2
    NUM_HEADS: 6
    NUM_PROPOSALS: 500
    NUM_REG: 3
    OTA_K: 5
    PRIOR_PROB: 0.01
    SAMPLE_STEP: 1
    SNR_SCALE: 2.0
    USE_FED_LOSS: false
    USE_FOCAL: true
    USE_NMS: true
  FPN:
    FUSE_TYPE: sum
    IN_FEATURES:
    - res2
    - res3
    - res4
    - res5
    NORM: ''
    OUT_CHANNELS: 256
  KEYPOINT_ON: false
  LOAD_PROPOSALS: false
  MASK_ON: false
  META_ARCHITECTURE: DiffusionDet
  PANOPTIC_FPN:
    COMBINE:
      ENABLED: true
      INSTANCES_CONFIDENCE_THRESH: 0.5
      OVERLAP_THRESH: 0.5
      STUFF_AREA_LIMIT: 4096
    INSTANCE_LOSS_WEIGHT: 1.0
  PIXEL_MEAN:
  - 23.7354
  - 23.7354
  - 23.7354
  PIXEL_STD:
  - 27.256
  - 27.256
  - 27.256
  PROPOSAL_GENERATOR:
    MIN_SIZE: 0
    NAME: RPN
  RESNETS:
    DEFORM_MODULATED: false
    DEFORM_NUM_GROUPS: 1
    DEFORM_ON_PER_STAGE:
    - false
    - false
    - false
    - false
    DEPTH: 50
    NORM: FrozenBN
    NUM_GROUPS: 1
    OUT_FEATURES:
    - res2
    - res3
    - res4
    - res5
    RES2_OUT_CHANNELS: 256
    RES5_DILATION: 1
    STEM_OUT_CHANNELS: 64
    STRIDE_IN_1X1: false
    WIDTH_PER_GROUP: 64
  RETINANET:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_WEIGHTS: &id002
    - 1.0
    - 1.0
    - 1.0
    - 1.0
    FOCAL_LOSS_ALPHA: 0.25
    FOCAL_LOSS_GAMMA: 2.0
    IN_FEATURES:
    - p3
    - p4
    - p5
    - p6
    - p7
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.4
    - 0.5
    NMS_THRESH_TEST: 0.5
    NORM: ''
    NUM_CLASSES: 7
    NUM_CONVS: 4
    PRIOR_PROB: 0.01
    SCORE_THRESH_TEST: 0.05
    SMOOTH_L1_LOSS_BETA: 0.1
    TOPK_CANDIDATES_TEST: 1000
  ROI_BOX_CASCADE_HEAD:
    BBOX_REG_WEIGHTS:
    - &id001
      - 10.0
      - 10.0
      - 5.0
      - 5.0
    - - 20.0
      - 20.0
      - 10.0
      - 10.0
    - - 30.0
      - 30.0
      - 15.0
      - 15.0
    IOUS:
    - 0.5
    - 0.6
    - 0.7
  ROI_BOX_HEAD:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS: *id001
    CLS_AGNOSTIC_BBOX_REG: false
    CONV_DIM: 256
    FC_DIM: 1024
    FED_LOSS_FREQ_WEIGHT_POWER: 0.5
    FED_LOSS_NUM_CLASSES: 50
    NAME: ''
    NORM: ''
    NUM_CONV: 0
    NUM_FC: 0
    POOLER_RESOLUTION: 7
    POOLER_SAMPLING_RATIO: 2
    POOLER_TYPE: ROIAlignV2
    SMOOTH_L1_BETA: 0.0
    TRAIN_ON_PRED_BOXES: false
    USE_FED_LOSS: false
    USE_SIGMOID_CE: false
  ROI_HEADS:
    BATCH_SIZE_PER_IMAGE: 512
    IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    IOU_LABELS:
    - 0
    - 1
    IOU_THRESHOLDS:
    - 0.5
    NAME: Res5ROIHeads
    NMS_THRESH_TEST: 0.5
    NUM_CLASSES: 7
    POSITIVE_FRACTION: 0.25
    PROPOSAL_APPEND_GT: true
    SCORE_THRESH_TEST: 0.05
  ROI_KEYPOINT_HEAD:
    CONV_DIMS:
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    LOSS_WEIGHT: 1.0
    MIN_KEYPOINTS_PER_IMAGE: 1
    NAME: KRCNNConvDeconvUpsampleHead
    NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: true
    NUM_KEYPOINTS: 17
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  ROI_MASK_HEAD:
    CLS_AGNOSTIC_MASK: false
    CONV_DIM: 256
    NAME: MaskRCNNConvUpsampleHead
    NORM: ''
    NUM_CONV: 0
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  RPN:
    BATCH_SIZE_PER_IMAGE: 256
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS: *id002
    BOUNDARY_THRESH: -1
    CONV_DIMS:
    - -1
    HEAD_NAME: StandardRPNHead
    IN_FEATURES:
    - res4
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.3
    - 0.7
    LOSS_WEIGHT: 1.0
    NMS_THRESH: 0.7
    POSITIVE_FRACTION: 0.5
    POST_NMS_TOPK_TEST: 1000
    POST_NMS_TOPK_TRAIN: 2000
    PRE_NMS_TOPK_TEST: 6000
    PRE_NMS_TOPK_TRAIN: 12000
    SMOOTH_L1_BETA: 0.0
  SEM_SEG_HEAD:
    COMMON_STRIDE: 4
    CONVS_DIM: 128
    IGNORE_VALUE: 255
    IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    LOSS_WEIGHT: 1.0
    NAME: SemSegFPNHead
    NORM: GN
    NUM_CLASSES: 54
  SWIN:
    OUT_FEATURES:
    - 0
    - 1
    - 2
    - 3
    SIZE: B
    USE_CHECKPOINT: false
  WEIGHTS: detectron2://ImageNetPretrained/torchvision/R-50.pkl
MODEL_EMA:
  DECAY: 0.999
  DEVICE: ''
  ENABLED: false
  USE_EMA_WEIGHTS_FOR_EVAL_ONLY: false
  YOLOX: false
OUTPUT_DIR: ./output
SEED: 40244023
SOLVER:
  AMP:
    ENABLED: false
  BACKBONE_MULTIPLIER: 1.0
  BASE_LR: 5.0e-06
  BASE_LR_END: 0.0
  BIAS_LR_FACTOR: 1.0
  CHECKPOINT_PERIOD: 5000
  CLIP_GRADIENTS:
    CLIP_TYPE: full_model
    CLIP_VALUE: 1.0
    ENABLED: true
    NORM_TYPE: 2.0
  GAMMA: 0.1
  IMS_PER_BATCH: 4
  LR_SCHEDULER_NAME: WarmupMultiStepLR
  MAX_ITER: 450000
  MOMENTUM: 0.9
  NESTEROV: false
  NUM_DECAYS: 3
  OPTIMIZER: ADAMW
  REFERENCE_WORLD_SIZE: 0
  RESCALE_INTERVAL: false
  STEPS:
  - 350000
  - 420000
  WARMUP_FACTOR: 0.01
  WARMUP_ITERS: 1000
  WARMUP_METHOD: linear
  WEIGHT_DECAY: 0.0001
  WEIGHT_DECAY_BIAS: null
  WEIGHT_DECAY_NORM: 0.0
TEST:
  AUG:
    CVPODS_TTA: true
    ENABLED: false
    FLIP: true
    MAX_SIZE: 4000
    MIN_SIZES:
    - 400
    - 500
    - 600
    - 640
    - 700
    - 900
    - 1000
    - 1100
    - 1200
    - 1300
    - 1400
    - 1500
    - 1800
    - 800
    SCALE_FILTER: true
    SCALE_RANGES:
    - - 96
      - 10000
    - - 96
      - 10000
    - - 64
      - 10000
    - - 64
      - 10000
    - - 64
      - 10000
    - - 0
      - 10000
    - - 0
      - 10000
    - - 0
      - 256
    - - 0
      - 256
    - - 0
      - 192
    - - 0
      - 192
    - - 0
      - 96
    - - 0
      - 10000
  DETECTIONS_PER_IMAGE: 100
  EVAL_PERIOD: 3000
  EXPECTED_RESULTS: []
  KEYPOINT_OKS_SIGMAS: []
  PRECISE_BN:
    ENABLED: false
    NUM_ITER: 200
VERSION: 2
VIS_PERIOD: 0

[04/10 06:22:44] detectron2 INFO: Full config saved to ./output/config.yaml
[04/10 06:22:48] d2.checkpoint.detection_checkpoint INFO: [DetectionCheckpointer] Loading from detectron2://ImageNetPretrained/torchvision/R-50.pkl ...
[04/10 06:22:48] fvcore.common.checkpoint INFO: [Checkpointer] Loading from /root/.torch/iopath_cache/detectron2/ImageNetPretrained/torchvision/R-50.pkl ...
[04/10 06:22:48] fvcore.common.checkpoint INFO: Reading a file from 'torchvision'
[04/10 06:22:48] d2.checkpoint.c2_model_loading INFO: Following weights matched with submodule backbone.bottom_up - Total num: 53
[04/10 06:22:48] fvcore.common.checkpoint WARNING: Some model parameters or buffers are not found in the checkpoint:
alphas_cumprod
alphas_cumprod_prev
backbone.fpn_lateral2.{bias, weight}
backbone.fpn_lateral3.{bias, weight}
backbone.fpn_lateral4.{bias, weight}
backbone.fpn_lateral5.{bias, weight}
backbone.fpn_output2.{bias, weight}
backbone.fpn_output3.{bias, weight}
backbone.fpn_output4.{bias, weight}
backbone.fpn_output5.{bias, weight}
betas
diff_conv5.weight
head.head_series.0.bboxes_delta.{bias, weight}
head.head_series.0.block_time_mlp.1.{bias, weight}
head.head_series.0.class_logits.{bias, weight}
head.head_series.0.cls_module.0.weight
head.head_series.0.cls_module.1.{bias, weight}
head.head_series.0.inst_interact.dynamic_layer.{bias, weight}
head.head_series.0.inst_interact.norm1.{bias, weight}
head.head_series.0.inst_interact.norm2.{bias, weight}
head.head_series.0.inst_interact.norm3.{bias, weight}
head.head_series.0.inst_interact.out_layer.{bias, weight}
head.head_series.0.linear1.{bias, weight}
head.head_series.0.linear2.{bias, weight}
head.head_series.0.norm1.{bias, weight}
head.head_series.0.norm2.{bias, weight}
head.head_series.0.norm3.{bias, weight}
head.head_series.0.reg_module.0.weight
head.head_series.0.reg_module.1.{bias, weight}
head.head_series.0.reg_module.3.weight
head.head_series.0.reg_module.4.{bias, weight}
head.head_series.0.reg_module.6.weight
head.head_series.0.reg_module.7.{bias, weight}
head.head_series.0.self_attn.out_proj.{bias, weight}
head.head_series.0.self_attn.{in_proj_bias, in_proj_weight}
head.head_series.1.bboxes_delta.{bias, weight}
head.head_series.1.block_time_mlp.1.{bias, weight}
head.head_series.1.class_logits.{bias, weight}
head.head_series.1.cls_module.0.weight
head.head_series.1.cls_module.1.{bias, weight}
head.head_series.1.inst_interact.dynamic_layer.{bias, weight}
head.head_series.1.inst_interact.norm1.{bias, weight}
head.head_series.1.inst_interact.norm2.{bias, weight}
head.head_series.1.inst_interact.norm3.{bias, weight}
head.head_series.1.inst_interact.out_layer.{bias, weight}
head.head_series.1.linear1.{bias, weight}
head.head_series.1.linear2.{bias, weight}
head.head_series.1.norm1.{bias, weight}
head.head_series.1.norm2.{bias, weight}
head.head_series.1.norm3.{bias, weight}
head.head_series.1.reg_module.0.weight
head.head_series.1.reg_module.1.{bias, weight}
head.head_series.1.reg_module.3.weight
head.head_series.1.reg_module.4.{bias, weight}
head.head_series.1.reg_module.6.weight
head.head_series.1.reg_module.7.{bias, weight}
head.head_series.1.self_attn.out_proj.{bias, weight}
head.head_series.1.self_attn.{in_proj_bias, in_proj_weight}
head.head_series.2.bboxes_delta.{bias, weight}
head.head_series.2.block_time_mlp.1.{bias, weight}
head.head_series.2.class_logits.{bias, weight}
head.head_series.2.cls_module.0.weight
head.head_series.2.cls_module.1.{bias, weight}
head.head_series.2.inst_interact.dynamic_layer.{bias, weight}
head.head_series.2.inst_interact.norm1.{bias, weight}
head.head_series.2.inst_interact.norm2.{bias, weight}
head.head_series.2.inst_interact.norm3.{bias, weight}
head.head_series.2.inst_interact.out_layer.{bias, weight}
head.head_series.2.linear1.{bias, weight}
head.head_series.2.linear2.{bias, weight}
head.head_series.2.norm1.{bias, weight}
head.head_series.2.norm2.{bias, weight}
head.head_series.2.norm3.{bias, weight}
head.head_series.2.reg_module.0.weight
head.head_series.2.reg_module.1.{bias, weight}
head.head_series.2.reg_module.3.weight
head.head_series.2.reg_module.4.{bias, weight}
head.head_series.2.reg_module.6.weight
head.head_series.2.reg_module.7.{bias, weight}
head.head_series.2.self_attn.out_proj.{bias, weight}
head.head_series.2.self_attn.{in_proj_bias, in_proj_weight}
head.head_series.3.bboxes_delta.{bias, weight}
head.head_series.3.block_time_mlp.1.{bias, weight}
head.head_series.3.class_logits.{bias, weight}
head.head_series.3.cls_module.0.weight
head.head_series.3.cls_module.1.{bias, weight}
head.head_series.3.inst_interact.dynamic_layer.{bias, weight}
head.head_series.3.inst_interact.norm1.{bias, weight}
head.head_series.3.inst_interact.norm2.{bias, weight}
head.head_series.3.inst_interact.norm3.{bias, weight}
head.head_series.3.inst_interact.out_layer.{bias, weight}
head.head_series.3.linear1.{bias, weight}
head.head_series.3.linear2.{bias, weight}
head.head_series.3.norm1.{bias, weight}
head.head_series.3.norm2.{bias, weight}
head.head_series.3.norm3.{bias, weight}
head.head_series.3.reg_module.0.weight
head.head_series.3.reg_module.1.{bias, weight}
head.head_series.3.reg_module.3.weight
head.head_series.3.reg_module.4.{bias, weight}
head.head_series.3.reg_module.6.weight
head.head_series.3.reg_module.7.{bias, weight}
head.head_series.3.self_attn.out_proj.{bias, weight}
head.head_series.3.self_attn.{in_proj_bias, in_proj_weight}
head.head_series.4.bboxes_delta.{bias, weight}
head.head_series.4.block_time_mlp.1.{bias, weight}
head.head_series.4.class_logits.{bias, weight}
head.head_series.4.cls_module.0.weight
head.head_series.4.cls_module.1.{bias, weight}
head.head_series.4.inst_interact.dynamic_layer.{bias, weight}
head.head_series.4.inst_interact.norm1.{bias, weight}
head.head_series.4.inst_interact.norm2.{bias, weight}
head.head_series.4.inst_interact.norm3.{bias, weight}
head.head_series.4.inst_interact.out_layer.{bias, weight}
head.head_series.4.linear1.{bias, weight}
head.head_series.4.linear2.{bias, weight}
head.head_series.4.norm1.{bias, weight}
head.head_series.4.norm2.{bias, weight}
head.head_series.4.norm3.{bias, weight}
head.head_series.4.reg_module.0.weight
head.head_series.4.reg_module.1.{bias, weight}
head.head_series.4.reg_module.3.weight
head.head_series.4.reg_module.4.{bias, weight}
head.head_series.4.reg_module.6.weight
head.head_series.4.reg_module.7.{bias, weight}
head.head_series.4.self_attn.out_proj.{bias, weight}
head.head_series.4.self_attn.{in_proj_bias, in_proj_weight}
head.head_series.5.bboxes_delta.{bias, weight}
head.head_series.5.block_time_mlp.1.{bias, weight}
head.head_series.5.class_logits.{bias, weight}
head.head_series.5.cls_module.0.weight
head.head_series.5.cls_module.1.{bias, weight}
head.head_series.5.inst_interact.dynamic_layer.{bias, weight}
head.head_series.5.inst_interact.norm1.{bias, weight}
head.head_series.5.inst_interact.norm2.{bias, weight}
head.head_series.5.inst_interact.norm3.{bias, weight}
head.head_series.5.inst_interact.out_layer.{bias, weight}
head.head_series.5.linear1.{bias, weight}
head.head_series.5.linear2.{bias, weight}
head.head_series.5.norm1.{bias, weight}
head.head_series.5.norm2.{bias, weight}
head.head_series.5.norm3.{bias, weight}
head.head_series.5.reg_module.0.weight
head.head_series.5.reg_module.1.{bias, weight}
head.head_series.5.reg_module.3.weight
head.head_series.5.reg_module.4.{bias, weight}
head.head_series.5.reg_module.6.weight
head.head_series.5.reg_module.7.{bias, weight}
head.head_series.5.self_attn.out_proj.{bias, weight}
head.head_series.5.self_attn.{in_proj_bias, in_proj_weight}
head.time_mlp.1.{bias, weight}
head.time_mlp.3.{bias, weight}
log_one_minus_alphas_cumprod
posterior_log_variance_clipped
posterior_mean_coef1
posterior_mean_coef2
posterior_variance
sqrt_alphas_cumprod
sqrt_one_minus_alphas_cumprod
sqrt_recip_alphas_cumprod
sqrt_recipm1_alphas_cumprod
[04/10 06:22:48] fvcore.common.checkpoint WARNING: The checkpoint state_dict contains keys that are not used by the model:
stem.fc.{bias, weight}
[04/10 06:22:48] d2.data.build INFO: Distribution of instances among all 7 categories:

| category | #instances | category | #instances | category | #instances |
| --- | --- | --- | --- | --- | --- |
| A220 | 247 | A320/321 | 82 | A330 | 27 |
| ARJ21 | 378 | Boeing737 | 252 | Boeing787 | 261 |
| other | 715 | | | | |
| total | 1962 | | | | |

[04/10 06:22:48] d2.data.dataset_mapper INFO: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1800, sample_style='choice')]
[04/10 06:22:48] d2.data.common INFO: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[04/10 06:22:48] d2.data.common INFO: Serializing 442 elements to byte tensors and concatenating them all ...
[04/10 06:22:48] d2.data.common INFO: Serialized dataset takes 0.23 MiB
[04/10 06:22:48] d2.evaluation.coco_evaluation INFO: Fast COCO eval is not built. Falling back to official COCO eval.
[04/10 06:22:48] d2.evaluation.coco_evaluation WARNING: COCO Evaluator instantiated using config, this is deprecated behavior. Please pass in explicit arguments instead.
[04/10 06:22:48] d2.evaluation.coco_evaluation INFO: Trying to convert 'voc_2007_val' to COCO format ...
[04/10 06:22:48] d2.data.datasets.coco WARNING: Using previously cached COCO format annotations at './output/inference/voc_2007_val_coco_format.json'. You need to clear the cache file if your dataset has been modified.
[04/10 06:22:48] d2.evaluation.evaluator INFO: Start inference on 442 batches
[04/10 06:22:50] d2.evaluation.evaluator INFO: Inference done 11/442. Dataloading: 0.0007 s/iter. Inference: 0.0793 s/iter. Eval: 0.0003 s/iter. Total: 0.0803 s/iter. ETA=0:00:34
[04/10 06:22:55] d2.evaluation.evaluator INFO: Inference done 73/442. Dataloading: 0.0012 s/iter. Inference: 0.0791 s/iter. Eval: 0.0003 s/iter. Total: 0.0807 s/iter. ETA=0:00:29
[04/10 06:23:00] d2.evaluation.evaluator INFO: Inference done 134/442. Dataloading: 0.0012 s/iter. Inference: 0.0799 s/iter. Eval: 0.0003 s/iter. Total: 0.0816 s/iter. ETA=0:00:25
[04/10 06:23:05] d2.evaluation.evaluator INFO: Inference done 196/442. Dataloading: 0.0013 s/iter. Inference: 0.0796 s/iter. Eval: 0.0003 s/iter. Total: 0.0813 s/iter. ETA=0:00:19
[04/10 06:23:10] d2.evaluation.evaluator INFO: Inference done 259/442. Dataloading: 0.0013 s/iter. Inference: 0.0794 s/iter. Eval: 0.0003 s/iter. Total: 0.0811 s/iter. ETA=0:00:14
[04/10 06:23:15] d2.evaluation.evaluator INFO: Inference done 322/442. Dataloading: 0.0013 s/iter. Inference: 0.0793 s/iter. Eval: 0.0003 s/iter. Total: 0.0810 s/iter. ETA=0:00:09
[04/10 06:23:20] d2.evaluation.evaluator INFO: Inference done 384/442. Dataloading: 0.0013 s/iter. Inference: 0.0793 s/iter. Eval: 0.0003 s/iter. Total: 0.0809 s/iter. ETA=0:00:04
[04/10 06:23:25] d2.evaluation.evaluator INFO: Total inference time: 0:00:35.390397 (0.080985 s / iter per device, on 1 devices)
[04/10 06:23:25] d2.evaluation.evaluator INFO: Total inference pure compute time: 0:00:34 (0.079222 s / iter per device, on 1 devices)
[04/10 06:23:25] d2.evaluation.coco_evaluation INFO: Preparing results for COCO format ...
[04/10 06:23:25] d2.evaluation.coco_evaluation INFO: Saving results to ./output/inference/coco_eval_instances_results.json
[04/10 06:23:25] d2.evaluation.coco_evaluation INFO: Evaluating predictions with official COCO API...
[04/10 06:23:29] d2.evaluation.coco_evaluation INFO: Evaluation results for bbox:
AP	AP50	AP75	APs	APm	APl
0.000	0.000	0.000	nan	0.000	0.000

[04/10 06:23:29] d2.evaluation.coco_evaluation INFO: Some metrics cannot be computed and is shown as NaN. [04/10 06:23:29] d2.evaluation.coco_evaluation INFO: Per-category bbox AP:	AP	category	AP	category
A220	A320/321	0.000	A330	0.000
ARJ21	Boeing737	0.000	Boeing787	0.000
other

[04/10 06:23:29] d2.engine.defaults INFO: Evaluation results for voc_2007_val in csv format:
[04/10 06:23:29] d2.evaluation.testing INFO: copypaste: Task: bbox
[04/10 06:23:29] d2.evaluation.testing INFO: copypaste: AP,AP50,AP75,APs,APm,APl
[04/10 06:23:29] d2.evaluation.testing INFO: copypaste: 0.0000,0.0000,0.0000,nan,0.0000,0.0000`

JoyeZLearning commented 5 months ago

It seems that something went wrong in your data processing, which made the training invalid.

I've uploaded my code 'voc2coco.py' to my repository. You can download it, and I hope it resolves your issue.

:)
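For readers without access to that file, here is a minimal sketch of what a VOC-to-COCO conversion generally involves. It is not the repository's 'voc2coco.py'; the function name `voc_xml_to_coco` is hypothetical, and it assumes standard Pascal VOC XML with `filename`, `size`, and `object/bndbox` elements. The class list is the one quoted earlier in this thread.

```python
import xml.etree.ElementTree as ET

# Class list taken from the thread; ids starting at 1 is an assumption of this sketch.
CLASS_NAMES = ("A220", "A320/321", "A330", "ARJ21", "Boeing737", "Boeing787", "other")


def voc_xml_to_coco(xml_strings, class_names=CLASS_NAMES):
    """Convert Pascal VOC XML annotation strings into one COCO-style dict."""
    cat_ids = {name: i + 1 for i, name in enumerate(class_names)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": i, "name": n} for n, i in cat_ids.items()],
    }
    ann_id = 1
    for image_id, xml_text in enumerate(xml_strings, start=1):
        root = ET.fromstring(xml_text)
        size = root.find("size")
        coco["images"].append({
            "id": image_id,
            "file_name": root.findtext("filename"),
            "width": int(size.findtext("width")),
            "height": int(size.findtext("height")),
        })
        for obj in root.iter("object"):
            box = obj.find("bndbox")
            xmin, ymin, xmax, ymax = (
                float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
            )
            w, h = xmax - xmin, ymax - ymin
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": cat_ids[obj.findtext("name")],
                # VOC stores box corners; COCO stores [x, y, width, height].
                "bbox": [xmin, ymin, w, h],
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco
```

The step that most often goes wrong is the box conversion: writing VOC corners straight into `bbox` produces a file that loads without error but can never match predictions.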

2000YWQ commented 5 months ago

When I use your voc2coco.py, I still get the same result. This is part of the JSON I converted: {"info": ["none"], "license": ["none"], "images": [{"file_name": "0000001.jpg", "height": 1500, "width": 1500, "id": 1}, {"file_name": "0000002.jpg", "height": 1200, "width": 1200, "id": 2}, {"file_name": "0000003.jpg", "height": 1200, "width": 1200, "id": 3}, {"file_name": "0000004.jpg", "height": 1500, "width": 1500, "id": 4}, {"file_name": "0000005.jpg", "height": 1200, "width": 1200, "id": 5}, {"file_name": "0000006.jpg", "height": 800, "width": 800, "id": 6}, {"file_name": "0000007.jpg", "height": 800, "width": 800, "id": 7}, {"file_name": "0000008.jpg", "height": 1000, "width": 1000, "id": 8}, {"file_name": "0000010.jpg", "height": 1500, "width": 1500, "id": 10}, {"file_name": "0000012.jpg", "height": 800, "width": 800, "id": 12}, {"file_name": "0000015.jpg", "height": 800, "width": 800, "id": 15}, {"file_name": "0000017.jpg", "height": 800, "width": 800, "id": 17}, {"file_name": "0000018.jpg", "height": 800, "width": 800, "id": 18}, {"file_name": "0000019.jpg", "height": 800, "width": 800, "id": 19}, {"file_name": "0000021.jpg", "height": 1500, "width": 1500, "id": 21}, {"file_name": "0000022.jpg", "height": 800, "width": 800, "id": 22}, {"file_name": "0000024.jpg", "height": 800, "width": 800, "id": 24}, {"file_name": "0000025.jpg", "height": 1200, "width": 1200, "id": 25}, {"file_name": "0000026.jpg", "height": 1000, "width": 1000, "id": 26}, {"file_name": "0000027.jpg", "height": 800, "width": 800, "id": 27}, {"file_name": "0000028.jpg", "height": 800, "width": 800, "id": 28}, {"file_name": "0000029.jpg", "height": 1500, "width": 1500, "id": 29}, {"file_name": "0000031.jpg", "height": 1500, "width": 1500, "id": 31}, {"file_name": "0000032.jpg", "height": 1200, "width": 1200, "id": 32},
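The fragment above shows only the `images` list, so it cannot tell us whether the `annotations` and `categories` sections are consistent. A quick structural check along the following lines may help narrow that down. This is a sketch only: `check_coco_json` and `check_coco_file` are hypothetical helper names, and the checks cover common conversion mistakes rather than the full COCO specification.

```python
import json

REQUIRED_KEYS = ("images", "annotations", "categories")


def check_coco_json(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = [f"missing top-level key: {k}" for k in REQUIRED_KEYS if k not in coco]
    if problems:
        return problems
    images = {img["id"]: img for img in coco["images"]}
    cat_ids = {cat["id"] for cat in coco["categories"]}
    if not coco["annotations"]:
        problems.append("no annotations at all")
    for ann in coco["annotations"]:
        img = images.get(ann["image_id"])
        if img is None:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
            continue
        if ann["category_id"] not in cat_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        x, y, w, h = ann["bbox"]
        if w <= 0 or h <= 0:
            problems.append(f"annotation {ann['id']}: non-positive width or height")
        elif x + w > img["width"] or y + h > img["height"]:
            # Often a sign that corners were stored instead of [x, y, width, height].
            problems.append(f"annotation {ann['id']}: bbox extends past the image")
    return problems


def check_coco_file(path):
    """Load a COCO JSON file from disk and run the checks above."""
    with open(path, encoding="utf-8") as f:
        return check_coco_json(json.load(f))
```

Running `check_coco_file` on the converted file, and also on the cached './output/inference/voc_2007_val_coco_format.json' mentioned in the log, should return an empty list if the structure is sound.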

GGyan commented 5 months ago

I also have the same problem: the calculated AP is all 0. I'm sure my labels are fine because they work fine with other models.

JoyeZLearning commented 5 months ago

Have you trained the model? You should first train the model on your dataset to get the weights (.pth), then update the weight path in the config and defaults. Judging from both of your results, I don't think you have run the code correctly.
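The log posted earlier in this thread is consistent with that: it loads `detectron2://ImageNetPretrained/torchvision/R-50.pkl` and then lists the `head.*` parameters as not found in the checkpoint. That warning is expected at the start of training, but if it appears right before evaluation, the detection head is still randomly initialized. A small sketch for spotting this in a saved log follows; `missing_head_params` is a hypothetical helper, and it assumes the detectron2 log format shown above.

```python
import re

MISSING_MARKER = "Some model parameters or buffers are not found in the checkpoint:"
# Log entries in this thread start with a stamp such as [04/10 06:18:36].
TIMESTAMP = re.compile(r"\[\d{2}/\d{2} \d{2}:\d{2}:\d{2}\]")


def missing_head_params(log_text):
    """Return detection-head parameter names that the checkpoint did not provide."""
    start = log_text.find(MISSING_MARKER)
    if start == -1:
        return []
    body = log_text[start + len(MISSING_MARKER):]
    next_entry = TIMESTAMP.search(body)
    if next_entry:
        # Keep only the text belonging to this one warning.
        body = body[:next_entry.start()]
    # Names look like head.head_series.0.class_logits.{bias, weight};
    # the pattern stops before the trailing ".{...}" group.
    return sorted(set(re.findall(r"head\.[A-Za-z0-9_.]*[A-Za-z0-9_]", body)))
```

A non-empty result for an evaluation-only run suggests the configured weights are a backbone-only checkpoint rather than the trained .pth file.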

JoyeZLearning / DiffDet4SAR

Can you provide the code for converting your dataset to COCO format? #2