Open wangzhaoyang-508 opened 1 year ago
Would you like to provide your training configs for us? We need the model and training config. Could you also tell us which loss function you used?

I checked "/data0/wangzhaoyang/detr/detrex/projects/dino/configs/models/dino_50.py" and found that the model's num_classes may have been wrong; it works well now.

By the way, if I train from scratch, should I change bias=False to bias=True in detrex/detrex/modeling/backbone/resnet.py? I ask because the model log contains entries such as "128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False".
By "training from scratch", do you mean training with a randomly initialized backbone, or with an ImageNet-pretrained backbone?~
I suggest not updating the backbone configuration here.
BTW, we actually used the detectron2 ResNet in our config; we did not use the ResNet that was re-implemented in detrex.
The reason we re-implemented the ResNet model in detrex is that with the original detectron2 implementation it is not easy to set dilation=2 in the last stage (i.e., it is hard to build a ResNet-DC5 model).
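For context, a rough sketch of what building a ResNet-50-DC5 backbone with detectron2's own ResNet can look like when the stages are constructed by hand. This is an illustrative sketch, not the official detrex config; the stride/dilation choices follow the usual DC5 convention of keeping stride 1 and dilation 2 in res5.

    # Illustrative sketch (not the detrex default): build ResNet-50-DC5 with
    # detectron2's ResNet by creating the stages manually, so the last stage
    # uses stride 1 and dilation 2 instead of downsampling.
    from detectron2.modeling.backbone.resnet import BasicStem, BottleneckBlock, ResNet

    def make_r50_dc5_stages(norm="FrozenBN"):
        stages = []
        in_channels, bottleneck_channels, out_channels = 64, 64, 256
        for idx, num_blocks in enumerate([3, 4, 6, 3]):  # res2, res3, res4, res5
            last_stage = idx == 3
            first_stride = 1 if idx == 0 or last_stage else 2  # res5 keeps stride 1 (DC5)
            stages.append(
                ResNet.make_stage(
                    BottleneckBlock,
                    num_blocks,
                    in_channels=in_channels,
                    out_channels=out_channels,
                    bottleneck_channels=bottleneck_channels,
                    norm=norm,
                    stride_in_1x1=False,
                    stride_per_block=[first_stride] + [1] * (num_blocks - 1),
                    dilation=2 if last_stage else 1,  # dilate res5 instead of striding
                )
            )
            in_channels = out_channels
            bottleneck_channels *= 2
            out_channels *= 2
        return stages

    backbone = ResNet(
        stem=BasicStem(in_channels=3, out_channels=64, norm="FrozenBN"),
        stages=make_r50_dc5_stages(),
        out_features=["res3", "res4", "res5"],
        freeze_at=1,
    )

With make_default_stages, by contrast, every stage shares the same keyword arguments, which is why the last-stage dilation is awkward to express there.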
Thank you so much. Since our custom dataset is totally different from ImageNet or COCO, we want to train from scratch to see if it can get better performance. So how should I modify the config to make the backbone update during training?
Does the "128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False" in the log mean the backbone is not being updated?
Yes, I think it's not updated~ Maybe you can update your config like this to see if it works:
model = L(DINO)(
    backbone=L(ResNet)(
        stem=L(BasicStem)(in_channels=3, out_channels=64, norm="FrozenBN"),
        stages=L(ResNet.make_default_stages)(
            depth=50,
            stride_in_1x1=False,
            norm="FrozenBN",
            bias=True,  # add this one
        ),
        out_features=["res3", "res4", "res5"],
        freeze_at=1,
    ),
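For completeness, a hedged sketch of the knobs that usually control whether the backbone weights update in this kind of config (assuming the config defines model and train as the detrex DINO configs do): freeze_at and the norm type are what matter, while bias only adds a bias term to the convolutions.

    # Hedged sketch: let the backbone train from scratch.
    model.backbone.freeze_at = 0          # freeze nothing (freeze_at=1 freezes the stem)
    model.backbone.stem.norm = "SyncBN"   # trainable norm instead of FrozenBN ("BN" for single GPU)
    model.backbone.stages.norm = "SyncBN"
    train.init_checkpoint = ""            # skip the ImageNet-pretrained weights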
Thank you so much, your answer is really helpful.
But modifying the config by adding bias=True does not work, since the upper-level wrapper does not have a bias argument. I think changing bias=False to bias=True in detrex/detrex/modeling/backbone/resnet.py or detectron2/modeling/backbone/resnet.py may be the only way to change it.
By the way, I have learned that setting bias=True before BN is useless, since BN removes the effect of the bias. But if I use GN or LN, bias=True may be useful, doesn't it?
Yes, it may be useful; it's better to do some experiments on these modifications~
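For reference, a tiny standalone PyTorch check (not detrex code) of the point above: a per-channel conv bias placed directly before BatchNorm is cancelled by the normalization, which is why bias=False is the usual choice there.

    import torch
    import torch.nn as nn

    x = torch.randn(4, 8, 16, 16)
    conv = nn.Conv2d(8, 8, kernel_size=3, padding=1, bias=True)
    bn = nn.BatchNorm2d(8)
    bn.train()  # batch statistics are used, as during training

    with torch.no_grad():
        y1 = bn(conv(x))
        conv.bias.add_(3.0)   # shift every output channel by a constant
        y2 = bn(conv(x))

    # The per-channel mean subtraction in BN removes the constant shift.
    print(torch.allclose(y1, y2, atol=1e-5))  # True

With GroupNorm or LayerNorm the mean is not taken per channel, so a conv bias is not fully cancelled there, which is consistent with the suggestion above to simply try it experimentally.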
Excuse me, if I only want to detect a subset of categories (say, 10 of the COCO classes), how should I modify the config?
Just modifying num_classes is OK~
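For example, a minimal override might look like this (a sketch; the attribute names follow the DINO project config, where the criterion's num_classes is usually interpolated from the model's):

    # Sketch: detect 10 classes instead of the COCO default of 80.
    model.num_classes = 10
    # If your config does not interpolate the criterion's num_classes from
    # model.num_classes, set it explicitly as well:
    model.criterion.num_classes = 10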
Then how can I choose the class names or class ids (i.e., the names or ids of only the classes I am interested in)? I only want to detect a few classes in my custom dataset.
I think maybe you should first convert your dataset into COCO format; then d2 will help you handle the other things~
You can refer to this config and just register your own dataset in two lines: https://github.com/IDEA-Research/detrex/blob/main/configs/common/data/custom.py
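For reference, the two-line registration that the linked config relies on looks roughly like this (a sketch with placeholder names and paths):

    from detectron2.data.datasets import register_coco_instances

    # name, extra metadata, COCO-format json, image root (paths are placeholders)
    register_coco_instances("my_dataset_train", {}, "path/to/instances_train.json", "path/to/train_images/")
    register_coco_instances("my_dataset_val", {}, "path/to/instances_val.json", "path/to/val_images/")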
Thank you so much, your reply was really quick ^_^
I successfully registered a custom dataset with 8 classes of objects, and it trains well.
Now 4 of the 8 classes in my custom dataset no longer need to be detected.
I don't want to change the json file, so how do I re-register the dataset so that the model only focuses on the four useful classes during training?
In detrex I tried to use MetadataCatalog.get, but it did not help. The error is: AssertionError: Attribute 'thing_classes' in the metadata of 'my_eldataset_train' cannot be set to a different value! ['duanshan', 'heiban', 'xianzhuangquexian', 'xuhan'] != ['duanshan', 'heiban', 'xianzhuangquexian', 'xuhan', 'yinlie', 'crush', 'finger', 'star']
custom.py code:

import itertools

from omegaconf import OmegaConf

import detectron2.data.transforms as T
from detectron2.config import LazyCall as L
from detectron2.data import (
    build_detection_test_loader,
    build_detection_train_loader,
    get_detection_dataset_dicts,
    MetadataCatalog,
)
from detectron2.data.datasets import register_coco_instances
from detectron2.evaluation import COCOEvaluator

from detrex.data import DetrDatasetMapper

dataloader = OmegaConf.create()

register_coco_instances(
    "my_eldataset_train",
    {},
    '/data1/wzydatasets/yuanle/coco/annotations/instances_train2017.json',
    '/data1/wzydatasets/yuanle/coco/train2017/',
)
register_coco_instances(
    "my_eldataset_test",
    {},
    '/data1/wzydatasets/yuanle/coco/annotations/instances_test2017.json',
    '/data1/wzydatasets/yuanle/coco/test2017/',
)

MetadataCatalog.get("my_eldataset_train").thing_classes = ['duanshan', 'heiban', 'xianzhuangquexian', 'xuhan']
MetadataCatalog.get("my_eldataset_test").thing_classes = ['duanshan', 'heiban', 'xianzhuangquexian', 'xuhan']

dataloader.train = L(build_detection_train_loader)(
    dataset=L(get_detection_dataset_dicts)(names="my_eldataset_train"),
    mapper=L(DetrDatasetMapper)(
        augmentation=[
            L(T.ResizeShortestEdge)(
                short_edge_length=600,
                max_size=600,
            ),
            L(T.RandomFlip)(),
            L(T.ResizeShortestEdge)(
                short_edge_length=(320, 480, 512, 544, 576, 608),
                max_size=640,
                sample_style="choice",
            ),
        ],
        augmentation_with_crop=[
            L(T.RandomFlip)(),
            L(T.ResizeShortestEdge)(
                short_edge_length=600,
                max_size=600,
            ),
            L(T.RandomCrop)(
                crop_type="absolute_range",
                crop_size=(300, 400),
            ),
            L(T.ResizeShortestEdge)(
                short_edge_length=(320, 480, 512, 544, 576, 608),
                max_size=640,
                sample_style="choice",
            ),
        ],
        is_train=True,
        mask_on=False,
        img_format="RGB",
    ),
    total_batch_size=16,
    num_workers=4,
)

dataloader.test = L(build_detection_test_loader)(
    dataset=L(get_detection_dataset_dicts)(names="my_eldataset_test", filter_empty=False),
    mapper=L(DetrDatasetMapper)(
        augmentation=[
            L(T.ResizeShortestEdge)(
                short_edge_length=600,
                max_size=640,
            ),
        ],
        augmentation_with_crop=None,
        is_train=False,
        mask_on=False,
        img_format="RGB",
    ),
    num_workers=4,
)

dataloader.evaluator = L(COCOEvaluator)(
    dataset_name="${..test.dataset.names}",
)
MetadataCatalog.get("my_eldataset_train").thing_classes = ['duanshan', 'heiban', 'xianzhuangquexian', 'xuhan']
MetadataCatalog.get("my_eldataset_test").thing_classes = ['duanshan', 'heiban', 'xianzhuangquexian', 'xuhan']

I just added these two lines of code, since those are the only 4 classes I am interested in now, but it does not work.
We will check this issue later~
Seems like it's not suitable to directly change the attribute returned by the get function; detectron2 raises an assertion error when a metadata attribute that has already been set is assigned a different value.
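One possible workaround, sketched below under the assumption that the original 8-class datasets registered in custom.py above are kept as-is: instead of overwriting thing_classes on an already-registered name, register a new dataset whose loader drops the unwanted categories and remaps the remaining ids to a contiguous range. The new dataset name and helper function here are made up for illustration, and the evaluator/model settings would need to follow suit.

    from detectron2.data import DatasetCatalog, MetadataCatalog
    from detectron2.data.datasets import load_coco_json

    KEEP = ["duanshan", "heiban", "xianzhuangquexian", "xuhan"]

    def load_subset(json_file, image_root, base_name):
        # Load the full 8-class annotations, keep only the 4 classes of interest,
        # and remap their contiguous ids to 0..3.
        records = load_coco_json(json_file, image_root, base_name)
        base_classes = MetadataCatalog.get(base_name).thing_classes
        id_map = {base_classes.index(c): new_id for new_id, c in enumerate(KEEP)}
        kept = []
        for rec in records:
            annos = [
                dict(a, category_id=id_map[a["category_id"]])
                for a in rec.get("annotations", [])
                if a["category_id"] in id_map
            ]
            if annos:
                kept.append(dict(rec, annotations=annos))
        return kept

    DatasetCatalog.register(
        "my_eldataset_train_4cls",
        lambda: load_subset(
            "/data1/wzydatasets/yuanle/coco/annotations/instances_train2017.json",
            "/data1/wzydatasets/yuanle/coco/train2017/",
            "my_eldataset_train",
        ),
    )
    MetadataCatalog.get("my_eldataset_train_4cls").thing_classes = KEEP
    # Then point dataloader.train.dataset.names at "my_eldataset_train_4cls"
    # and set model.num_classes = 4.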
Did you solve the problem now~ @wangzhaoyang-508
Where is this parameter modified?
I use my custom dataset to train, but when an eval finishes (not the first eval, but around the 7th) an AssertionError occurs: AssertionError: A prediction has class=6, but the dataset only has 5 classes and predicted class id should be in [0, 4].
I checked my train.json and test.json and they only have 5 classes. Why are there six categories in the predictions? How can I fix it?
The log is:
eta: 21:01:51 iter: 13949 total_loss: 12.1 loss_class: 0.1947 loss_bbox: 0.04829 loss_giou: 0.8478 loss_class_0: 0.2712 loss_bbox_0: 0.05169 loss_giou_0: 0.8324 loss_class_1: 0.2247 loss_bbox_1: 0.05147 loss_giou_1: 0.8432 loss_class_2: 0.2089 loss_bbox_2: 0.05045 loss_giou_2: 0.8478 loss_class_3: 0.1921 loss_bbox_3: 0.04827 loss_giou_3: 0.8465 loss_class_4: 0.1926 loss_bbox_4: 0.04828 loss_giou_4: 0.8471 loss_class_enc: 0.2915 loss_bbox_enc: 0.05811 loss_giou_enc: 0.9053 loss_class_dn: 0.01086 loss_bbox_dn: 0.03106 loss_giou_dn: 0.6482 loss_class_dn_0: 0.0476 loss_bbox_dn_0: 0.04128 loss_giou_dn_0: 0.7782 loss_class_dn_1: 0.01995 loss_bbox_dn_1: 0.03258 loss_giou_dn_1: 0.6654 loss_class_dn_2: 0.01299 loss_bbox_dn_2: 0.03104 loss_giou_dn_2: 0.6437 loss_class_dn_3: 0.01152 loss_bbox_dn_3: 0.03102 loss_giou_dn_3: 0.6449 loss_class_dn_4: 0.01137 loss_bbox_dn_4: 0.03105 loss_giou_dn_4: 0.6464 time: 0.7148 data_time: 0.0094 lr: 0.0001 max_mem: 24711M
[02/26 23:01:59 detectron2]: Run evaluation without EMA.
WARNING [02/26 23:01:59 d2.data.datasets.coco]: Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.
[02/26 23:01:59 d2.data.datasets.coco]: Loaded 1653 images in COCO format from /data0/wangzhaoyang/data/smy/COCO/annotations/instances_test2017.json
[02/26 23:01:59 d2.data.common]: Serializing 1653 elements to byte tensors and concatenating them all ...
[02/26 23:01:59 d2.data.common]: Serialized dataset takes 0.36 MiB
[02/26 23:01:59 d2.evaluation.evaluator]: Start inference on 414 batches
[02/26 23:02:09 d2.evaluation.evaluator]: Inference done 11/414. Dataloading: 0.0008 s/iter. Inference: 0.0722 s/iter. Eval: 0.0005 s/iter. Total: 0.0735 s/iter. ETA=0:00:29
[02/26 23:02:14 d2.evaluation.evaluator]: Inference done 82/414. Dataloading: 0.0011 s/iter. Inference: 0.0691 s/iter. Eval: 0.0005 s/iter. Total: 0.0708 s/iter. ETA=0:00:23
[02/26 23:02:19 d2.evaluation.evaluator]: Inference done 155/414. Dataloading: 0.0011 s/iter. Inference: 0.0685 s/iter. Eval: 0.0005 s/iter. Total: 0.0701 s/iter. ETA=0:00:18
[02/26 23:02:24 d2.evaluation.evaluator]: Inference done 225/414. Dataloading: 0.0011 s/iter. Inference: 0.0691 s/iter. Eval: 0.0005 s/iter. Total: 0.0707 s/iter. ETA=0:00:13
[02/26 23:02:29 d2.evaluation.evaluator]: Inference done 297/414. Dataloading: 0.0011 s/iter. Inference: 0.0688 s/iter. Eval: 0.0005 s/iter. Total: 0.0704 s/iter. ETA=0:00:08
[02/26 23:02:34 d2.evaluation.evaluator]: Inference done 366/414. Dataloading: 0.0011 s/iter. Inference: 0.0688 s/iter. Eval: 0.0010 s/iter. Total: 0.0709 s/iter. ETA=0:00:03
[02/26 23:02:38 d2.evaluation.evaluator]: Total inference time: 0:00:29.381724 (0.071838 s / iter per device, on 4 devices)
[02/26 23:02:38 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:28 (0.068535 s / iter per device, on 4 devices)
[02/26 23:02:42 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
ERROR [02/26 23:02:42 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
  File "/data0/wangzhaoyang/detr/detrex/detectron2/detectron2/engine/train_loop.py", line 150, in train
    self.after_step()
  File "/data0/wangzhaoyang/detr/detrex/detectron2/detectron2/engine/train_loop.py", line 180, in after_step
    h.after_step()
  File "/data0/wangzhaoyang/detr/detrex/detectron2/detectron2/engine/hooks.py", line 555, in after_step
    self._do_eval()
  File "/data0/wangzhaoyang/detr/detrex/detectron2/detectron2/engine/hooks.py", line 528, in _do_eval
    results = self._func()
  File "/data0/wangzhaoyang/detr/detrex/tools/train_net.py", line 258, in <lambda>
    hooks.EvalHook(cfg.train.eval_period, lambda: do_test(cfg, model)),
  File "/data0/wangzhaoyang/detr/detrex/tools/train_net.py", line 167, in do_test
    ret = inference_on_dataset(
  File "/data0/wangzhaoyang/detr/detrex/detectron2/detectron2/evaluation/evaluator.py", line 204, in inference_on_dataset
    results = evaluator.evaluate()
  File "/data0/wangzhaoyang/detr/detrex/detectron2/detectron2/evaluation/coco_evaluation.py", line 206, in evaluate
    self._eval_predictions(predictions, img_ids=img_ids)
  File "/data0/wangzhaoyang/detr/detrex/detectron2/detectron2/evaluation/coco_evaluation.py", line 240, in _eval_predictions
    assert category_id < num_classes, (
AssertionError: A prediction has class=6, but the dataset only has 5 classes and predicted class id should be in [0, 4].
[02/26 23:02:42 d2.engine.hooks]: Overall training speed: 13997 iterations in 2:46:46 (0.7149 s / it)
[02/26 23:02:42 d2.engine.hooks]: Total training time: 2:52:35 (0:05:49 on hooks)
[02/26 23:02:42 d2.utils.events]: eta: 20:58:55 iter: 13999 total_loss: 11.72 loss_class: 0.1678 loss_bbox: 0.04619 loss_giou: 0.8291 loss_class_0: 0.2467 loss_bbox_0: 0.04601 loss_giou_0: 0.8328 loss_class_1: 0.201 loss_bbox_1: 0.04803 loss_giou_1: 0.8129 loss_class_2: 0.1722 loss_bbox_2: 0.04691 loss_giou_2: 0.818 loss_class_3: 0.1798 loss_bbox_3: 0.0458 loss_giou_3: 0.8111 loss_class_4: 0.1709 loss_bbox_4: 0.04619 loss_giou_4: 0.8276 loss_class_enc: 0.2543 loss_bbox_enc: 0.05304 loss_giou_enc: 0.9121 loss_class_dn: 0.01214 loss_bbox_dn: 0.02816 loss_giou_dn: 0.6193 loss_class_dn_0: 0.04826 loss_bbox_dn_0: 0.0378 loss_giou_dn_0: 0.7667 loss_class_dn_1: 0.01807 loss_bbox_dn_1: 0.02934 loss_giou_dn_1: 0.6113 loss_class_dn_2: 0.01376 loss_bbox_dn_2: 0.02801 loss_giou_dn_2: 0.607 loss_class_dn_3: 0.01261 loss_bbox_dn_3: 0.02806 loss_giou_dn_3: 0.6095 loss_class_dn_4: 0.01219 loss_bbox_dn_4: 0.02811 loss_giou_dn_4: 0.6143 time: 0.7148 data_time: 0.0090 lr: 0.0001 max_mem: 24711M
wandb: Waiting for W&B process to finish... (success).
wandb: Network error (ConnectTimeout), entering retry loop.
wandb:
wandb: Run history: (per-metric console sparkline charts for the bbox AP values and every loss term; see the Run summary below for the final values)
wandb:
wandb: Run summary:
wandb: bbox/AP 13.45007
wandb: bbox/AP-bengbian 12.02601
wandb: bbox/AP-duanshan 8.42912
wandb: bbox/AP-loujiang 19.63638
wandb: bbox/AP-yinxu 0.85431
wandb: bbox/AP-zangpian 26.30452
wandb: bbox/AP50 43.96051
wandb: bbox/AP75 3.76077
wandb: bbox/APl 44.22862
wandb: bbox/APm 18.40223
wandb: bbox/APs 10.8132
wandb: data_time 0.00908
wandb: eta_seconds 75535.43819
wandb: loss_bbox 0.04619
wandb: loss_bbox_0 0.04601
wandb: loss_bbox_1 0.04803
wandb: loss_bbox_2 0.04691
wandb: loss_bbox_3 0.0458
wandb: loss_bbox_4 0.04619
wandb: loss_bbox_dn 0.02816
wandb: loss_bbox_dn_0 0.0378
wandb: loss_bbox_dn_1 0.02934
wandb: loss_bbox_dn_2 0.02801
wandb: loss_bbox_dn_3 0.02806
wandb: loss_bbox_dn_4 0.02811
wandb: loss_bbox_enc 0.05304
wandb: loss_class 0.1678
wandb: loss_class_0 0.24672
wandb: loss_class_1 0.20097
wandb: loss_class_2 0.17215
wandb: loss_class_3 0.17976
wandb: loss_class_4 0.17093
wandb: loss_class_dn 0.01214
wandb: loss_class_dn_0 0.04826
wandb: loss_class_dn_1 0.01807
wandb: loss_class_dn_2 0.01376
wandb: loss_class_dn_3 0.01261
wandb: loss_class_dn_4 0.01219
wandb: loss_class_enc 0.25431
wandb: loss_giou 0.82908
wandb: loss_giou_0 0.8328
wandb: loss_giou_1 0.81291
wandb: loss_giou_2 0.81798
wandb: loss_giou_3 0.81106
wandb: loss_giou_4 0.82758
wandb: loss_giou_dn 0.61925
wandb: loss_giou_dn_0 0.76666
wandb: loss_giou_dn_1 0.61127
wandb: loss_giou_dn_2 0.60695
wandb: loss_giou_dn_3 0.60949
wandb: loss_giou_dn_4 0.61434
wandb: loss_giou_enc 0.91206
wandb: lr 0.0001
wandb: time 0.69452
wandb: total_loss 11.71719
wandb:
wandb: Synced detrex_experiment1: https://wandb.ai/wangzhaoyang/detrex/runs/3evjnbxb
wandb: Synced 6 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb_output/wandb/run-20230226_200920-3evjnbxb/logs
Traceback (most recent call last):
  File "tools/train_net.py", line 307, in <module>
    launch(
  File "/data0/wangzhaoyang/detr/detrex/detectron2/detectron2/engine/launch.py", line 67, in launch
    mp.spawn(
  File "/home/amax/anaconda3/envs/wangzydetrex/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 30, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/amax/anaconda3/envs/wangzydetrex/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 88, in start_processes
    while not context.join():
  File "/home/amax/anaconda3/envs/wangzydetrex/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 50, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/amax/anaconda3/envs/wangzydetrex/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 9, in _wrap
    fn(i, *args)
  File "/data0/wangzhaoyang/detr/detrex/detectron2/detectron2/engine/launch.py", line 126, in _distributed_worker
    main_func(*args)
  File "/data0/wangzhaoyang/detr/detrex/tools/train_net.py", line 302, in main
    do_train(args, cfg)
  File "/data0/wangzhaoyang/detr/detrex/tools/train_net.py", line 275, in do_train
    trainer.train(start_iter, cfg.train.max_iter)
  File "/data0/wangzhaoyang/detr/detrex/detectron2/detectron2/engine/train_loop.py", line 150, in train
    self.after_step()
  File "/data0/wangzhaoyang/detr/detrex/detectron2/detectron2/engine/train_loop.py", line 180, in after_step
    h.after_step()
  File "/data0/wangzhaoyang/detr/detrex/detectron2/detectron2/engine/hooks.py", line 555, in after_step
    self._do_eval()
  File "/data0/wangzhaoyang/detr/detrex/detectron2/detectron2/engine/hooks.py", line 528, in _do_eval
    results = self._func()
  File "/data0/wangzhaoyang/detr/detrex/tools/train_net.py", line 258, in <lambda>
    hooks.EvalHook(cfg.train.eval_period, lambda: do_test(cfg, model)),
  File "/data0/wangzhaoyang/detr/detrex/tools/train_net.py", line 167, in do_test
    ret = inference_on_dataset(
  File "/data0/wangzhaoyang/detr/detrex/detectron2/detectron2/evaluation/evaluator.py", line 204, in inference_on_dataset
    results = evaluator.evaluate()
  File "/data0/wangzhaoyang/detr/detrex/detectron2/detectron2/evaluation/coco_evaluation.py", line 206, in evaluate
    self._eval_predictions(predictions, img_ids=img_ids)
  File "/data0/wangzhaoyang/detr/detrex/detectron2/detectron2/evaluation/coco_evaluation.py", line 240, in _eval_predictions
    assert category_id < num_classes, (
AssertionError: A prediction has class=6, but the dataset only has 5 classes and predicted class id should be in [0, 4].
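A quick way to narrow this down (a sketch; "my_dataset_test" is a placeholder for whatever name the failing run registered) is to compare the number of classes detectron2 registered for the test set with the model's num_classes. The earlier warning "Category ids in annotations are not in [1, #categories]" also suggests the json's category ids are not contiguous, so the evaluator's mapping can end up with fewer classes than the model predicts.

    from detectron2.data import MetadataCatalog

    # Placeholder dataset name; use the name registered for the failing test set.
    meta = MetadataCatalog.get("my_dataset_test")
    print(len(meta.thing_classes), meta.thing_classes)

    # The COCOEvaluator assertion fires when a predicted class id falls outside
    # [0, len(thing_classes) - 1]; model.num_classes in the config should equal
    # len(meta.thing_classes) (here: 5), otherwise the model can predict ids
    # (e.g. class 6) that the evaluator does not know about.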