stwerner97 opened this issue 1 year ago
@stwerner97 The first way of writing is correct. https://github.com/open-mmlab/mmdetection/blob/dev-3.x/configs/mask2former/mask2former_r50_8xb2-lsj-50e_coco-panoptic.py#L181
Thanks for the quick response @hhaAndroid! 😊
The example you've linked does not work for me and does not use a `ConcatDataset`. I've also double-checked that the evaluation of both `coco2017_val_dataset` and `objects365_val_dataset` works fine if I don't use `ConcatDataset` and instead train and evaluate on a single dataset. I've also checked that both datasets use consistent annotations, i.e., a single label with the same ID.
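For reference, the concatenated validation setup I have in mind looks roughly like this (a minimal sketch; `coco2017_val_dataset` and `objects365_val_dataset` stand for the per-dataset config dicts from my file, and the remaining dataloader settings are only placeholders):

```python
# Minimal sketch, assuming coco2017_val_dataset and objects365_val_dataset
# are ordinary CocoDataset config dicts with a shared, single-class label space.
val_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type='ConcatDataset',
        datasets=[coco2017_val_dataset, objects365_val_dataset]))
```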
I've checked what issues are raised when I use the following evaluator:

```python
val_evaluator = [
    dict(
        _scope_='mmdet',
        type='CocoMetric',
        metric='bbox',
        proposal_nums=(1, 10, 100))
]
```
and noticed that, although the `CocoMetric` class raises the error ``AssertionError: ground truth is required for evaluation when `ann_file` is not provided``, the ground-truth labels are available under another key. The lines of the `CocoMetric` class that raise the issue are:

```python
assert 'instances' in data_sample, \
    'ground truth is required for evaluation when ' \
    '`ann_file` is not provided'
```
While `instances` is not set in `data_sample`, `gt_instances` is. If I change the key (and make some changes to fit the expected downstream shape of the ground truth), the evaluation works for me:

```python
assert 'gt_instances' in data_sample, \
    'ground truth is required for evaluation when ' \
    '`ann_file` is not provided'

gt['anns'] = []
boxes = data_sample['gt_instances']['bboxes'].detach().cpu().numpy()
labels = data_sample['gt_instances']['labels'].detach().cpu().numpy()
for bbox, label in zip(boxes, labels):
    gt['anns'].append({'bbox': bbox, 'bbox_label': label})
```
@hhaAndroid could you confirm that the key `gt_instances` indeed holds the ground-truth bbox and class labels? I'll later check whether the implementation works when a single dataset is used but the `ann_file` isn't set in the evaluator.
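For anyone who wants to reproduce the check, here is a small sketch of how one could inspect which ground-truth keys actually arrive at the metric (a hypothetical `DebugCocoMetric` subclass, not part of mmdetection):

```python
# Hypothetical helper: subclass CocoMetric and print which keys each
# data_sample carries when it reaches process(); by the time the evaluator
# sees them, the samples have been converted to plain dicts.
from mmdet.evaluation import CocoMetric
from mmdet.registry import METRICS


@METRICS.register_module()
class DebugCocoMetric(CocoMetric):

    def process(self, data_batch, data_samples):
        for data_sample in data_samples:
            # Expect 'pred_instances' plus either 'instances' or 'gt_instances'.
            print(sorted(data_sample.keys()))
        super().process(data_batch, data_samples)
```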
Unfortunately, I don't think this works as expected: the datasets (stored in COCO format) can have overlapping image ids, which might produce wrong results later when the scores are aggregated.
In essence, one would either need to ensure that the image ids are unique across datasets, or the dataloader would need to indicate which source dataset a sample belongs to.
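A rough sketch of the first option, remapping ids when the COCO-style JSON files are merged (the helper name, file handling, and offset scheme are only illustrative):

```python
import json


def merge_coco_jsons(paths, out_path):
    """Merge COCO-style annotation files, remapping image and annotation ids
    so they stay unique across the concatenated datasets (placeholder sketch)."""
    merged = {'images': [], 'annotations': [], 'categories': None}
    img_offset, ann_offset = 0, 0
    for path in paths:
        with open(path) as f:
            data = json.load(f)
        if merged['categories'] is None:
            merged['categories'] = data['categories']  # assumes a shared label space
        for img in data['images']:
            img['id'] += img_offset
            merged['images'].append(img)
        for ann in data['annotations']:
            ann['id'] += ann_offset
            ann['image_id'] += img_offset
            merged['annotations'].append(ann)
        img_offset = 1 + max((img['id'] for img in merged['images']), default=-1)
        ann_offset = 1 + max((ann['id'] for ann in merged['annotations']), default=-1)
    with open(out_path, 'w') as f:
        json.dump(merged, f)
```

This only covers the annotation files themselves; the `img_id` values that the concatenated dataloader emits would have to be remapped consistently as well.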
@hhaAndroid @stwerner97 I encountered the same problem as you: `AssertionError: Results do not correspond to current coco set`. So I modified the config files like this:
```python
val_evaluator = [
    dict(
        _scope_='mmdet',
        type='CocoMetric',
        metric='bbox',
        ann_file=[
            "{}/test.json".format(data_rootsrsdd),
            "{}/test.json".format(data_rootssdd),
            "{}/test.json".format(data_rootrsdd),
            "{}/test.json".format(data_rootdssdd)],
        format_only=False,
        backend_args=backend_args),
]
```
I then found that the error was caused by `self._coco_api` (see `{root}\mmdetection\mmdet\evaluation\metrics\coco_metric.py`), because it only keeps the last `ann_file` when the same `CocoMetric` type is used. So I modified how `self._coco_api = COCO(local_path)` is built so that it can merge the JSON files at once. The code can be found in `pycocotools\coco.py`:
```python
# Modified __init__ of pycocotools' COCO class so that it also accepts a list
# of annotation files and merges them into a single in-memory dataset.
import json
import time
from collections import defaultdict


class COCO:
    def __init__(self, annotation_file=None):
        # load dataset
        self.dataset, self.anns, self.cats, self.imgs = dict(), dict(), dict(), dict()
        self.imgToAnns, self.catToImgs = defaultdict(list), defaultdict(list)
        if annotation_file is not None:
            if isinstance(annotation_file, list):
                print('loading multiple annotation files into memory...')
                json_contents = []
                for ann in annotation_file:
                    with open(ann, 'r') as f:
                        json_contents.append(json.load(f))
                merged_images = []
                merged_annotations = []
                merged_info = []
                merged_class = []
                for _json in json_contents:
                    merged_images += _json['images']
                    merged_annotations += _json['annotations']
                    # 'info' and 'categories' are taken from the last file;
                    # this assumes all files share the same category list.
                    merged_info = _json['info']
                    merged_class = _json['categories']
                self.dataset = {
                    'info': merged_info,
                    'categories': merged_class,
                    'images': merged_images,
                    'annotations': merged_annotations
                }
                self.createIndex()
            else:
                print('loading annotations into memory...')
                tic = time.time()
                with open(annotation_file, 'r') as f:
                    dataset = json.load(f)
                assert type(dataset) == dict, \
                    'annotation file format {} not supported'.format(type(dataset))
                print('Done (t={:0.2f}s)'.format(time.time() - tic))
                self.dataset = dataset
                self.createIndex()
```
It can work, but the recall was very low and the precision was zero. I don't know if it's a problem with the model or with the code I modified, and I haven't had time to solve it yet. By the way, I took your advice and made sure that each different image corresponds to a unique `img_id` across the different JSON files. I am a beginner, but I hope this helps you.
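In case it helps with the zero precision, a quick sanity check over the merged JSON files (just a sketch, the helper name is made up) could rule out colliding ids or mismatched category lists before evaluation:

```python
import json


def check_merged_jsons(paths):
    """Sanity check: image/annotation ids must be unique across files and the
    category lists must agree, otherwise CocoMetric scores become meaningless."""
    seen_imgs, seen_anns, categories = set(), set(), None
    for path in paths:
        with open(path) as f:
            data = json.load(f)
        if categories is None:
            categories = data['categories']
        assert data['categories'] == categories, f'category mismatch in {path}'
        for img in data['images']:
            assert img['id'] not in seen_imgs, f'duplicate image id {img["id"]} in {path}'
            seen_imgs.add(img['id'])
        for ann in data['annotations']:
            assert ann['id'] not in seen_anns, f'duplicate annotation id {ann["id"]} in {path}'
            seen_anns.add(ann['id'])
```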
I want to train and evaluate a detection model on multiple datasets at once. To prepare the datasets, I use a joint label space with the same class label. How do I set up the evaluator when using multiple concatenated datasets? Looking at pull request #3522, it seems that this use case should be supported. I would be fine with both aggregated reports and dataset-specific evaluations.
I've tried to use separate evaluation metrics for each dataset. This does not work and throws an `AssertionError: Results do not correspond to current coco set` error. I also tried an evaluator similar to this, which raises an error that the `ann_file` is missing: ``AssertionError: ground truth is required for evaluation when `ann_file` is not provided``.

Below are the relevant parts of my configuration file. The setup trains successfully for the first epoch, but then throws an error upon evaluation.

Thanks for the great project! 😊