Deci-AI / super-gradients

Easily train or fine-tune SOTA computer vision models with one open source training library. The home of Yolo-NAS.
https://www.supergradients.com
Apache License 2.0
EmptyDatasetException: Empty Dataset: Out of 473 images, not a single one was found with any of these classes: None #1050

Open chiragubnare opened 1 year ago

chiragubnare commented 1 year ago

šŸ› Describe the bug


EmptyDatasetException                     Traceback (most recent call last)
Cell In[17], line 3
      1 from IPython.display import clear_output
----> 3 train_data = coco_detection_yolo_format_train(
      4     dataset_params={
      5         'data_dir': dataset_params['data_dir'],
      6         'images_dir': dataset_params['train_images_dir'],
      7         'labels_dir': dataset_params['train_labels_dir'],
      8         'classes': dataset_params['classes']
      9     },
     10     dataloader_params={
     11         'batch_size':16,
     12         'num_workers':2
     13     }
     14 )
     16 val_data = coco_detection_yolo_format_val(
     17     dataset_params={
     18         'data_dir': dataset_params['data_dir'],
   (...)
     26     }
     27 )
     29 test_data = coco_detection_yolo_format_val(
     30     dataset_params={
     31         'data_dir': dataset_params['data_dir'],
   (...)
     39     }
     40 )

File ~\anaconda3\envs\pytogpu\lib\site-packages\super_gradients\training\dataloaders\dataloaders.py:275, in coco_detection_yolo_format_train(dataset_params, dataloader_params)
    273 @register_dataloader(Dataloaders.COCO_DETECTION_YOLO_FORMAT_TRAIN)
    274 def coco_detection_yolo_format_train(dataset_params: Dict = None, dataloader_params: Dict = None) -> DataLoader:
--> 275     return get_data_loader(
    276         config_name="coco_detection_yolo_format_base_dataset_params",
    277         dataset_cls=YoloDarknetFormatDetectionDataset,
    278         train=True,
    279         dataset_params=dataset_params,
    280         dataloader_params=dataloader_params,
    281     )

File ~\anaconda3\envs\pytogpu\lib\site-packages\super_gradients\training\dataloaders\dataloaders.py:72, in get_data_loader(config_name, dataset_cls, train, dataset_params, dataloader_params)
     70 local_rank = get_local_rank()
     71 with wait_for_the_master(local_rank):
---> 72     dataset = dataset_cls(**dataset_params)
     73     if not hasattr(dataset, "dataset_params"):
     74         dataset.dataset_params = dataset_params

File ~\anaconda3\envs\pytogpu\lib\site-packages\super_gradients\training\datasets\detection_datasets\yolo_format_detection.py:122, in YoloDarknetFormatDetectionDataset.__init__(self, data_dir, images_dir, labels_dir, classes, class_ids_to_ignore, ignore_invalid_labels, show_all_warnings, *args, **kwargs)
    120 kwargs["output_fields"] = ["image", "target"]
    121 kwargs["original_target_format"] = XYXY_LABEL  # We convert yolo format (LABEL_CXCYWH) to Coco format (XYXY_LABEL) when loading the annotation
--> 122 super().__init__(data_dir=data_dir, show_all_warnings=show_all_warnings, *args, **kwargs)

File ~\anaconda3\envs\pytogpu\lib\site-packages\super_gradients\common\decorators\factory_decorator.py:36, in resolve_param.<locals>.inner.<locals>.wrapper(*args, **kwargs)
     34     new_value = factory.get(args[index])
     35     args = _assign_tuple(args, index, new_value)
---> 36 return func(*args, **kwargs)

File ~\anaconda3\envs\pytogpu\lib\site-packages\super_gradients\training\datasets\detection_datasets\detection_dataset.py:155, in DetectionDataset.__init__(self, data_dir, original_target_format, max_num_samples, cache, cache_dir, input_dim, transforms, all_classes_list, class_inclusion_list, ignore_empty_annotations, target_fields, output_fields, verbose, show_all_warnings)
    152     raise KeyError('"target" is expected to be in the fields to subclass but it was not included')
    154 self._required_annotation_fields = {"target", "img_path", "resized_img_shape"}
--> 155 self.annotations = self._cache_annotations()
    157 self.cache = cache
    158 self.cache_dir = cache_dir

File ~\anaconda3\envs\pytogpu\lib\site-packages\super_gradients\training\datasets\detection_datasets\detection_dataset.py:216, in DetectionDataset._cache_annotations(self)
    213         annotations.append(img_annotation)
    215 if len(annotations) == 0:
--> 216     raise EmptyDatasetException(
    217         f"Out of {self.n_available_samples} images, not a single one was found with any of these classes: {self.class_inclusion_list}"
    218     )
    220 if n_invalid_bbox > 0:
    221     logger.warning(f"Found {n_invalid_bbox} invalid bbox that were ignored. For more information, please set show_all_warnings=True.")

EmptyDatasetException: Empty Dataset: Out of 473 images, not a single one was found with any of these classes: None

Versions

YOLO_NAS_L

Louis-Dupont commented 1 year ago

Hi @chiragubnare, you probably had logs before this exception that looked like this:

Line `{line}` of file {label_file_path} will be ignored because not in LABEL_NORMALIZED_CXCYWH format: {e}

coco_detection_yolo_format_train expects your labels to be in the following format: label_id, cx, cy, w, h. Can you please make sure that this is how your labels are stored? If they are stored differently, it would help if you could paste a few of these label files so I can understand what happened.
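
One quick way to check this before building the dataloader is to scan the label files for lines that do not have exactly five fields. The sketch below is not part of super-gradients: `is_valid_yolo_line` and `find_bad_label_lines` are hypothetical helpers, and they assume one `.txt` file per image, whitespace-separated values, an integer class id, and coordinates normalized to [0, 1].

```python
from pathlib import Path
from typing import List, Tuple


def is_valid_yolo_line(line: str) -> bool:
    """True if the line looks like `label_id cx cy w h` with normalized coordinates."""
    parts = line.split()
    if len(parts) != 5:
        return False
    try:
        int(parts[0])  # assumption: class ids are written as integers
        coords = [float(p) for p in parts[1:]]
    except ValueError:
        return False
    return all(0.0 <= c <= 1.0 for c in coords)


def find_bad_label_lines(labels_dir: str) -> List[Tuple[str, int, str]]:
    """Return (file name, 1-based line number, line) for every non-empty line that fails the check."""
    bad = []
    for path in sorted(Path(labels_dir).glob("*.txt")):
        for lineno, line in enumerate(path.read_text().splitlines(), start=1):
            if line.strip() and not is_valid_yolo_line(line):
                bad.append((path.name, lineno, line))
    return bad
```

If `find_bad_label_lines` returns an entry for every line in the directory, the labels are probably stored in a different annotation format rather than being corrupted here and there.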

achbogga commented 1 year ago

@Louis-Dupont I am having the same issue. Here is a sample label file format:

1 0.318359375 0.31494140625 0.06640625 0.0869140625 
1 0.775 0.48486328125 0.059375 0.0732421875 
1 0.494140625 0.427734375 0.05859375 0.07421875 
1 0.11953125 0.4853515625 0.071875 0.091796875

achbogga commented 1 year ago

@Louis-Dupont I don't have any other exceptions before this EmptyDataset Exception!

mqxi commented 1 year ago

Same issue as @achbogga here.

JINO-ROHIT commented 1 year ago

same issue, has this been resolved by someone?

JINO-ROHIT commented 1 year ago

@Louis-Dupont please help

Louis-Dupont commented 1 year ago

Can you set show_warnings=True and show me the logs?

YoloDarknetFormatDetectionDataset(..., show_warnings=True)

This should print a message showing, for each file, which line failed to parse, along with the file name.

Note that if you are loading the dataset with any of our dataloader functions (dataloaders.get, coco_detection_yolo_format_val or coco_detection_yolo_format_train), you need to pass show_warnings=True in the dataset_params. E.g.

coco_detection_yolo_format_val(dataset_params={'show_warnings': True})

Louis-Dupont commented 1 year ago

This seems to be due to a parsing error. If the dataset is publicly available, may I also ask which one it is and where it comes from?

KChieza commented 4 months ago

Same issue, has anyone figured it out yet?

deanocko commented 4 months ago

I'm having the same issue.

To clarify, it should be: coco_detection_yolo_format_val(dataset_params={'show_all_warnings': True})
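
For a training loader, the same flag can be merged into the existing parameters. This is only a sketch: `with_all_warnings` and `build_train_loader` are hypothetical helper names, while the import path and the `show_all_warnings` keyword are taken from the traceback at the top of this issue.

```python
from typing import Dict, Optional


def with_all_warnings(dataset_params: Optional[Dict] = None) -> Dict:
    """Return a copy of dataset_params with show_all_warnings switched on."""
    params = dict(dataset_params or {})
    params["show_all_warnings"] = True
    return params


def build_train_loader(dataset_params: Dict, dataloader_params: Dict):
    """Build the YOLO-format training loader with per-line parsing warnings enabled."""
    # Imported lazily so with_all_warnings can be used without super-gradients installed.
    from super_gradients.training.dataloaders.dataloaders import coco_detection_yolo_format_train

    return coco_detection_yolo_format_train(
        dataset_params=with_all_warnings(dataset_params),
        dataloader_params=dataloader_params,
    )
```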

If it helps, this is a sample of the output that I get when I show all warnings. Total output is thousands of lines of much the same. Dataset is from Roboflow in Pytorch Yolov5 format. I suspect it's something to do with polygon bounding boxes. My one class that trains well uses rectangular BBs.

W0511 19:48:54.928956 137248678684480 yolo_format_detection.py:254] Line `1 0.1517570515625 0.5776502671875 0.1414924734375 0.60405866875 0.123830171875 0.742281865625 0.1359790140625 0.75529230625 0.17352987656250002 0.7645604328125 0.21237820625 0.7483375109375 0.24501161875 0.734521834375 0.4108853703125 0.6049726109375 0.5441012546875 0.5868025359375 0.7190118 0.55432885 0.8147552796875001 0.5162759875 0.9265679234374999 0.4331616953125 0.9344567140625 0.33540221406250004 0.844127903125 0.2360532203125 0.7092784625 0.2172423828125 0.6314975859375 0.2492585234375 0.49296032343749996 0.323484409375 0.329261815625 0.371933346875 0.24673667500000002 0.4326534296875 0.1818539078125 0.5027997 0.1517570515625 0.5776502671875` of file /home/dean/Documents/github/Capstone-DDRT/data/Skin-Defect-6/train/labels/IMG_20230512_125910_jpg.rf.c226ffdf56e807248b2d7b7e8f53365f.txt will be ignored because not cannot be parsed to (label, cx, cy, w, h) format, with Exception:
too many values to unpack (expected 5)
[2024-05-11 19:48:54] WARNING - yolo_format_detection.py - Line `1 0.3901198390625 0.588059521875 0.41860895468750003 0.629449746875 0.5763094203125 0.6890936796875 0.61036531875 0.684811746875 0.6337678062500001 0.6064539718749999 0.5774547859375 0.408429403125 0.516945421875 0.3810271890625 0.4491737359375 0.3810596671875 0.3724541859375 0.43338479531249996 0.3901198390625 0.588059521875
` of file /home/dean/Documents/github/Capstone-DDRT/data/Skin-Defect-6/train/labels/IMG_20230513_103728_jpg.rf.43f9584a55c50757480563d255b5d628.txt will be ignored because not cannot be parsed to (label, cx, cy, w, h) format, with Exception:
too many values to unpack (expected 5)
W0511 19:48:54.930835 137248678684480 yolo_format_detection.py:254] Line `1 0.3901198390625 0.588059521875 0.41860895468750003 0.629449746875 0.5763094203125 0.6890936796875 0.61036531875 0.684811746875 0.6337678062500001 0.6064539718749999 0.5774547859375 0.408429403125 0.516945421875 0.3810271890625 0.4491737359375 0.3810596671875 0.3724541859375 0.43338479531249996 0.3901198390625 0.588059521875
` of file /home/dean/Documents/github/Capstone-DDRT/data/Skin-Defect-6/train/labels/IMG_20230513_103728_jpg.rf.43f9584a55c50757480563d255b5d628.txt will be ignored because not cannot be parsed to (label, cx, cy, w, h) format, with Exception:
too many values to unpack (expected 5)
[2024-05-11 19:48:54] WARNING - yolo_format_detection.py - Line `4 0.744483234375 0.952987503125 0.780549775 0.98687486875 0.81474571875 1 0.8404389406250001 0.9974254265625 0.831341028125 0.972410484375 0.7972301125000001 0.9404773234375 0.7597881328125 0.9017504515624999 0.727782890625 0.8861005484375 0.6745758640624999 0.8826522656250001 0.5622049046875001 0.8878544640625 0.5639804734375 0.9421500390625 0.58078085 0.9467459453125 0.6467569640625 0.9461829609375 0.7087718062499999 0.9373051218749999 0.744483234375 0.952987503125` of file /home/dean/Documents/github/Capstone-DDRT/data/Skin-Defect-6/train/labels/IMG_20230513_103728_jpg.rf.43f9584a55c50757480563d255b5d628.txt will be ignored because not cannot be parsed to (label, cx, cy, w, h) format, with Exception:
too many values to unpack (expected 5)
W0511 19:48:54.931349 137248678684480 yolo_format_detection.py:254] Line `4 0.744483234375 0.952987503125 0.780549775 0.98687486875 0.81474571875 1 0.8404389406250001 0.9974254265625 0.831341028125 0.972410484375 0.7972301125000001 0.9404773234375 0.7597881328125 0.9017504515624999 0.727782890625 0.8861005484375 0.6745758640624999 0.8826522656250001 0.5622049046875001 0.8878544640625 0.5639804734375 0.9421500390625 0.58078085 0.9467459453125 0.6467569640625 0.9461829609375 0.7087718062499999 0.9373051218749999 0.744483234375 0.952987503125` of file /home/dean/Documents/github/Capstone-DDRT/data/Skin-Defect-6/train/labels/IMG_20230513_103728_jpg.rf.43f9584a55c50757480563d255b5d628.txt will be ignored because not cannot be parsed to (label, cx, cy, w, h) format, with Exception:
too many values to unpack (expected 5)
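
The lines in these warnings have a class id followed by many x y pairs, which looks like polygon (segmentation) annotation rather than `label cx cy w h`. One possible workaround, sketched below, is to collapse each polygon into its enclosing box. `polygon_line_to_bbox_line` is a hypothetical helper, not a super-gradients function, and it assumes the values after the class id are alternating normalized x and y coordinates.

```python
def polygon_line_to_bbox_line(line: str) -> str:
    """Convert `label x1 y1 x2 y2 ...` into `label cx cy w h` using the enclosing box."""
    parts = line.split()
    if len(parts) == 5:
        # Already in label cx cy w h form; only normalize the whitespace.
        return " ".join(parts)

    coords = [float(v) for v in parts[1:]]
    if len(coords) < 6 or len(coords) % 2 != 0:
        raise ValueError(f"Expected at least 3 x/y pairs after the label, got: {line!r}")

    xs, ys = coords[0::2], coords[1::2]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    cx = (x_min + x_max) / 2
    cy = (y_min + y_max) / 2
    w = x_max - x_min
    h = y_max - y_min
    return f"{parts[0]} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
```

The enclosing box discards the polygon's shape, so this only makes sense when a detection model, not a segmentation model, is the goal. Write the converted lines to a copy of the labels directory rather than overwriting the originals.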

gwban commented 3 weeks ago

I ran into the same issue, but I've managed to fix it.

The problem was caused by using labeling data that didn't match the required format. After checking, I realized the data I was using was COCO segmentation label data. I ended up deleting that and converting the labels to YOLO format, which is suitable for detection. Once I made sure the images and labels matched up, everything worked fine.
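
A minimal sketch of the "images and labels match up" check, assuming each label file shares its image's file name with a `.txt` extension. `find_unpaired` and the `IMAGE_SUFFIXES` set are hypothetical and not part of super-gradients.

```python
from pathlib import Path
from typing import List, Tuple

# Assumption: these are the image extensions used in the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def find_unpaired(images_dir: str, labels_dir: str) -> Tuple[List[str], List[str]]:
    """Return (images without a label file, label files without an image), by file stem."""
    image_stems = {
        p.stem for p in Path(images_dir).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    }
    label_stems = {p.stem for p in Path(labels_dir).glob("*.txt")}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)
```

Both returned lists should be empty for a dataset where every image has exactly one label file.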

I suggest you modify the YAML file as needed and run the program again, following the steps I outlined above.