stark-t / PAI

Pollination_Artificial_Intelligence

Error in `utils_create_datasets.py` for dev_cluster branch - most probably a path issue #27

Closed valentinitnelav closed 2 years ago

valentinitnelav commented 2 years ago

Hi @stark-t, I tried to run the most recent version of `utils_create_datasets.py`, based on the edits in commit 19131b61f0ca1e89e601f99982832e8255783f21, but I got this error (see the traceback at the bottom). For now, I have fallen back to simply replacing `\\` with `/` in all the `utils_*` scripts in the dev_cluster branch, and that seems to work. When it works, the console displays something like this. Maybe it is easier/faster for you to find the issue. I will now run 10 epochs with a nano model and see whether we get the anomaly from #26 again.

(yolov5) [sv127qyji@galaxy138 scripts]$ python ~/PAI/scripts/utils_create_datasets.py
Original dataset
/home/sc.uni-leipzig.de/sv127qyji/PAI/scripts/utils_datapaths.py:63: FutureWarning: Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.
  print_df = df.groupby(['class'])['images_path', 'labels_path'].count()
                        images_path  labels_path
class                                           
araneae                        1855         1523
coleoptera                     2490         2336
diptera                        2807         2401
hemiptera                      1991         1711
hymenoptera                    2994         2461
hymenoptera_formicidae         1474         1051
lepidoptera                    5102         4576
orthoptera                     1792         1649

Number of image tiles per class in 20.0% valdiation dataset
/home/sc.uni-leipzig.de/sv127qyji/PAI/scripts/utils_datasampling.py:46: FutureWarning: Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.
  print_df = df_test.groupby(['class'])['images_path', 'labels_path'].count()
                        images_path  labels_path
class                                           
araneae                         210          210
coleoptera                      210          210
diptera                         210          210
hemiptera                       210          210
hymenoptera                     210          210
hymenoptera_formicidae          210          210
lepidoptera                     210          210
orthoptera                      210          210

Number of image tiles per class in 20.0% valdiation dataset
/home/sc.uni-leipzig.de/sv127qyji/PAI/scripts/utils_datasampling.py:58: FutureWarning: Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.
  print_df = df_val.groupby(['class'])['images_path', 'labels_path'].count()
                        images_path  labels_path
class                                           
araneae                         210          210
coleoptera                      210          210
diptera                         210          210
hemiptera                       210          210
hymenoptera                     210          210
hymenoptera_formicidae          210          210
lepidoptera                     210          210
orthoptera                      210          210

Number of image tiles per class in training dataset
/home/sc.uni-leipzig.de/sv127qyji/PAI/scripts/utils_datasampling.py:66: FutureWarning: Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.
  print_df = df_train.groupby(['class'])['images_path', 'labels_path'].count()
                        images_path  labels_path
class                                           
araneae                        1103         1103
coleoptera                     1916         1916
diptera                        1981         1981
hemiptera                      1291         1291
hymenoptera                    2041         2041
hymenoptera_formicidae          631          631
lepidoptera                    4156         4156
orthoptera                     1229         1229
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/home/sc.uni-leipzig.de/sv127qyji/PAI/scripts/utils_create_datasets.py", line 98, in <module>
    run_create_datasets()
  File "/home/sc.uni-leipzig.de/sv127qyji/PAI/scripts/utils_create_datasets.py", line 93, in run_create_datasets
    create_dataset_func(df=df_train, data_path=train_PATH)
  File "/home/sc.uni-leipzig.de/sv127qyji/PAI/scripts/utils_create_datasets.py", line 48, in create_dataset_func
    with open(label_PATH_dst, 'w') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/sc.uni-leipzig.de/sv127qyji/datasets/P1_Data_sampled/train/labels/Coleoptera_Curculionidae_Lixus_cylindrus_3455265584_963147.txt'
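
The traceback fails while opening a label file under `train/labels/`, and swapping `\\` for `/` made the error go away, which points at hard-coded Windows separators in the path construction. Below is a minimal sketch of a separator-independent alternative using `pathlib`. The helpers `portable_label_path` and `write_label` are hypothetical names for illustration and are not functions from the repository; the sketch also assumes the destination directory may not exist yet, so it creates it before writing.

```python
import tempfile
from pathlib import Path, PureWindowsPath


def portable_label_path(dataset_root, split, image_name):
    """Build <dataset_root>/<split>/labels/<stem>.txt with no hard-coded separator.

    Hypothetical helper, not part of the PAI scripts.
    """
    # PureWindowsPath treats both "\\" and "/" as separators, so an image
    # name recorded on Windows or Linux reduces to the same bare stem.
    stem = PureWindowsPath(image_name).stem
    return Path(dataset_root) / split / "labels" / f"{stem}.txt"


def write_label(label_path, lines):
    """Write label lines, creating the parent directory first if needed."""
    label_path = Path(label_path)
    # Opening a file in a missing directory raises FileNotFoundError,
    # so make sure the directory exists beforehand.
    label_path.parent.mkdir(parents=True, exist_ok=True)
    with open(label_path, "w") as f:
        f.write("\n".join(lines))


# Small demonstration in a throwaway directory (assumed example values).
demo_root = tempfile.mkdtemp()
demo_path = portable_label_path(demo_root, "train", "images\\sample_1.jpg")
write_label(demo_path, ["0 0.5 0.5 0.1 0.1"])
```

Building paths with `Path(...) / ...` lets the same script run on a Windows workstation and on the Linux cluster without editing separators by hand.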

valentinitnelav commented 2 years ago

Fixed by @stark-t with commit a864bae7255694325b5266bfe461f2b9ea5ae8da
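
Separately from the path error, the log in the first comment repeats a pandas `FutureWarning` about indexing a groupby with multiple keys. As the warning itself suggests, passing the column names as a list silences it. A minimal sketch on a made-up three-row DataFrame (the rows are illustrative, not data from the project):

```python
import pandas as pd

# Toy data standing in for the real image/label table (assumed values).
df = pd.DataFrame(
    {
        "class": ["araneae", "araneae", "diptera"],
        "images_path": ["a.jpg", "b.jpg", "c.jpg"],
        "labels_path": ["a.txt", None, "c.txt"],
    }
)

# Select the columns with a list (double brackets) instead of a bare
# comma-separated pair, which pandas would implicitly treat as a tuple.
print_df = df.groupby(["class"])[["images_path", "labels_path"]].count()
print(print_df)
```

`count()` skips missing values, which is why `images_path` and `labels_path` can differ per class, as they do in the "Original dataset" table.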