stark-t / PAI

Pollination_Artificial_Intelligence

Class name order in `config_yolov5.yaml` #26

Closed valentinitnelav closed 2 years ago

valentinitnelav commented 2 years ago

Hi @stark-t , I managed to run a nano and a small YOLOv5 model over the weekend on the Clara cluster, but I got an anomaly in the results. I checked what might have caused this, and I think it is the order of the class names in scripts/config_yolov5.yaml.

The current order lists 'Hymenoptera formicidae' before 'Hymenoptera', but it should come after:

names: [ 'Araneae', 'Coleoptera', 'Diptera', 'Hemiptera', 'Hymenoptera formicidae', 'Hymenoptera', 'Lepidoptera', 'Orthoptera' ]

I think it should be like this:

names: [ 'Araneae', 'Coleoptera', 'Diptera', 'Hemiptera', 'Hymenoptera', 'Hymenoptera formicidae', 'Lepidoptera', 'Orthoptera' ]
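
For reference, here is a minimal sketch of why the order matters. It assumes the usual YOLOv5 label format, where the first value on each label line is the class index and is looked up in `names` by position. The helper `class_name` and the label line are made up for illustration and are not taken from the repository.

```python
# The two orderings of `names` discussed above.
names_current = ['Araneae', 'Coleoptera', 'Diptera', 'Hemiptera',
                 'Hymenoptera formicidae', 'Hymenoptera',
                 'Lepidoptera', 'Orthoptera']
names_proposed = ['Araneae', 'Coleoptera', 'Diptera', 'Hemiptera',
                  'Hymenoptera', 'Hymenoptera formicidae',
                  'Lepidoptera', 'Orthoptera']


def class_name(label_line, names):
    """Resolve the class index of one YOLO label line to its name."""
    class_id = int(label_line.split()[0])
    return names[class_id]


# Hypothetical label line: class index, then x_center y_center width height.
line = "4 0.52 0.53 0.57 0.48"

print(class_name(line, names_current))   # Hymenoptera formicidae
print(class_name(line, names_proposed))  # Hymenoptera
```

The same label file is read as a different class depending on the list order, so the list has to match whatever order was used when the label files were written.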

I made the corrections and sent the new jobs to the cluster. I expect to get the results sometime today after lunch.

Here are some results regarding the "anomaly":

[Images: confusion_matrix, results]

valentinitnelav commented 2 years ago

I modified the YAML file as suggested and still get the anomaly. Not sure what the problem could be this time. Any ideas?

[Image: confusion_matrix]

stark-t commented 2 years ago

@valentinitnelav OK, this looks really strange. I will take a closer look tomorrow morning, so we can hopefully have a call before lunchtime. I'll let you know.

valentinitnelav commented 2 years ago

@stark-t , thanks for looking into this. It could be that the labels are mixed up. Here are some of the images in yolov5/runs/train:

labels.jpg

Hymenoptera Formicidae should not have such a large number of boxes (instances).

train_batch0.jpg

I think there is something odd with the training dataset: the label indices do not match reality. For example, 0 must be Araneae, not Coleoptera.

val_batch0_labels.jpg

The validation dataset looks ok-ish, except that ants should be labeled Hymenoptera Formicidae, not Hymenoptera.

val_batch0_pred.jpg

valentinitnelav commented 2 years ago

Hi @stark-t , so it is indeed a problem with the label IDs. I am not sure how it happens, but when I created the folder P1_Data_sampled with utils_create_datasets.py, for example, I got wrong labels:

nano ~/datasets/P1_Data_sampled/train/labels/Araneae_Agelenidae_Agelena_orientalis_2429400616_2340718.txt
# Contains
5 0.514227642276423 0.455808080808081 0.585365853658537 0.611111111111111
# Label ID should be 0, not 5; the coordinates are ok

nano ~/datasets/P1_Data_sampled/train/labels/Coleoptera_Aphodiidae_Aphodius_coniugatus_2529406019_676427.txt
0 0.464599609375 0.509530791788856 0.21728515625 0.306451612903226
# Label ID should be 1, not 0; the coordinates are ok

nano ~/datasets/P1_Data_sampled/train/labels/Diptera_Anthomyiidae_Alliopsis_silvestris_11215593.txt
3 0.492378048780488 0.554878048780488 0.963414634146341 0.780487804878049
# Label ID should be 2, not 3; the coordinates are ok

nano ~/datasets/P1_Data_sampled/train/labels/Hemiptera_Acanaloniidae_Acanalonia_conica_2447835932_1506480.txt
6 0.486156351791531 0.475 0.327361563517915 0.174390243902439
# Label ID should be 3, not 6; the coordinates are ok

nano ~/datasets/P1_Data_sampled/train/labels/Hymenoptera_Formicidae_Amblyopone_denticulata_891778942_2769093.txt
1 0.524544953116382 0.52972972972973 0.568119139547711 0.477220077220077
# Label ID should be 4, not 1; the coordinates are ok

nano ~/datasets/P1_Data_sampled/train/labels/Hymenoptera_Andrenidae_Andrena_aegyptiaca_2999189599_1119060.txt
2 0.486572265625 0.490842490842491 0.17333984375 0.186080586080586
# Label ID should be 5, not 2; the coordinates are ok

nano ~/datasets/P1_Data_sampled/train/labels/Lepidoptera_Adelidae_Adela_albicinctella_3067600169_2738625.txt
4 0.513493253373313 0.4255 0.949025487256372 0.603
# Label ID should be 6, not 4; the coordinates are ok

nano ~/datasets/P1_Data_sampled/train/labels/Orthoptera_Trigonidiidae_Trigonidium_cicindeloides_3355089148_89705.txt
# Contains
7 0.463533225283631 0.508508914100486 0.567260940032415 0.598865478119935
# Label ID is correct for this one
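
The manual check above could be automated roughly as follows. This is only a sketch: it assumes that each image contains a single class and that the class can be read from the file name prefix, which holds for the files listed here but may not hold in general. The class list follows the "should be" comments above; the exact spelling of the names and the helpers `class_from_filename` and `find_mismatches` are hypothetical.

```python
# Expected class order, taken from the "should be" comments above.
# The exact spelling of the class names is an assumption.
CLASS_NAMES = [
    "Araneae", "Coleoptera", "Diptera", "Hemiptera",
    "Hymenoptera Formicidae", "Hymenoptera", "Lepidoptera", "Orthoptera",
]


def class_from_filename(filename):
    """Guess the class from the file name prefix (hypothetical helper)."""
    parts = filename.split("_")
    if parts[:2] == ["Hymenoptera", "Formicidae"]:
        return "Hymenoptera Formicidae"
    return parts[0]


def find_mismatches(label_files):
    """Return (filename, found_id, expected_id) for every wrong label line."""
    mismatches = []
    for filename, text in label_files.items():
        expected = CLASS_NAMES.index(class_from_filename(filename))
        for line in text.splitlines():
            if not line.strip():
                continue  # skip empty lines
            found = int(line.split()[0])  # first token is the class ID
            if found != expected:
                mismatches.append((filename, found, expected))
    return mismatches


# Contents copied from three of the label files listed above.
samples = {
    "Araneae_Agelenidae_Agelena_orientalis_2429400616_2340718.txt":
        "5 0.514227642276423 0.455808080808081 0.585365853658537 0.611111111111111",
    "Hymenoptera_Formicidae_Amblyopone_denticulata_891778942_2769093.txt":
        "1 0.524544953116382 0.52972972972973 0.568119139547711 0.477220077220077",
    "Orthoptera_Trigonidiidae_Trigonidium_cicindeloides_3355089148_89705.txt":
        "7 0.463533225283631 0.508508914100486 0.567260940032415 0.598865478119935",
}

for filename, found, expected in find_mismatches(samples):
    print(f"{filename}: found {found}, expected {expected}")
```

To run it on the real dataset, the `samples` dictionary could be filled by reading every `.txt` file in the labels folder.
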

stark-t commented 2 years ago

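@valentinitnelav OK, so the problem seems to be in utils_create_dataset:

    # Suspected cause: unique() returns the classes in order of first
    # appearance in df, which need not match the order of `names` in the YAML file.
    CLASSES = df['class'].unique().tolist()
    for i, row in tqdm.tqdm(df.iterrows()):
        label_PATH_src = row['labels_path']
        image_PATH_src = row['images_path']
        file_name = row['file_names']
        class_name = row['class']
        # The ID is the position in CLASSES, not the position in the YAML list.
        class_id = CLASSES.index(class_name)

I will try to change this by using the class list from the YAML file.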
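
A rough sketch of the difference between the two mappings, using plain Python so that it runs without pandas. `dict.fromkeys` keeps first-appearance order, which is the same ordering that pandas `unique()` returns. The row order in `rows` is made up so that it reproduces the wrong IDs reported above, and the spelling of the names in `YAML_NAMES` is an assumption; in the real script the list would be read from config_yolov5.yaml.

```python
# Class order as expected by the training config.
# Assumption: the real list is loaded from config_yolov5.yaml.
YAML_NAMES = [
    "Araneae", "Coleoptera", "Diptera", "Hemiptera",
    "Hymenoptera Formicidae", "Hymenoptera", "Lepidoptera", "Orthoptera",
]

# Hypothetical row order of the 'class' column in the sampled dataframe.
rows = [
    "Coleoptera", "Hymenoptera Formicidae", "Coleoptera", "Hymenoptera",
    "Diptera", "Lepidoptera", "Araneae", "Hemiptera", "Orthoptera", "Araneae",
]


def ids_by_first_appearance(classes):
    """Mimic CLASSES = df['class'].unique().tolist() plus CLASSES.index()."""
    order = list(dict.fromkeys(classes))  # unique values, first-appearance order
    return {name: order.index(name) for name in order}


def ids_from_yaml(names):
    """Use the position in the YAML `names` list as the class ID."""
    return {name: i for i, name in enumerate(names)}


buggy = ids_by_first_appearance(rows)
fixed = ids_from_yaml(YAML_NAMES)

for name in YAML_NAMES:
    print(f"{name}: buggy={buggy[name]}, fixed={fixed[name]}")
```

With this row order, Araneae gets ID 5 and Coleoptera gets ID 0 under the first mapping, which matches the wrong label files shown earlier; under the second mapping the IDs no longer depend on the row order.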

stark-t commented 2 years ago

The current commit should hopefully fix this. Class names are now taken directly from the YAML file.

valentinitnelav commented 2 years ago

Thanks @stark-t , I think it works now. At first I didn't realize that the class names in the YAML file must follow the exact order in which the img_* folders are read from P1_Data, and that they must also be in title case. I have now read your code more carefully.

I will run a nano model for 10 epochs to check that everything is OK. Then I will run a nano and a small model for 300 epochs.

cd ~/datasets/P1_Data_sampled/train/labels

head -n1 \
Araneae_Agelenidae_Agelena_orientalis_2429400616_2340718.txt \
Coleoptera_Aphodiidae_Aphodius_coniugatus_2529406019_676427.txt \
Diptera_Anthomyiidae_Alliopsis_silvestris_11215593.txt \
Hemiptera_Acanaloniidae_Acanalonia_conica_2447835932_1506480.txt \
Hymenoptera_Formicidae_Amblyopone_denticulata_891778942_2769093.txt \
Hymenoptera_Andrenidae_Andrena_aegyptiaca_2999189599_1119060.txt \
Lepidoptera_Adelidae_Adela_albicinctella_3067600169_2738625.txt \
Orthoptera_Trigonidiidae_Trigonidium_cicindeloides_3355089148_89705.txt
==> Araneae_Agelenidae_Agelena_orientalis_2429400616_2340718.txt <==
0 0.514227642276423 0.455808080808081 0.585365853658537 0.611111111111111

==> Coleoptera_Aphodiidae_Aphodius_coniugatus_2529406019_676427.txt <==
1 0.464599609375 0.509530791788856 0.21728515625 0.306451612903226

==> Diptera_Anthomyiidae_Alliopsis_silvestris_11215593.txt <==
2 0.492378048780488 0.554878048780488 0.963414634146341 0.780487804878049

==> Hemiptera_Acanaloniidae_Acanalonia_conica_2447835932_1506480.txt <==
3 0.486156351791531 0.475 0.327361563517915 0.174390243902439

==> Hymenoptera_Formicidae_Amblyopone_denticulata_891778942_2769093.txt <==
4 0.524544953116382 0.52972972972973 0.568119139547711 0.477220077220077

==> Hymenoptera_Andrenidae_Andrena_aegyptiaca_2999189599_1119060.txt <==
5 0.486572265625 0.490842490842491 0.17333984375 0.186080586080586

==> Lepidoptera_Adelidae_Adela_albicinctella_3067600169_2738625.txt <==
6 0.513493253373313 0.4255 0.949025487256372 0.603

==> Orthoptera_Trigonidiidae_Trigonidium_cicindeloides_3355089148_89705.txt <==
7 0.463533225283631 0.508508914100486 0.567260940032415 0.598865478119935

valentinitnelav commented 2 years ago

The 10-epoch test run looks like this so far. I think we are on the right track now. I have started 300-epoch runs with the nano and small weights on the Clara cluster. We should have the first results tomorrow.

confusion_matrix.png

labels.jpg

results.png

train_batch0.jpg

val_batch0_labels.jpg

val_batch0_pred.jpg

valentinitnelav commented 2 years ago

Fixed with commit a864bae7255694325b5266bfe461f2b9ea5ae8da