ocean-data-factory-sweden / kso

Notebooks to upload/download marine footage, connect to a citizen science project, train machine learning models and publish marine biological observations.
GNU General Public License v3.0

Set up Tutorial 9 (Run ML on new footage) #177

Closed jannesgg closed 7 months ago

jannesgg commented 1 year ago

Add workflow that runs the model over a selection of footage, and finally aggregates this by site and returns the maximum count for a given species within the given movies.

We should check that it works for the template project as well as for active projects (e.g. Spyfish)
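A minimal sketch of the aggregation step described above, assuming the model output has already been reduced to one record per detection. The record layout `(site, movie, frame, species)`, the sample data and the function name `max_count_per_site` are illustrative assumptions, not part of kso-utils.

```python
from collections import Counter, defaultdict

# Hypothetical per-detection records: (site, movie, frame, species).
detections = [
    ("site_A", "movie_1.mp4", 0, "snapper"),
    ("site_A", "movie_1.mp4", 0, "snapper"),
    ("site_A", "movie_1.mp4", 1, "snapper"),
    ("site_A", "movie_2.mp4", 0, "snapper"),
    ("site_A", "movie_2.mp4", 0, "snapper"),
    ("site_A", "movie_2.mp4", 0, "snapper"),
    ("site_A", "movie_2.mp4", 0, "blue_cod"),
    ("site_B", "movie_3.mp4", 5, "blue_cod"),
]


def max_count_per_site(rows):
    """Return {site: {species: largest number of detections in any single frame}}."""
    # Identical tuples are detections of the same species in the same frame,
    # so counting them gives the per-frame count for each species.
    per_frame = Counter(rows)
    result = defaultdict(dict)
    for (site, _movie, _frame, species), n in per_frame.items():
        # Keep the largest per-frame count over all movies recorded at the site.
        result[site][species] = max(n, result[site].get(species, 0))
    return dict(result)


site_max = max_count_per_site(detections)
```

With tabular output the same idea would probably be expressed as a group-by on site and species followed by a maximum; the plain-Python version only shows the logic.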

victor-wildlife commented 1 year ago

@pilarnavarro I am testing tutorial#10 on my end and got stuck on the section to select the model. I tried with Spyfish and the template project, but they are not found.

Image

Also, I got an issue with the itables package. I am afraid at the moment we don't have this package in the requirements. @jannesgg do you remember the reason for it? Can we add it to the list of requirements?

pilarnavarro commented 1 year ago

Hello @victor-wildlife. I am afraid it is an old version of tutorial 10. More precisely, it was just a copy of an old version of tutorial 6. Therefore, with the new changes to the kso-utils and tutorials, it was expected that this version was not going to work. I am sorry I didn't close this PR before.

victor-wildlife commented 1 year ago

Hi @pilarnavarro, thanks for the update. That's all good. Just create a new PR when you have the tutorial updated.

victor-wildlife commented 1 year ago

@pilarnavarro - We are dropping tut#9 (as it has been merged into tut#8), so this tutorial to "run machine learning models on footage" should be the new tut#9 (instead of tut#10). Just rename it to #9 when you are ready to PR.

pilarnavarro commented 1 year ago

@jannesgg While running tutorial 6 locally with the Spyfish Aotearoa project and the model in wandb trained using this data, I get the following error when calling the detect function of yolo:

Image

Steps to reproduce the error:

  1. In tutorial 6, choose the Spyfish Aotearoa project and initiate the database with the data on the AWS server:

Image

  2. Choose the model from wandb trained with the images from Spyfish Aotearoa and download it locally:

Image

  3. Choose some films for inference using the recently downloaded model, the folder to save the runs of wandb, the confidence threshold, and a name for the experiment:

Image

  4. Run the detect function of YOLOv5, and there you have the error:

Image

jannesgg commented 12 months ago

Hi @pilarnavarro. I have looked at this and tested it on Colab (since I do not have a GPU locally), but I am not able to reproduce the error above (see my screenshot attached). Perhaps there is some issue with the support for your GPU, but this is difficult for me to assess based on the error message alone. Another check could be to see that the txt file you create in step 3 (source_value) is not empty but that it contains the actual paths to the video streams. Also ensure that you have installed the requirements as detailed in yolov5/requirements.txt.

Screenshot 2023-08-30 at 14 02 47

What I would suggest is making sure that you have rebased your branch to the latest dev branch (which has quite a few changes) and run this again. If the error is still not resolved then, I would recommend trying to run your experiment on Colab for now. @Diewertje11, since you have a GPU on your computer, it might also be worth a shot for you to see if you can reproduce this issue locally when you have some time.
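The check on the source file suggested above (that the txt file from step 3 is not empty and lists real video paths) could be sketched as follows. This assumes the file holds one source per line; the function name `check_source_file` is hypothetical and not part of kso-utils or YOLOv5.

```python
from pathlib import Path


def check_source_file(source_path):
    """Return (valid_entries, problems) for a text file listing one video source per line."""
    path = Path(source_path)
    if not path.is_file():
        return [], [f"{source_path} does not exist"]

    # Ignore blank lines and surrounding whitespace.
    entries = [line.strip() for line in path.read_text().splitlines() if line.strip()]
    if not entries:
        return [], [f"{source_path} is empty"]

    valid, problems = [], []
    for entry in entries:
        # Remote streams (http, https, rtsp, ...) cannot be checked on disk,
        # so they are accepted as-is; local paths must point to an existing file.
        if "://" in entry or Path(entry).is_file():
            valid.append(entry)
        else:
            problems.append(f"missing file: {entry}")
    return valid, problems
```

Running this before calling the detect function would at least rule out an empty or stale source list as the cause of the error.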

pilarnavarro commented 12 months ago

Okay, thank you @jannesgg. However, my branch was already up to date when I ran the tutorial, so I don't think it was the problem. Please @Diewertje11 let me know if it works for you. I will work in google colab for now then.

pilarnavarro commented 12 months ago

I tried to run the tutorial using a different web browser, and it magically worked without any issues. I have no idea what I did differently this time, but it works correctly now. However, in the next step, the following error appears:

Image

Do you have any ideas about what could be causing this error?

victor-wildlife commented 11 months ago

Hello @pilarnavarro, did you have any luck overcoming the error? Are you working on a local or remote GitHub branch? I tried looking for it in the kso_utils repo but couldn't find it.

pilarnavarro commented 11 months ago

Hello @victor-wildlife

I haven't been able to solve the error above, but I managed to implement functions for computing and saving statistics, even with that error present. You can review the changes I made in the new tutorial_09 branch. I've tested everything, and it seems to be working well (except for the error above). Please take a look and let me know if it works for you or if you have any suggestions for improvement.

To complete this task entirely, I just need @jannesgg to provide me with instructions on how to access the mappings between class IDs and species names.

victor-wildlife commented 11 months ago

@pilarnavarro Thanks for the update.

I have formatted tut#9 in your branch to match the format we use in the other tutorials. Please let me know if there are any issues.

For mapping the species ids and names maybe you can use a similar approach to the one used in the "check_frames_uploaded" function in zooniverse_utils (screenshot attached).

Image

pilarnavarro commented 11 months ago

Thank you very much @victor-wildlife for formatting the tutorial correctly. I am sorry I didn't notice the changes in the tutorial format.

Regarding the approach used in that function to map the species IDs to their corresponding names, it doesn't seem to apply to our situation. The IDs in the database aren't the same as the IDs assigned by YOLO to each class. Below, you can find a snippet of code where I tried to follow that approach, along with the output it generates:

Image

Image

One final question: do you think we should remove the section "Investigate training and validation datasets (only image data)" for this tutorial? Since we are working with footage, it might be more appropriate to exclude that section.

victor-wildlife commented 11 months ago

@pilarnavarro is there any way to link the labels of the annotations with the "commonName" from the database?

Yes, I agree, it will make sense to remove the investigate dataset section

pilarnavarro commented 11 months ago

That is exactly my question, hahahahah. We need to figure out how to access the mapping of common names to class IDs that YOLO uses. I think there's a file containing that information, possibly conf.yaml, but I don't know how to access those files that YOLO generates.

victor-wildlife commented 11 months ago

@jannesgg any thoughts on this?

jannesgg commented 11 months ago

@pilarnavarro @victor-wildlife There is a yaml file that is generated when a YOLO dataset is created but the name of this yaml file is usually the project name together with a timestamp.

For example:

Spyfish_Aotearoa_12:46:57.yaml looks like this for an example I generated:

names: [snapper]
nc: 1
path: /home/jupyter-admin@cloudina.org-0f8df/kso/tutorials/ml-template-data/
train: train.txt
val: valid.txt

where the position of each species in "names" is its class_id according to YOLO, i.e. snapper would have id 0.

In yolo_utils.py we use the following to map species_id to yolo_class_id:

sp_id2mod_id = {
    species_df[species_df.clean_label == species_list[i]].id.values[0]: i
    for i in range(len(species_list))
}

Since you want to get back to species_id, I would propose just inverting the dictionary and then mapping those yolo_class_id values back to species_id, e.g.:

mod_id2sp_id = {v: k for k, v in sp_id2mod_id.items()}

Does this help?
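The inversion proposed above can be sketched end to end with plain dictionaries. The species rows and IDs below are made-up stand-ins for `species_df`, and the helper `class_id_to_common_name` is hypothetical; only the `id` and `clean_label` fields and the "commonName" column are taken from this thread.

```python
# Hypothetical stand-in for the species table in the database.
species_rows = [
    {"id": 11, "clean_label": "snapper", "commonName": "Snapper"},
    {"id": 42, "clean_label": "blue_cod", "commonName": "Blue cod"},
]
# Species used for training; the position in this list is the YOLO class id.
species_list = ["blue_cod", "snapper"]

label2sp_id = {row["clean_label"]: row["id"] for row in species_rows}
sp_id2name = {row["id"]: row["commonName"] for row in species_rows}

# Same direction as in yolo_utils.py: species_id -> yolo_class_id.
sp_id2mod_id = {label2sp_id[label]: i for i, label in enumerate(species_list)}

# Inverted direction: yolo_class_id -> species_id.
mod_id2sp_id = {v: k for k, v in sp_id2mod_id.items()}


def class_id_to_common_name(class_id):
    """Translate a YOLO class id into the common name stored in the database."""
    return sp_id2name[mod_id2sp_id[class_id]]
```

The inversion is only safe because each species maps to exactly one class id; it also relies on `species_list` being in the same order that was used when the model was trained.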

victor-wildlife commented 10 months ago

@pilarnavarro Did Jannes's suggestion help?

pilarnavarro commented 10 months ago

Thank you so much @jannesgg for your help, and I am sorry for my delayed reply. I'm not entirely sure if I fully understand the idea that you explained, but first I need to understand how to access the YAML file you mentioned. I can't find it on my local computer where it is supposed to be. It's possible that I accidentally deleted it, but it's not in the trash either. I've also checked Wandb, but I cannot find it there. In any case, we need to find a general method for accessing this file for future use in the tutorial, not just for my specific model training.

I'm thinking that another (possibly simpler) approach might be to use the "conf.yaml" file in the "Files" section of Wandb of the specific run we want to use (in our case, "yolo_labels" from the "spyfish_aotearoa" project). This file seems to provide a direct mapping from class IDs to common names. However, I'm unsure of how to access that file in WandB either.

Please let me know your thoughts on this idea. Thank you!

pilarnavarro commented 10 months ago

Btw, do you think I should consider correcting the data annotations using tutorial 8? I've been reviewing the annotations and, in my opinion, they don't seem to be of high quality. I'm thinking of investing some time to improve the annotations. However, I'm unsure whether my annotations (coming from a single person) will be as robust as those gathered from various people. What are your thoughts on this?

I also took a look at the model's predictions on some footage, and they appear to be quite inaccurate. I couldn't find a single correct prediction.

victor-wildlife commented 10 months ago

Hello Pilar, at the moment I would say the priority should be on setting the pipeline up rather than the accuracy of the models. @jannesgg had a look at the W&B page with the model results and mentioned there was a "legend" of what labels 1, 2, 3 meant. @jannesgg could you please elaborate on how to access it?

jannesgg commented 10 months ago

@pilarnavarro Hi Pilar, apologies for the delay.

Using the wandb API it should be possible to extract this information from wandb in the following way:

import wandb
api = wandb.Api()
run = api.run("spyfish_aotearoa/3modzaj1")

In [9]: run.rawconfig['data_dict']
Out[9]:
{'nc': 5,
 'val': '/home/pilarnavarro/pilar/wildlife.ai/koster_data_management/data/spyfish/valid.txt',
 'path': '/home/pilarnavarro/pilar/wildlife.ai/koster_data_management/data/spyfish/',
 'names': {'0': 'bait',
  '1': 'blue_cod',
  '2': 'other',
  '3': 'scarlet_wrasse',
  '4': 'snapper'},
 'train': '/home/pilarnavarro/pilar/wildlife.ai/koster_data_management/data/spyfish/train.txt'}

pilarnavarro commented 10 months ago

Thank you @jannesgg for the information. I'll try it and let you know once I've tested it. Regarding the argument of the api.run function, I assume it is the project name and the run ID, but how can I access a particular run ID? I would like to make the functionality more generalized, rather than hardcoding it to a specific run.
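One detail worth handling when consuming that output: in the `data_dict` shown above, `names` is a dictionary with string keys, whereas in the YAML example earlier in the thread it is a plain list. A small sketch that accepts both shapes and returns integer class IDs follows; the function name `class_id_to_label` is hypothetical, and the input is assumed to be an already-loaded dictionary such as `run.rawconfig['data_dict']`.

```python
def class_id_to_label(data_dict):
    """Return {yolo_class_id (int): label} from a YOLO data dictionary."""
    names = data_dict["names"]
    if isinstance(names, dict):
        # Form seen in the wandb run config: {'0': 'bait', '1': 'blue_cod', ...}
        return {int(class_id): label for class_id, label in names.items()}
    # Form seen in the generated YAML file: names: [snapper]
    return dict(enumerate(names))
```

Converting the keys to integers matters because the class IDs in YOLO's detection output are numeric, so a lookup with string keys would miss every class.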

jannesgg commented 10 months ago

@pilarnavarro It is possible to list the runs for a specific project using the Api as well. Here is the link to the full documentation: https://docs.wandb.ai/ref/python/public-api/api. We also have a similar method in the MLProjectProcessor class in project.py called choose_model, which returns the run ids as well, I believe? Hope this helps :)

pilarnavarro commented 10 months ago

I found this new error while running the tutorial again, any ideas?

Image

Image

victor-wildlife commented 10 months ago

@pilarnavarro Yes, there were some updates in the movies.csv file and the movies couldn't be found. It should be working now

pilarnavarro commented 9 months ago

Thank you @victor-wildlife for the information. I've tried it again, and the same error persists. However, I was able to finish the implementation, since the error doesn't affect the final cells, and Tutorial 9 is ready from my end. Please review it and suggest any improvements if needed.