cbrxyz opened this issue 2 months ago
I think a good plan for getting training data for this could include:
At that point, it can be added to our Label Studio!
So far, I have used yt-dlp and ffmpeg to extract JPEGs of frames showing the light tower from the 2018 and 2022 RobotX finals competition videos on YouTube. I created a Label Studio project with these images (http://10.245.80.197:8431/projects/16) and am labeling them as containing a red, blue, green, or black light tower.
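For reference, here is a minimal sketch of that extraction step (the video URL and output paths are placeholders, not the actual ones I used):

```python
# Download a competition video with yt-dlp, then sample one frame per second
# with ffmpeg. Requires yt-dlp and ffmpeg on PATH.
import subprocess
from pathlib import Path

VIDEO_URL = "https://www.youtube.com/watch?v=EXAMPLE"  # placeholder URL
VIDEO_FILE = "robotx_finals.mp4"
FRAME_DIR = Path("frames")
FRAME_DIR.mkdir(exist_ok=True)

# Grab an mp4 copy of the video.
subprocess.run(["yt-dlp", "-f", "mp4", "-o", VIDEO_FILE, VIDEO_URL], check=True)

# Extract one JPEG per second of video.
subprocess.run(
    ["ffmpeg", "-i", VIDEO_FILE, "-vf", "fps=1", str(FRAME_DIR / "frame_%05d.jpg")],
    check=True,
)
```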
@alexoj46
Here are the results of your trained model!
It's not bad, but it could be a little better (it would be good to see mAP@0.5 above 0.75)! I think a good next step would be changing some of the training hyperparameters to try to encourage better learning (one way to do that is sketched below). If that still doesn't work, we can then get more data, balance the class distribution, etc. Adjusting the parameters should be an easy first modification for trying to improve the model's performance.
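One low-effort way to experiment is to copy yolov7's hyperparameter file, tweak a few values, and pass the copy to `train.py` with `--hyp`. A minimal sketch, assuming the standard yolov7 repo layout; the specific values below are only starting guesses, not recommendations:

```python
# Load the stock tiny-model hyperparameters, adjust a few, and save a new file.
import yaml

with open("data/hyp.scratch.tiny.yaml") as f:
    hyp = yaml.safe_load(f)

hyp["lr0"] = 0.005   # lower initial learning rate
hyp["lrf"] = 0.05    # gentler final LR fraction
hyp["mosaic"] = 1.0  # keep mosaic augmentation on (helps small datasets)
hyp["mixup"] = 0.1   # add a little mixup
hyp["fliplr"] = 0.5  # horizontal flips should be safe for the tower

with open("data/hyp.tower.yaml", "w") as f:
    yaml.safe_dump(hyp, f)

# Then train with something like (flags per the public yolov7 README):
#   python train.py --data data/tower.yaml --cfg cfg/training/yolov7-tiny.yaml \
#       --weights yolov7-tiny.pt --hyp data/hyp.tower.yaml --epochs 300 --img 640 640
```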
Can you send me the training command you used over Discord?
@alexoj46
It looks like another reason the model might have underperformed is unbalanced data. The class distribution is the following (found by searching for "Annotation results contain 'your_color Light Tower'" in Label Studio):
If you're having trouble finding more data, just let us know and we can try to help! Mechanical has the structure of the real-life light tower ready, and we're now working with electrical to develop the color-changing panel of the tower itself. Hopefully it will be done by the middle of next week at the latest!
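As a side note, an alternative to the Label Studio search for getting those per-class counts is to tally class indices in a YOLO-format export of the project. A minimal sketch; the class names and their order below are assumptions, not the real export order:

```python
# Count how many annotations exist per class in YOLO-format label files,
# where the first token of each line is the class index.
from collections import Counter
from pathlib import Path

NAMES = ["red", "green", "blue", "black"]  # hypothetical class order
counts = Counter()

for label_file in Path("labels").glob("*.txt"):
    for line in label_file.read_text().splitlines():
        if line.strip():
            counts[NAMES[int(line.split()[0])]] += 1

for name in NAMES:
    print(f"{name} Light Tower: {counts[name]}")
```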
Blocked by uf-mil-electrical/NaviGator#1 and uf-mil-mechanical/tasks#11
This week, I finished labeling images in Label Studio and used YOLOv7-tiny to train a model on this data. This required first modifying a script provided by Daniel to split images into training, testing, and validation folders, then modifying and running the relevant training commands and scripts in the yolov7 directory. Because the class distribution was not balanced (for example, only 8 of 250 images show a green tower), the results were not as good as I had hoped (see results above).

While I added a few more images from YouTube videos I could find, there are not enough clips available to balance the data, so I am now waiting for the mechanical light tower to be constructed so I can gather more images. In the meantime, I installed and set up Ubuntu through UTM so that I can access the simulation for testing later on, and I will continue researching the next steps for testing and validating a trained CV model in preparation for when we have more data.
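For anyone reproducing the split step, here is a minimal sketch of the general idea (this is not the script mentioned above, and the source folder layout is an assumption): shuffle image/label pairs and copy them into YOLO-style `images/` and `labels/` folders for each split.

```python
# Split a flat dataset of images + YOLO label files into train/val/test folders.
import random
import shutil
from pathlib import Path

SRC_IMAGES = Path("dataset/images")   # assumed source layout
SRC_LABELS = Path("dataset/labels")
SPLITS = {"train": 0.8, "val": 0.1, "test": 0.1}

images = sorted(SRC_IMAGES.glob("*.jpg"))
random.seed(0)        # deterministic split
random.shuffle(images)

start = 0
for split, frac in SPLITS.items():
    end = len(images) if split == "test" else start + int(len(images) * frac)
    for img in images[start:end]:
        label = SRC_LABELS / (img.stem + ".txt")
        (Path(split) / "images").mkdir(parents=True, exist_ok=True)
        (Path(split) / "labels").mkdir(parents=True, exist_ok=True)
        shutil.copy(img, Path(split) / "images" / img.name)
        if label.exists():
            shutil.copy(label, Path(split) / "labels" / label.name)
    start = end
```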
This week, while waiting for the mechanical tower so I can gather more image data, I set up Ubuntu Desktop and added the GitHub repository and the relevant developer tools, including pre-commit and Neovim. I was able to successfully access the simulator this way. I attended today's testing session, where I connected to and moved the boat through SSH and tmux for the first time on my laptop, and viewed the rviz visualizers through Ubuntu. I also learned how to collect bagged data from the boat's camera, in preparation for gathering bags of the LED tower once it is constructed.
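Once we have bags of the real tower, frames can be pulled out for labeling with the ROS 1 `rosbag` API and `cv_bridge`. A minimal sketch, assuming a hypothetical camera topic name (the actual NaviGator topic may differ):

```python
# Extract roughly one frame per second from a bagged camera topic and save JPEGs.
from pathlib import Path

import cv2
import rosbag
from cv_bridge import CvBridge

BAG_PATH = "tower_run.bag"               # placeholder bag file
IMAGE_TOPIC = "/camera/front/image_raw"  # hypothetical topic name
OUT_DIR = Path("bag_frames")
OUT_DIR.mkdir(exist_ok=True)
bridge = CvBridge()

with rosbag.Bag(BAG_PATH) as bag:
    for i, (topic, msg, t) in enumerate(bag.read_messages(topics=[IMAGE_TOPIC])):
        if i % 30 == 0:  # assumes ~30 FPS; keep about one frame per second
            frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
            cv2.imwrite(str(OUT_DIR / f"frame_{i:06d}.jpg"), frame)
```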
@alexoj46 let's try to get some more data for this model tomorrow! We can get some data from the boat or from the shore!
What needs to change?
We will need to create a computer vision model for detecting the different phases of the light tower object. Previously, we attempted this with a classical approach that detected the color of the tower, but we had variable success with that approach.
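For context, here is a minimal sketch of the kind of classical color-detection approach described above: threshold the frame in HSV and pick the color whose mask covers the most pixels. The HSV ranges and pixel threshold are illustrative guesses, not values from the previous attempt.

```python
import cv2
import numpy as np

# Rough HSV ranges per color (red also wraps near H=180 in OpenCV; ignored
# here for brevity).
HSV_RANGES = {
    "red": ((0, 120, 80), (10, 255, 255)),
    "green": ((45, 80, 80), (85, 255, 255)),
    "blue": ((100, 120, 80), (130, 255, 255)),
}

def classify_tower_color(bgr_frame: np.ndarray) -> str:
    """Return the dominant tower color, or "black" if no color is prominent."""
    hsv = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2HSV)
    counts = {
        name: int(cv2.inRange(hsv, np.array(lo), np.array(hi)).sum() // 255)
        for name, (lo, hi) in HSV_RANGES.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 500 else "black"
```

Sensitivity to lighting and glare on the water is a typical failure mode of this kind of thresholding, which matches the variable success noted above and motivates training a learned detector instead.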
How would this task be tested?