ocean-data-factory-sweden / kso

Notebooks to upload/download marine footage, connect to a citizen science project, train machine learning models and publish marine biological observations.
GNU General Public License v3.0

Notebook 5 / training models gives error: run() got unexpected keyword argument batch_size #187

Closed: Diewertje11 closed this issue 1 year ago

Diewertje11 commented 1 year ago

This issue has already been solved. This is only a description of the error for documentation purposes, since the error message is not very clear and its cause was hard to find.

Description of the error

When training a model in Notebook 5 with

    mlp.train_yolov5(
        exp_name.value,
        weights.artifact_path,
        epochs=epochs.value,
        batch_size=batch_size.value,
        img_size=(img_h.value, img_w.value),
    )

which indirectly calls

    yolov5.train()

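you get the error shown in the image below, which is the one quoted in the issue title: run() got unexpected keyword argument batch_size.

[Image: screenshot of the error]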

The error complains about the batch_size argument, even though the code printed with it shows that the batch_size argument exists. The argument also exists if you look in the val.py file of the yolov5 repository.

Traceback of the origin of the error and solution

This error has occurred since commit 7c0d287, in which project.py was created. It only occurs when yolov5.train(epochs=1) (or our code) is run after the MLProjectProcessor class has been created. Without this class, the code runs properly. In this class, t6_utils gets imported, which in turn does: import yolov5_tracker.track as track. That code appends some paths to sys.path. As a result, when validation.run() is called in yolov5.train(), it does not run the val.py from yolov5 but the val.py from the tracker, whose run() does not accept a batch_size argument.
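The shadowing described above can be reproduced in isolation. The following is a minimal sketch, not the actual yolov5 or tracker code: the two temporary directories, their names and the two run() signatures are invented stand-ins. It also assumes that both directories are appended to sys.path and that the tracker's directory is appended first, so that a bare import of val resolves to the tracker's copy.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_module_dir(root: Path, name: str, source: str) -> Path:
    """Create root/name/val.py containing the given source (hypothetical helper)."""
    directory = root / name
    directory.mkdir()
    (directory / "val.py").write_text(source)
    return directory


def demo_shadowing() -> str:
    """Return the error message raised when the wrong val.py is picked up."""
    saved_path = list(sys.path)
    saved_module = sys.modules.pop("val", None)
    try:
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            # Stand-in for the tracker's val.py: run() has no batch_size parameter.
            tracker_dir = make_module_dir(
                root, "tracker", "def run(data):\n    return 'tracker val'\n"
            )
            # Stand-in for yolov5's val.py: run() accepts batch_size.
            yolo_dir = make_module_dir(
                root, "yolo", "def run(data, batch_size=32):\n    return 'yolo val'\n"
            )
            # Assumed order: the tracker path is added first, the yolov5 path later.
            sys.path.append(str(tracker_dir))
            sys.path.append(str(yolo_dir))
            importlib.invalidate_caches()
            # sys.path is searched in order, so this finds the tracker's val.py.
            val = importlib.import_module("val")
            try:
                val.run("data.yaml", batch_size=16)
            except TypeError as err:
                return str(err)
            return "no error"
    finally:
        # Restore the interpreter state so the demo leaves no trace.
        sys.path[:] = saved_path
        sys.modules.pop("val", None)
        if saved_module is not None:
            sys.modules["val"] = saved_module


if __name__ == "__main__":
    print(demo_shadowing())
```

Because the first matching val.py on sys.path wins, the call fails with the same kind of TypeError even though the intended val.py does accept batch_size.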

This val.py from the tracker is never imported anywhere in the repository, so it is unused. The solution is therefore to delete the val.py in the tracker repository. This makes it possible to run Notebook 5 / train models without errors.
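To check which file a bare module name such as val resolves to in a given session, something like the following could be run before training. It is only a diagnostic sketch: resolve_module_file is a hypothetical helper name, not part of kso or yolov5, and it relies solely on the standard library's importlib.util.find_spec.

```python
import importlib.util
from typing import Optional


def resolve_module_file(name: str) -> Optional[str]:
    """Return the file a top-level module name would be loaded from, or None.

    For modules that are already imported, find_spec reports the spec stored
    in sys.modules, so this reflects what the running session actually uses.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # If this prints a path inside the tracker instead of yolov5,
    # the wrong val.py is shadowing the intended one.
    print(resolve_module_file("val"))
```

If the printed path points into the tracker directory, the shadowing is still present in that session.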

Remaining issue

One remaining issue is that you cannot train two models in the same notebook session. For now, this will be mentioned in the notebook and people are instructed to simply restart the notebook.

Diewertje11 commented 1 year ago

This solution is implemented in commit 800f35b.