Closed santoshmedisetty closed 2 years ago
Hi @santoshmedisetty, I wasn't able to reproduce the error above. Are you training using the YOLOv5 repo from Ultralytics?
I've added commands to train a YOLOv5-Nano in my Colab notebook. Check it out to see if it works for your dataset.
Hi @dnth, No, I'm using the Yolov5 repo from your repository. I got the error with 'recipes/yolov5.transfer_learn_pruned_quantized.md' recipe. When I changed the recipe to 'recipes/yolov5.transfer_learn_pruned.md', I did not get any error.
My training command looks almost the same as yours. Did I miss anything?
That's strange. Have you tried with the Colab notebook? Does it give the same error?
I did not get the error with the Colab notebook, so it might be caused by one of the installed packages. I'll check.
Keep me updated here :)
Hi @dnth, There seemed to be some issue with some packages. When I reinstalled all the requirements, it worked fine. Thank you
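For anyone comparing a failing local environment against the working Colab one, a minimal sketch like the following can list the installed versions of the relevant packages. The package names passed in (`torch`, `torchvision`, `sparseml`) are assumptions about what this setup depends on, not a list taken from the repo's requirements file, and `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(names):
    """Return {distribution name: installed version, or None if not installed}."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Assumed package names; adjust to match the repo's requirements file.
    for name, version in installed_versions(["torch", "torchvision", "sparseml"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Running it in both environments and comparing the output should show which package differs before reinstalling everything.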
Hi, I trained a YOLOv5-Nano model with the pruned and quantized recipe on my custom data. As soon as the last epoch completes, I get the error below. Could this be related to a package installation? I did not get any error with the 'yolov5.transfer_learn_pruned.md' recipe.
I'm using PyTorch 1.9.0.
Below is my training command:
python3 train.py --cfg ./models_v5.0/yolov5n.yaml --data ../aris_and_video_data3/data.yaml --hyp data/hyps/hyp.scratch.yaml --weights yolov5n.pt --img 640 --batch-size 16 --optimizer SGD --recipe ../recipes/yolov5.transfer_learn_pruned_quantized.md --project yolov5-deepsparse --name yolov5n-sgd-pruned-quantized3 --device 0
Below is the error message after the last epoch.
Traceback (most recent call last):
  File "train.py", line 745, in <module>
    main(opt)
  File "train.py", line 641, in main
    train(opt.hyp, opt, device, callbacks)
  File "train.py", line 514, in train
    model=load_checkpoint(type_='ensemble', weights=best, device=device)[0],
  File "/home/santosh/deepsparse_fishcount/fish-video-count-pipeline-PROD/yolov5_deepsparse_blogpost/yolov5_train/export.py", line 529, in load_checkpoint
    state_dict = load_state_dict(model, state_dict, run_mode=not ensemble_type, exclude_anchors=exclude_anchors)
  File "/home/santosh/deepsparse_fishcount/fish-video-count-pipeline-PROD/yolov5_deepsparse_blogpost/yolov5_train/export.py", line 553, in load_state_dict
    model.load_state_dict(state_dict, strict=not run_mode)  # load
  File "/home/santosh/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1406, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Model:
Missing key(s) in state_dict: "model.0.conv.quant.activation_post_process.scale", "model.0.conv.quant.activation_post_process.zero_point", "model.0.conv.quant.activation_post_process.fake_quant_enabled", "model.0.conv.quant.activation_post_process.observer_enabled", "model.0.conv.quant.activation_post_process.scale", "model.0.conv.quant.activation_post_process.zero_point", "model.0.conv.quant.activation_post_process.activation_post_process.min_val", "model.0.conv.quant.activation_post_process.activation_post_process.max_val", "model.0.conv.module.weight", "model.0.conv.module.bias", "model.0.conv.module.weight_fake_quant.scale", "model.0.conv.module.weight_fake_quant.zero_point", "model.0.conv.module.weight_fake_quant.fake_quant_enabled", "model.0.conv.module.weight_fake_quant.observer_enabled", "model.0.conv.module.weight_fake_quant.scale", "model.0.conv.module.weight_fake_quant.zero_point", "model.0.conv.module.weight_fake_quant.activation_post_process.min_val", "model.0.conv.module.weight_fake_quant.activation_post_process.max_val", "model.0.conv.module.activation_post_process.scale", "model.0.conv.module.activation_post_process.zero_point", "model.0.conv.module.activation_post_process.fake_quant_enabled", .....
I was able to paste only a portion of the error.
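Since the pasted error is truncated, a small helper can show the full mismatch between the keys the model expects and the keys the checkpoint contains. This is only a sketch: `diff_state_dict_keys` is a hypothetical helper, not part of the repo or of PyTorch, and the checkpoint key in the example (`model.0.conv.weight`) is an assumed illustration of what a non-quantized checkpoint might hold.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, in the sense used by load_state_dict.

    missing:    keys the model expects but the checkpoint does not provide
    unexpected: keys the checkpoint provides but the model does not expect
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Illustration with key names from the error above; the checkpoint key is assumed.
expected = ["model.0.conv.module.weight", "model.0.conv.module.bias"]
provided = ["model.0.conv.weight"]
missing, unexpected = diff_state_dict_keys(expected, provided)
print("missing:", missing)
print("unexpected:", unexpected)
```

With a real model and checkpoint, the inputs would be `model.state_dict().keys()` and the loaded checkpoint's state dict keys. If the missing keys are all `quant` or `fake_quant` entries while the checkpoint holds plain `conv` weights, that suggests the model was quantization-wrapped but the checkpoint was not, or the other way around.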