Can you demo how to use the yolov5 repo to train a model?

KingWu commented 2 years ago

I pulled the yolov5 repo and downloaded the COCO dataset, then ran the following command. The recipe was downloaded from here.

python train.py --cfg yolov5s.yaml --weights yolov5s.pt --data coco.yaml --hyp data/hyps/hyp.scratch.yaml --recipe recipes/yolov5s.pruned_quantized.md

But it throws the following error:

AutoAnchor: 4.45 anchors/target, 0.995 Best Possible Recall (BPR). Current anchors are a good fit to dataset ✅
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to yolov5_runs/train/exp2
Starting training for 300 epochs...
Traceback (most recent call last):
  File "train.py", line 761, in <module>
    main(opt)
  File "train.py", line 657, in main
    train(opt.hyp, opt, device, callbacks)
  File "train.py", line 346, in train
    grid_size=gs
  File "/Users/ycy101/King/PlakerLab/project/yolov5-master-neuralmagic/utils/sparse.py", line 134, in initialize
    self.manager.initialize(self.model, start_epoch, grad_sampler=grad_sampler)
  File "/Users/ycy101/.pyenv/versions/3.7.11/lib/python3.7/site-packages/sparseml/pytorch/optim/manager.py", line 466, in initialize
    self.iter_modifiers(), module, epoch, loggers, **kwargs
  File "/Users/ycy101/.pyenv/versions/3.7.11/lib/python3.7/site-packages/sparseml/pytorch/optim/manager.py", line 691, in _initialize_modifiers
    mod.initialize(module, epoch, loggers, **kwargs)
  File "/Users/ycy101/.pyenv/versions/3.7.11/lib/python3.7/site-packages/sparseml/pytorch/sparsification/pruning/modifier_pruning_base.py", line 294, in initialize
    named_layers_and_params = self._create_named_layers_and_params(module)
  File "/Users/ycy101/.pyenv/versions/3.7.11/lib/python3.7/site-packages/sparseml/pytorch/sparsification/pruning/modifier_pruning_base.py", line 774, in _create_named_layers_and_params
    return super()._create_named_layers_and_params(module)
  File "/Users/ycy101/.pyenv/versions/3.7.11/lib/python3.7/site-packages/sparseml/pytorch/sparsification/pruning/modifier_pruning_base.py", line 532, in _create_named_layers_and_params
    params_strict=True,
  File "/Users/ycy101/.pyenv/versions/3.7.11/lib/python3.7/site-packages/sparseml/pytorch/utils/helpers.py", line 902, in get_named_layers_and_params_by_regex
    validate_all_params_found(param_names, found_param_names)
  File "/Users/ycy101/.pyenv/versions/3.7.11/lib/python3.7/site-packages/sparseml/pytorch/utils/helpers.py", line 950, in validate_all_params_found
    name_or_regex, found_param_names, name_or_regex_patterns
RuntimeError: All supplied parameter names or regex patterns not found.No match for model.9.m.0.cv2.conv.weight in found parameters ['model.7.conv.weight', 'model.8.cv2.conv.weight', 'model.13.m.0.cv2.conv.weight', 'model.17.cv2.conv.weight', 'model.18.conv.weight', 'model.20.cv2.conv.weight', 'model.20.cv3.conv.weight', 'model.20.m.0.cv2.conv.weight', 'model.21.conv.weight', 'model.23.cv2.conv.weight', 'model.23.cv3.conv.weight', 'model.23.m.0.cv2.conv.weight']. 
Supplied ['model.23.m.0.cv2.conv.weight', 'model.21.conv.weight', 'model.23.cv3.conv.weight', 'model.23.cv2.conv.weight', 'model.20.m.0.cv2.conv.weight', 'model.18.conv.weight', 'model.9.m.0.cv2.conv.weight', 'model.7.conv.weight', 'model.20.cv3.conv.weight', 'model.20.cv2.conv.weight', 'model.8.cv2.conv.weight', 'model.13.m.0.cv2.conv.weight', 'model.17.cv2.conv.weight']
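
The error message already contains enough to locate the mismatch: comparing the supplied names against the found ones leaves exactly one name unmatched. Below is a minimal sketch of that comparison. `missing_params` is a hypothetical helper, not part of SparseML, and the two lists are copied from the message above.

```python
def missing_params(supplied, available):
    """Return the supplied names that are absent from `available`, in order."""
    available_set = set(available)
    return [name for name in supplied if name not in available_set]


# Names the recipe asked for ("Supplied" in the error message).
supplied = [
    "model.23.m.0.cv2.conv.weight", "model.21.conv.weight",
    "model.23.cv3.conv.weight", "model.23.cv2.conv.weight",
    "model.20.m.0.cv2.conv.weight", "model.18.conv.weight",
    "model.9.m.0.cv2.conv.weight", "model.7.conv.weight",
    "model.20.cv3.conv.weight", "model.20.cv2.conv.weight",
    "model.8.cv2.conv.weight", "model.13.m.0.cv2.conv.weight",
    "model.17.cv2.conv.weight",
]

# Names that were matched in the model ("found parameters" in the message).
# With a live PyTorch model, the full list of candidates could instead be
# built from [name for name, _ in model.named_parameters()].
found = [
    "model.7.conv.weight", "model.8.cv2.conv.weight",
    "model.13.m.0.cv2.conv.weight", "model.17.cv2.conv.weight",
    "model.18.conv.weight", "model.20.cv2.conv.weight",
    "model.20.cv3.conv.weight", "model.20.m.0.cv2.conv.weight",
    "model.21.conv.weight", "model.23.cv2.conv.weight",
    "model.23.cv3.conv.weight", "model.23.m.0.cv2.conv.weight",
]

missing = missing_params(supplied, found)
print(missing)  # ['model.9.m.0.cv2.conv.weight']
```

A single unmatched layer like this may mean the recipe targets a different model layout than the one built from `yolov5s.yaml`. That is an assumption; the traceback itself does not confirm it.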

dnth commented 2 years ago

Have you tried this Colab notebook? It's by far the easiest way to train.

https://colab.research.google.com/github/dnth/yolov5-deepsparse-blogpost/blob/master/notebooks/deepsparse_blogpost.ipynb

KingWu commented 2 years ago

@dnth Yes, I tried the Colab notebook. I just want to export to another format. I opened the notebook and tried to run the following command:

!cd yolov5-deepsparse-blogpost/yolov5-train/ && python export.py --weights zoo:cv/detection/yolov5-l/pytorch/ultralytics/coco/pruned_quant-aggressive_95 --include torchscript --img 640 --optimize

It fails with the following error:

Traceback (most recent call last):
  File "export.py", line 715, in <module>
    main(opt)
  File "export.py", line 704, in main
    run(**vars(opt))
  File "/usr/local/lib/python3.7/dist-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "export.py", line 593, in run
    model, extras = load_checkpoint(type_='ensemble', weights=weights, device=device)  # load FP32 model
  File "export.py", line 529, in load_checkpoint
    state_dict = load_state_dict(model, state_dict, run_mode=not ensemble_type, exclude_anchors=exclude_anchors)
  File "export.py", line 553, in load_state_dict
    model.load_state_dict(state_dict, strict=not run_mode)  # load
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1407, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Model:
    Unexpected key(s) in state_dict: "model.24.anchor_grid".

Any idea?

dnth commented 2 years ago

Ahh, I see. I think the export to TorchScript is broken at the moment. You can ask in the Neural Magic Slack group; I think there's a better chance of getting answers there. If you'd like to use TorchScript, I suggest you use the original yolov5 repo and not the forked version by Neural Magic.
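
For anyone who wants to inspect the mismatch before asking elsewhere: the `Unexpected key(s) in state_dict` error in the traceback above is what PyTorch raises when a checkpoint holds a key the model does not declare and loading is strict. The sketch below reproduces that key comparison in plain Python. Both helpers are hypothetical, and apart from `model.24.anchor_grid`, which comes from the traceback, the key names are illustrative only.

```python
def diff_state_dict_keys(checkpoint_keys, model_keys):
    """Return (missing, unexpected) key lists, as a strict load would see them."""
    checkpoint_keys = set(checkpoint_keys)
    model_keys = set(model_keys)
    missing = sorted(model_keys - checkpoint_keys)      # model wants, checkpoint lacks
    unexpected = sorted(checkpoint_keys - model_keys)   # checkpoint has, model does not
    return missing, unexpected


def drop_unexpected(state_dict, model_keys):
    """Return a copy of state_dict restricted to keys the model declares."""
    allowed = set(model_keys)
    return {key: value for key, value in state_dict.items() if key in allowed}


# Illustrative keys; with a real model these would come from
# model.state_dict().keys() and the loaded checkpoint.
model_keys = ["model.24.anchors", "model.24.m.0.weight"]
checkpoint = {
    "model.24.anchors": 0,
    "model.24.m.0.weight": 1,
    "model.24.anchor_grid": 2,  # the key named in the traceback
}

missing, unexpected = diff_state_dict_keys(checkpoint, model_keys)
filtered = drop_unexpected(checkpoint, model_keys)
print(missing, unexpected, sorted(filtered))
```

PyTorch's `load_state_dict` also accepts `strict=False`, which skips this check and returns the missing and unexpected keys instead of raising. Whether dropping the extra key yields a correct TorchScript export here is not something this thread confirms.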

KingWu commented 2 years ago

@dnth Where can I find the Slack group?

dnth commented 2 years ago

@KingWu Here: https://join.slack.com/t/discuss-neuralmagic/shared_invite/zt-q1a1cnvo-YBoICSIw3L1dmQpjBeDurQ
