Closed Chiauwen closed 1 year ago
You need to export the YOLO-NAS model to ONNX; you can then measure its latency using trtexec.
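As a rough sketch of that workflow (not an official recipe): the helper names `export_to_onnx` and `build_trtexec_command` below are made up for illustration, the checkpoint and ONNX paths are placeholders, and the `model.export(...)` call assumes a super-gradients release that provides it on YOLO-NAS models. `--onnx` and `--fp16` are standard trtexec options.

```python
import subprocess


def export_to_onnx(checkpoint_path: str, num_classes: int, onnx_path: str) -> None:
    """Load a fine-tuned YOLO-NAS S checkpoint and export it to ONNX (hypothetical helper)."""
    # Imported lazily so the rest of this file works without super-gradients installed.
    from super_gradients.common.object_names import Models
    from super_gradients.training import models

    model = models.get(Models.YOLO_NAS_S, num_classes=num_classes, checkpoint_path=checkpoint_path)
    # Assumption: this super-gradients version exposes model.export().
    model.export(onnx_path)


def build_trtexec_command(onnx_path: str, fp16: bool = True) -> list:
    """Build the trtexec invocation that benchmarks the exported ONNX file."""
    command = ["trtexec", f"--onnx={onnx_path}"]
    if fp16:
        command.append("--fp16")
    return command


def run_benchmark(onnx_path: str) -> None:
    """Run trtexec; it prints latency and throughput statistics to the console."""
    subprocess.run(build_trtexec_command(onnx_path), check=True)


# Typical use (placeholder paths):
#   export_to_onnx("ckpt_best.pth", num_classes=4, onnx_path="yolo_nas_s.onnx")
#   run_benchmark("yolo_nas_s.onnx")
```

trtexec ships with TensorRT, so it has to be on the PATH of the machine where you want the latency number.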
Here are some docs you may find useful:
from super_gradients.training import Trainer
from super_gradients.common.object_names import Models
from super_gradients.training import models
from super_gradients.training.processing import ComposeProcessing
from torch.utils.data import DataLoader

trainer = Trainer(experiment_name="yolo_nas_s_soccer_players", ckpt_root_dir="/content/sg_checkpoints_dir/")
net = models.get(Models.YOLO_NAS_S, num_classes=4, pretrained_weights="coco")
trainer.train(model=net, training_params=train_params, train_loader=train_loader, valid_loader=valid_loader)
run_id=RUN_20240111_174419_954854
[2024-01-11 17:44:19] INFO - sg_trainer.py - Checkpoints directory: /content/sg_checkpoints_dir/yolo_nas_s_soccer_players/RUN_20240111_174419_954854
[2024-01-11 17:44:19] INFO - sg_trainer.py - Using EMA with params {'decay': 0.9, 'decay_type': 'threshold'}
The console stream is now moved to /content/sg_checkpoints_dir/yolo_nas_s_soccer_players/RUN_20240111_174419_954854/console_Jan11_17_44_19.txt
/usr/local/lib/python3.10/dist-packages/super_gradients/common/registry/registry.py:72: DeprecationWarning: Object name linear_epoch_step is now deprecated. Please replace it with LinearEpochLRWarmup.
  warnings.warn(f"Object name {name} is now deprecated. Please replace it with {deprecated_names[name]}.", DeprecationWarning)
RuntimeError Traceback (most recent call last)
......
/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/collate.py in default_collate(batch)
136 tensor([0, 1, 2, 3])
137 >>> # Example with a batch of `str`s:
--> 138 >>> default_collate(['a', 'b', 'c'])
139 ['a', 'b', 'c']
140 >>> # Example with `Map` inside the batch:
RuntimeError: stack expects each tensor to be equal size, but got [18, 5] at entry 0 and [19, 5] at entry 1
I am getting this error when I try to train the YOLO-NAS model on Roboflow's soccer-players-2 dataset in COCO format. I am stuck on this error. Please get back to me soon if there is a solution.
@BloodAxe
How do you create your dataloaders?
Through the roboflow website.
On Thu, Jan 11, 2024 at 12:28 PM Eugene Khvedchenya wrote:
How you create your dataloaders?
I meant the DataLoader, not the dataset. What collate function are you using? Please follow the tutorials and use the right collate and loss functions, as shown in the YOLO-NAS dataset params: https://github.com/Deci-AI/super-gradients/blob/5fe4f4ca94d5e556608b7160558cdbf18cbcc2c5/src/super_gradients/recipes/dataset_params/coco_detection_yolo_nas_dataset_params.yaml
See the collate function under train_dataloader_params/val_dataloader_params.
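To illustrate why the collate function matters: the default collate tries to stack the per-image target arrays, which fails as soon as two images have a different number of boxes ([18, 5] vs [19, 5] in the error above). A detection collate usually concatenates the targets instead and records which image each box came from. The following is only a minimal NumPy sketch of that idea, with a hypothetical function name and dummy data; it is not the collate class that super-gradients ships.

```python
import numpy as np


def detection_collate(batch):
    """Collate (image, targets) samples whose targets have different numbers of rows.

    image:   array of shape [C, H, W], identical for every sample
    targets: array of shape [num_boxes, 5], num_boxes varies per image
    """
    images = np.stack([image for image, _ in batch])  # [B, C, H, W]

    indexed = []
    for sample_index, (_, targets) in enumerate(batch):
        # Prepend the sample index so each box remembers which image it belongs to.
        index_column = np.full((len(targets), 1), sample_index, dtype=targets.dtype)
        indexed.append(np.concatenate([index_column, targets], axis=1))

    # Concatenating (not stacking) tolerates a different box count per image.
    return images, np.concatenate(indexed, axis=0)  # targets: [total_boxes, 6]


# Dummy batch reproducing the shapes from the error message.
batch = [
    (np.zeros((3, 64, 64), dtype=np.float32), np.zeros((18, 5), dtype=np.float32)),
    (np.zeros((3, 64, 64), dtype=np.float32), np.zeros((19, 5), dtype=np.float32)),
]
images, targets = detection_collate(batch)
print(images.shape, targets.shape)  # (2, 3, 64, 64) (37, 6)
```

In a real training run, pass the collate function named in the linked dataset params to your DataLoader via `collate_fn` rather than a hand-written one like this.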
I used the right collate function mentioned in the yolonas dataset params.
I am getting this new error now.
TypeError Traceback (most recent call last)
Cell In[21], line 8
      6 trainer = Trainer(experiment_name="yolo_nas_s_100", ckpt_root_dir="sg_checkpoints_dir/")
      7 net = models.get(Models.YOLO_NAS_S, num_classes=1, pretrained_weights="coco")
----> 8 trainer.train(model=net, training_params=train_params, train_loader=train_loader, valid_loader=valid_loader)

File ~/anaconda3/envs/yolonas0/lib/python3.10/site-packages/super_gradients/training/sg_trainer/sg_trainer.py:1472, in Trainer.train(self, model, training_params, train_loader, valid_loader, test_loaders, additional_configs_to_log)
   1465 raise ValueError(
   1466     "You can use sliding window validation callback, but your model does not support sliding window "
   1467     "inference. Please either remove the callback or use the model that supports sliding inference: "
   1468     "Segformer"
   1469 )
   1471 if isinstance(model, SupportsInputShapeCheck):
-> 1472     first_train_batch = next(iter(self.trainloader))
   1473     inputs, _, _ = sg_trainer_utils.unpack_batch_items(first_train_batch)
   1474     model.validate_input_shape(inputs.size())

File ~/anaconda3/envs/yolonas0/lib/python3.10/site-packages/torch/utils/data/dataloader.py:530, in _BaseDataLoaderIter.__next__(self)
    528 if self._sampler_iter is None:
    529     self._reset()
--> 530 data = self._next_data()
    531 self._num_yielded += 1
    532 if self._dataset_kind == _DatasetKind.Iterable and \
    533         self._IterableDataset_len_called is not None and \
    534         self._num_yielded > self._IterableDataset_len_called:

File ~/anaconda3/envs/yolonas0/lib/python3.10/site-packages/torch/utils/data/dataloader.py:570, in _SingleProcessDataLoaderIter._next_data(self)
    568 def _next_data(self):
    569     index = self._next_index()  # may raise StopIteration
--> 570     data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    571     if self._pin_memory:
    572         data = _utils.pin_memory.pin_memory(data)

File ~/anaconda3/envs/yolonas0/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py:52, in _MapDatasetFetcher.fetch(self, possibly_batched_index)
     50 else:
     51     data = self.dataset[possibly_batched_index]
---> 52 return self.collate_fn(data)

Cell In[17], line 12, in DetectionCollateFN.__call__(self, data)
     11 def __call__(self, data) -> Tuple[torch.Tensor, torch.Tensor]:
---> 12     data = [sample for sample in data if sample[0].size(0) > 0]
     14     if not data:
     15         # If there are no valid samples, return empty tensors
     16         return torch.zeros(0), torch.zeros((0, 5))
Cell In[17], line 12, in
TypeError: 'int' object is not callable
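One plausible cause of this message, assuming the dataset returns NumPy arrays rather than torch tensors (the traceback does not confirm this): on a `torch.Tensor`, `size` is a method, but on a `numpy.ndarray` it is an integer attribute, so `sample[0].size(0)` ends up calling an int. The sketch below reproduces that with a dummy sample.

```python
import numpy as np

# Dummy (image, targets) sample; assumption: the dataset yields NumPy arrays.
sample = (np.zeros((3, 64, 64), dtype=np.float32), np.zeros((18, 5), dtype=np.float32))

# ndarray.size is the total element count (an int), not a method.
print(type(sample[0].size))

try:
    sample[0].size(0)
except TypeError as error:
    message = str(error)
    print(message)  # 'int' object is not callable

# Indexing .shape works for NumPy arrays and torch tensors alike.
first_dim = sample[0].shape[0]
print(first_dim)  # 3
```

If that is the situation here, filtering on `sample[0].shape[0] > 0` (or converting the samples to tensors first) avoids the error.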
Please provide a minimal example that reproduces your issue, in the form of a Google Colab notebook. It is impossible to help you just by looking at the sparse code snippets you have shown.
Please refer to this script. Thanks.
https://github.com/madmon24/YoloNAS/blob/main/DeciYoloCustomDatasetQAFineTuning_madmon.ipynb
💡 Your Question
Previously I used the code below to get the latency on my custom trained model with YOLOv5:
python val.py --weights best.pt --data data.yaml --task test
But of course this does not work for YOLO-NAS. How can I measure the latency of a custom-trained YOLO-NAS model?
Thank you.
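For a quick first estimate directly in Python, one option is to time repeated calls to the model's prediction function after a few warm-up runs. The helper below is a generic sketch, not part of super-gradients or YOLOv5: `measure_latency_ms` is a made-up name, and the lambda is a stand-in workload to be replaced by a call into the trained model.

```python
import statistics
import time


def measure_latency_ms(predict, sample, warmup=10, runs=100):
    """Return the median wall-clock time of predict(sample) in milliseconds."""
    for _ in range(warmup):
        predict(sample)  # warm-up calls are not timed

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


# Stand-in workload; replace with a call into your trained model.
latency = measure_latency_ms(lambda x: sum(x), list(range(10_000)), warmup=2, runs=20)
print(f"median latency: {latency:.3f} ms")
```

Note that GPU work is asynchronous, so timing a CUDA model this way needs a synchronization call inside `predict`, otherwise the numbers will be too optimistic.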
Versions
No response