marcoslucianops / DeepStream-Yolo

NVIDIA DeepStream SDK 7.1 / 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 implementation for YOLO models
MIT License

YOLOv8 custom-trained network error when generating .engine #313

Open globalcaos opened 1 year ago

globalcaos commented 1 year ago

When trying to build the .engine from a custom YOLOv8m network (via 'deepstream-app -c deepstream_app_config.txt'), I get the following error:

```
Number of unused weights left: 26775
deepstream-app: yolo.cpp:426: NvDsInferStatus Yolo::buildYoloNetwork(std::vector&, nvinfer1::INetworkDefinition&): Assertion `0' failed.
```

I am following your instructions from https://github.com/marcoslucianops/DeepStream-Yolo/blob/master/docs/YOLOv8.md, but the error persists.

Strangely enough, starting from a freshly downloaded yolov8m.pt file and converting it for DeepStream with 'gen_wts_yoloV8.py -w best.pt -s 640', it works fine.

I have increased the number of classes from my original 15 to 80 and still get the same error. I have also changed the '-s 640' parameter to other values, and neither the custom network nor the freshly downloaded one works; I get similar errors. I have compiled libnvdsinfer_custom_impl_Yolo.so with different CUDA_VER values, but still the same error.

The way I am training the network is the following:

```python
from ultralytics import YOLO

model = YOLO(yolov8m_pt_filename)
model.train(**params)
```

For 'params' I started with the ultralytics/ultralytics/yolo/cfg/default.yaml, and then made some changes:

```yaml
# Ultralytics YOLO 🚀, GPL-3.0 license
# Default training settings and hyperparameters for medium-augmentation COCO training

task: detect  # inference task, i.e. detect, segment, classify
mode: train  # YOLO mode, i.e. train, val, predict, export

# Train settings -------------------------------------------------------------------------------------------------------
model:  # path to model file, i.e. yolov8n.pt, yolov8n.yaml
data:  # path to data file, i.e. coco128.yaml
epochs: 5  # number of epochs to train for
patience: 250  # epochs to wait for no observable improvement for early stopping of training
batch: 16  # number of images per batch (-1 for AutoBatch)
imgsz: 1024  # size of input images as integer or w,h
save: True  # save train checkpoints and predict results
save_period: -1  # Save checkpoint every x epochs (disabled if < 1)
cache: False  # True/ram, disk or False. Use cache for data loading
device:  # device to run on, i.e. cuda device=0 or device=0,1,2,3 or device=cpu
workers: 8  # number of worker threads for data loading (per RANK if DDP)
project:  # project name
name:  # experiment name
exist_ok: False  # whether to overwrite existing experiment
pretrained: True  # whether to use a pretrained model
optimizer: SGD  # optimizer to use, choices=['SGD', 'Adam', 'AdamW', 'RMSProp']
verbose: True  # whether to print verbose output
seed: 0  # random seed for reproducibility
deterministic: True  # whether to enable deterministic mode
single_cls: False  # train multi-class data as single-class
image_weights: True  # use weighted image selection for training
rect: False  # support rectangular training if mode='train', support rectangular evaluation if mode='val'
cos_lr: False  # use cosine learning rate scheduler
close_mosaic: 10  # disable mosaic augmentation for final 10 epochs
resume: False  # resume training from last checkpoint
min_memory: False  # minimize memory footprint loss function, choices=[False, True, ]

# Segmentation
overlap_mask: True  # masks should overlap during training (segment train only)
mask_ratio: 4  # mask downsample ratio (segment train only)

# Classification
dropout: 0.0  # use dropout regularization (classify train only)

# Val/Test settings ----------------------------------------------------------------------------------------------------
val: True  # validate/test during training
split: val  # dataset split to use for validation, i.e. 'val', 'test' or 'train'
save_json: False  # save results to JSON file
save_hybrid: False  # save hybrid version of labels (labels + additional predictions)
conf:  # object confidence threshold for detection (default 0.25 predict, 0.001 val)
iou: 0.7  # intersection over union (IoU) threshold for NMS
max_det: 300  # maximum number of detections per image
half: False  # use half precision (FP16)
dnn: False  # use OpenCV DNN for ONNX inference
plots: True  # save plots during train/val

# Prediction settings --------------------------------------------------------------------------------------------------
source:  # source directory for images or videos
show: False  # show results if possible
save_txt: False  # save results as .txt file
save_conf: False  # save results with confidence scores
save_crop: False  # save cropped images with results
hide_labels: False  # hide labels
hide_conf: False  # hide confidence scores
vid_stride: 1  # video frame-rate stride
line_thickness: 3  # bounding box thickness (pixels)
visualize: False  # visualize model features
augment: False  # apply image augmentation to prediction sources
agnostic_nms: False  # class-agnostic NMS
classes:  # filter results by class, i.e. class=0, or class=[0,2,3]
retina_masks: False  # use high-resolution segmentation masks
boxes: True  # Show boxes in segmentation predictions

# Export settings ------------------------------------------------------------------------------------------------------
format: torchscript  # format to export to
keras: False  # use Keras
optimize: False  # TorchScript: optimize for mobile
int8: False  # CoreML/TF INT8 quantization
dynamic: False  # ONNX/TF/TensorRT: dynamic axes
simplify: False  # ONNX: simplify model
opset:  # ONNX: opset version (optional)
workspace: 4  # TensorRT: workspace size (GB)
nms: False  # CoreML: add NMS

# Hyperparameters ------------------------------------------------------------------------------------------------------
lr0: 0.01  # initial learning rate (i.e. SGD=1E-2, Adam=1E-3)
lrf: 0.01  # final learning rate (lr0 * lrf)
momentum: 0.937  # SGD momentum/Adam beta1
weight_decay: 0.0005  # optimizer weight decay 5e-4
warmup_epochs: 3.0  # warmup epochs (fractions ok)
warmup_momentum: 0.8  # warmup initial momentum
warmup_bias_lr: 0.1  # warmup initial bias lr
box: 7.5  # box loss gain
cls: 0.5  # cls loss gain (scale with pixels)
dfl: 1.5  # dfl loss gain
fl_gamma: 0.0  # focal loss gamma (efficientDet default gamma=1.5)
label_smoothing: 0.0  # label smoothing (fraction)
nbs: 64  # nominal batch size
hsv_h: 0.015  # image HSV-Hue augmentation (fraction)
hsv_s: 0.7  # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4  # image HSV-Value augmentation (fraction)
degrees: 20.0  # image rotation (+/- deg)
translate: 0.1  # image translation (+/- fraction)
scale: 0.5  # image scale (+/- gain)
shear: 15.0  # image shear (+/- deg)
perspective: 0.0  # image perspective (+/- fraction), range 0-0.001
flipud: 0.0  # image flip up-down (probability)
fliplr: 0.5  # image flip left-right (probability)
mosaic: 1.0  # image mosaic (probability)
mixup: 0.0  # image mixup (probability)
copy_paste: 0.0  # segment copy-paste (probability)

# Custom config.yaml ---------------------------------------------------------------------------------------------------
cfg:  # for overriding defaults.yaml

# Debug, do not modify -------------------------------------------------------------------------------------------------
v5loader: False  # use legacy YOLOv5 dataloader

# Tracker settings ------------------------------------------------------------------------------------------------------
tracker: botsort.yaml  # tracker type, ['botsort.yaml', 'bytetrack.yaml']
```

and in the code I make a few modifications so it does not give me errors:

```python
del params['task']
del params['mode']
del params['model']
params['data'] = 'data.yaml'
```
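
Putting the pieces together, the flow is roughly this (just a sketch of what I described above; the default.yaml path is simply where it sits in my ultralytics checkout):

```python
import yaml
from ultralytics import YOLO

# Load the (modified) default.yaml shown above into a plain dict of overrides.
with open('ultralytics/ultralytics/yolo/cfg/default.yaml') as f:
    params = yaml.safe_load(f)

# Drop the keys that model.train() rejects as overrides,
# and point 'data' at the custom dataset definition.
for key in ('task', 'mode', 'model'):
    params.pop(key, None)
params['data'] = 'data.yaml'

model = YOLO('yolov8m.pt')  # pretrained starting point
model.train(**params)       # 5 epochs at imgsz=1024, as configured above
```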

And finally, for 'data.yaml' I have tried this simple version:

```yaml
train: ./train/images
val: ./valid/images
test: ./test/images
nc: 80

names: ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '70', '71', '72', '73', '74', '75', '76', '77', '78', '79', '80']
```
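
As a sanity check on the class count, a quick sketch ('best.pt' being the trained checkpoint):

```python
import yaml
from ultralytics import YOLO

with open('data.yaml') as f:
    data = yaml.safe_load(f)

# nc should match the length of the names list...
assert data['nc'] == len(data['names']), (data['nc'], len(data['names']))

# ...and the trained checkpoint should report the same number of classes.
model = YOLO('best.pt')
print(len(model.names), 'classes in the checkpoint')
```

On the DeepStream side, num-detected-classes in the infer config and the number of entries in labels.txt should agree with the same count.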

And I've also added what can be found in ultralytics/ultralytics/models/v8/yolov8m.yaml:

```yaml
# YOLOv8.0m backbone
backbone:
  # [from, repeats, module, args]
```

But the result is the same.

I have DeepStream 6.1 and CUDA Toolkit 12.0 installed, and I am planning to transfer the network to a Jetson Xavier NX with JetPack 4.6 and DeepStream 6.0, but I guess this will be another battle...

I honestly do not know what else to try. It seems to me that, when retraining the network, the structure of the weights changes and the subsequent conversions fail. Does anyone have a clue what I'm doing wrong? Thanks in advance.
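
One way to check that hypothesis would be to diff the parameter names and shapes of the stock checkpoint against the retrained one (a rough sketch, assuming both are Ultralytics .pt checkpoints):

```python
from ultralytics import YOLO

def shapes(path):
    # Map every parameter name in the checkpoint to its tensor shape.
    model = YOLO(path).model
    return {k: tuple(v.shape) for k, v in model.state_dict().items()}

stock = shapes('yolov8m.pt')   # freshly downloaded weights (works)
custom = shapes('best.pt')     # retrained weights (fails)

print('keys only in stock:  ', sorted(stock.keys() - custom.keys()))
print('keys only in custom: ', sorted(custom.keys() - stock.keys()))
print('shape mismatches:    ',
      [k for k in stock.keys() & custom.keys() if stock[k] != custom[k]])
```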

marcoslucianops commented 1 year ago

Can you send me the 1st epoch of your model to test here?

globalcaos commented 1 year ago

Sent through WeTransfer. Here is the link to download, which I believe will be active for 7 days.

It contains the cfg and wts files after training 5 epochs. I also attach the labels.txt.

When doing 'deepstream-app -c deepstream_app_config.txt' I get the following error:

```
Number of unused weights left: 18446744073709502791
deepstream-app: yolo.cpp:426: NvDsInferStatus Yolo::buildYoloNetwork(std::vector&, nvinfer1::INetworkDefinition&): Assertion `0' failed.
```
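
For what it's worth, that enormous count looks like an unsigned 64-bit counter that went below zero, i.e. the network definition asked for more weight values than the .wts file provides (assuming the counter is a uint64):

```python
# The reported value interpreted as a negative 64-bit number:
reported = 18446744073709502791
print(reported - 2**64)  # -48825, i.e. roughly 48k weight values short
```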

globalcaos commented 1 year ago

I am trying to create the .engine from a yolov5m model on a Jetson, and it fails in a similar way. It says:

```
Number of unused weights left: 18446744073709502791
cv_pipeline_jetson: yolo.cpp:437: NvDsInferStatus Yolo::buildYoloNetwork(std::vector&, nvinfer1::INetworkDefinition&): Assertion `0' failed.
Aborted (core dumped)
```

Funny thing is that it works with a yolov5m6 model. I don't understand why.

Maybe I'm doing something wrong? I have trained a model for DeepStream 6.0, and I am compiling the yolo library following your instructions:

```
CUDA_VER=10.2 make -C nvdsinfer_custom_impl_Yolo
```

What else could it be?

globalcaos commented 1 year ago

Hi Marcos. Would you like me to send you the file again?


marcoslucianops commented 1 year ago

Now you can use the ONNX conversion to DeepStream.

YOLOv8 docs: https://github.com/marcoslucianops/DeepStream-Yolo/blob/master/docs/YOLOv8.md