RangiLyu / nanodet

NanoDet-Plus⚡Super fast and lightweight anchor-free object detection model. 🔥Only 980 KB(int8) / 1.8MB (fp16) and run 97FPS on cellphone🔥
Apache License 2.0

How to train model with MobileNetV2? #533

Closed: nijatmursali closed this issue 9 months ago

nijatmursali commented 9 months ago

I modified a config file to train the model with a MobileNetV2 backbone as follows:

    backbone:
      name: MobileNetV2
      model_size: 1.0x
      out_stages: [2,3,4]
      activation: LeakyReLU

This raised TypeError: __init__() got an unexpected keyword argument 'model_size'. After I commented out the model_size parameter, it instead raised RuntimeError: Given groups=1, weight of size [96, 116, 1, 1], expected input[32, 32, 28, 28] to have 116 channels, but got 32 channels instead.
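The RuntimeError can be read directly from the two shapes it prints. A convolution weight has shape [out_channels, in_channels / groups, kH, kW], and the input has shape [batch, channels, height, width], so the layer was built for 116 input channels while the tensor it received carries 32. A minimal sketch of that reading in plain Python; `explain_conv_mismatch` is a hypothetical helper, not part of NanoDet or PyTorch:

```python
def explain_conv_mismatch(weight_shape, input_shape, groups=1):
    """Compare the channels a conv layer expects with what it received.

    weight_shape: [out_channels, in_channels // groups, kH, kW]
    input_shape:  [batch, channels, height, width]
    """
    out_channels, in_per_group, _, _ = weight_shape
    expected = in_per_group * groups  # channels the layer was built for
    received = input_shape[1]         # channels the incoming tensor has
    return {
        "out_channels": out_channels,
        "expected_in": expected,
        "received_in": received,
        "match": expected == received,
    }


# Shapes copied from the error message in this post.
report = explain_conv_mismatch([96, 116, 1, 1], [32, 32, 28, 28])
print(report)
```

Read this way, a 1x1 convolution with 96 output channels was configured for 116 input channels but was fed a 32-channel, 28x28 feature map. That suggests the in_channels list of the fpn section still described a different backbone, and it would need to list the channel counts of the selected MobileNetV2 stages, as the full config below does with [32, 96, 1280].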

What is the right way to train the model with MobileNetV2 or GhostNet?

I also tried using #6, but it fails with RuntimeError: The size of tensor a (1029) must match the size of tensor b (1045) at non-singleton dimension 1.
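The numbers 1029 and 1045 are consistent with a mismatch in the number of feature levels. The check below assumes each level of a 224x224 input contributes ceil(224 / stride) squared locations; that is an assumption about how the head counts its priors, not something confirmed in this thread. `count_locations` is a hypothetical helper:

```python
import math


def count_locations(input_size, strides):
    """Sum the feature-map locations over all levels for a [w, h] input."""
    width, height = input_size
    return sum(math.ceil(height / s) * math.ceil(width / s) for s in strides)


three_levels = count_locations((224, 224), [8, 16, 32])      # 784 + 196 + 49
four_levels = count_locations((224, 224), [8, 16, 32, 64])   # ... + 16
print(three_levels, four_levels)  # 1029 1045
```

If that assumption holds, one of the two tensors was built from three feature levels and the other from four.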

Full config file:

save_dir: workspace/nanodet-plus-m_224_mobilenet
model:
  weight_averager:
    name: ExpMovingAverager
    decay: 0.9998
  arch:
    name: GFL
    backbone:
      name: MobileNetV2
      out_stages: [2,4,6]
    fpn:
      name: PAN
      in_channels: [32, 96, 1280]
      out_channels: 96
      start_level: 0
      num_outs: 3
    head:
      name: NanoDetPlusHead
      num_classes: 80
      input_channel: 96
      feat_channels: 96
      stacked_convs: 2
      kernel_size: 5
      strides: [8, 16, 32, 64]
      activation: LeakyReLU
      reg_max: 7
      norm_cfg:
        type: BN
      loss:
        loss_qfl:
          name: QualityFocalLoss
          use_sigmoid: True
          beta: 2.0
          loss_weight: 1.0
        loss_dfl:
          name: DistributionFocalLoss
          loss_weight: 0.25
        loss_bbox:
          name: GIoULoss
          loss_weight: 2.0
    # Auxiliary head, only use in training time.
    aux_head:
      name: SimpleConvHead
      num_classes: 80
      input_channel: 192
      feat_channels: 192
      stacked_convs: 4
      strides: [8, 16, 32, 64]
      activation: LeakyReLU
      reg_max: 7
data:
  train:
    name: CocoDataset
    img_path: coco/train2017
    ann_path: reduced_by_dataset.json #reduced_annotations.json
    input_size: [224,224] #[w,h] 224x224
    keep_ratio: False
    pipeline:
      perspective: 0.0
      scale: [0.6, 1.4]
      stretch: [[0.8, 1.2], [0.8, 1.2]]
      rotation: 0
      shear: 0
      translate: 0.2
      flip: 0.5
      brightness: 0.2
      contrast: [0.6, 1.4]
      saturation: [0.5, 1.2]
      normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
  val:
    name: CocoDataset
    img_path: coco/val2017
    ann_path: coco/annotations/instances_val2017.json
    input_size: [224,224] #[w,h]
    keep_ratio: False
    pipeline:
      normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
device:
  gpu_ids: [0] # Set like [0, 1, 2, 3] if you have multi-GPUs
  workers_per_gpu: 4
  batchsize_per_gpu: 32 # train_dataset / batch_size
  precision: 32 # set to 16 to use AMP training
schedule:
#  resume:
#  load_model: workspace/nanodet-plus-m_224/model_last.ckpt
  optimizer:
    name: AdamW
    lr: 0.001
    weight_decay: 0.05
  warmup:
    name: linear
    steps: 500
    ratio: 0.0001
  total_epochs: 300
  lr_schedule:
    name: CosineAnnealingLR
    T_max: 300
    eta_min: 0.00005
  val_intervals: 10
grad_clip: 35
evaluator:
  name: CocoDetectionEvaluator
  save_key: mAP
log:
  interval: 50

class_names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
              'train', 'truck', 'boat', 'traffic_light', 'fire_hydrant',
              'stop_sign', 'parking_meter', 'bench', 'bird', 'cat', 'dog',
              'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe',
              'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
              'skis', 'snowboard', 'sports_ball', 'kite', 'baseball_bat',
              'baseball_glove', 'skateboard', 'surfboard', 'tennis_racket',
              'bottle', 'wine_glass', 'cup', 'fork', 'knife', 'spoon', 'bowl',
              'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot',
              'hot_dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
              'potted_plant', 'bed', 'dining_table', 'toilet', 'tv', 'laptop',
              'mouse', 'remote', 'keyboard', 'cell_phone', 'microwave',
              'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock',
              'vase', 'scissors', 'teddy_bear', 'hair_drier', 'toothbrush']
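
One thing worth checking in a config like the one above is whether the neck produces as many feature levels as each head has strides. A minimal sketch over a plain dict holding only the relevant values; `check_levels` is a hypothetical helper, and the rules it enforces (one fpn input channel count per backbone stage, one stride per fpn output) are assumptions about how the parts connect, not documented NanoDet behaviour:

```python
def check_levels(arch):
    """Return a list of human-readable level-count mismatches."""
    problems = []
    fpn = arch["fpn"]
    num_stages = len(arch["backbone"]["out_stages"])
    if len(fpn["in_channels"]) != num_stages:
        problems.append(
            f"fpn.in_channels has {len(fpn['in_channels'])} entries "
            f"but backbone.out_stages has {num_stages}"
        )
    for key in ("head", "aux_head"):
        if key in arch and len(arch[key]["strides"]) != fpn["num_outs"]:
            problems.append(
                f"{key}.strides has {len(arch[key]['strides'])} entries "
                f"but fpn.num_outs is {fpn['num_outs']}"
            )
    return problems


# Values copied from the config posted above.
arch = {
    "backbone": {"out_stages": [2, 4, 6]},
    "fpn": {"in_channels": [32, 96, 1280], "num_outs": 3},
    "head": {"strides": [8, 16, 32, 64]},
    "aux_head": {"strides": [8, 16, 32, 64]},
}
for problem in check_levels(arch):
    print(problem)
```

For these values the check flags both head and aux_head. Possible ways to reconcile them, not verified here, are trimming both strides lists to [8, 16, 32] or using a neck that emits a fourth level.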

AnanasPizzaMigliore commented 2 months ago

How did you solve this problem?