PaddlePaddle / PaddleSeg

Easy-to-use image segmentation library with an awesome pre-trained model zoo, supporting a wide range of practical tasks in Semantic Segmentation, Interactive Segmentation, Panoptic Segmentation, Image Matting, 3D Segmentation, etc.
https://arxiv.org/abs/2101.06175
Apache License 2.0

I'd like a bigger human segmentation model #3250

dganzella opened this issue 1 year ago (status: Open)

dganzella commented 1 year ago

Search before asking

Please ask your question

Hello!

I want to use an "intermediate" image segmentation model that is bigger than 192x192.

[image: screenshot of the available pretrained models]

I'd like a model at 512x512 that is still under 10 MB. The one shown above is over 100 MB and too slow for real-time use!

Is there a pretrained human segmentation model at 512x512 that performs well enough for real-time use? If not, is it possible to generate one, and how?

Thank you!

juncaipeng commented 1 year ago

You can try the PP-HumanSegV2 model with a 512x512 input size.
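
For illustration, a minimal sketch of exporting a PaddleSeg checkpoint with a fixed 512x512 input via tools/export.py; the config and weight paths below are placeholders, and the --input_shape flag is assumed to be available in your PaddleSeg version:

# Export an inference model with a fixed 1x3x512x512 input shape.
# Both paths are placeholders; point them at the PP-HumanSegV2
# config and checkpoint you actually use.
python tools/export.py \
  --config path/to/pp_humansegv2.yml \
  --model_path path/to/model.pdparams \
  --save_dir output/inference_512 \
  --input_shape 1 3 512 512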

dganzella commented 1 year ago

@juncaipeng How? Won't it lose quality? Wouldn't I need to train it from scratch?

Would it be too much to ask for a "desktop" version of the model? 😬 It would use the same backbone as PP-HumanSegV2, but at 512x512 and trained from scratch (the model would end up as a bigger file, right?).

The goal is to be able to run real-time desktop applications with HD video.
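
For what it's worth, a minimal sketch of the usual fine-tuning route in PaddleSeg, which avoids a full from-scratch run: copy the PP-HumanSegV2 config, raise the transform sizes, and initialize from the released checkpoint through the model-level pretrained key (the checkpoint path below is a placeholder):

# In a copy of the PP-HumanSegV2 config, raise the training size:
train_dataset:
  transforms:
    - type: Resize
      target_size: [512, 512]

# ...and start from the released weights instead of random init:
model:
  pretrained: path/to/pp_humansegv2.pdparams  # placeholder path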

dganzella commented 1 year ago

I tried:

python ../../tools/train.py \
  --config configs/vsxr_lite.yml \
  --save_dir output/vsxr_lite \
  --save_interval 100 --do_eval --use_vdl

using this config:

batch_size: 8
iters: 1000

train_dataset:
  type: Dataset
  dataset_root: data/mini_supervisely
  train_path: data/mini_supervisely/train.txt
  num_classes: 2
  transforms:
    - type: Resize
      target_size: [512, 512]
    - type: ResizeStepScaling
      scale_step_size: 0
    - type: RandomRotation
    - type: RandomPaddingCrop
      crop_size: [512, 512]
    - type: RandomHorizontalFlip
    - type: RandomDistort
    - type: RandomBlur
      prob: 0.3
    - type: Normalize
  mode: train

val_dataset:
  type: Dataset
  dataset_root: data/mini_supervisely
  val_path: data/mini_supervisely/val.txt
  num_classes: 2
  transforms:
    - type: Resize
      target_size: [512, 512]
    - type: Normalize
  mode: val

export:
  transforms:
    - type: Resize
      target_size: [512, 512]
    - type: Normalize

optimizer:
  type: sgd
  momentum: 0.9
  weight_decay: 0.0005

lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.001
  end_lr: 0
  power: 0.9

loss:
  types:
    - type: MixedLoss
      losses:
        - type: CrossEntropyLoss
        - type: LovaszSoftmaxLoss
      coef: [0.8, 0.2]
  coef: [1, 1, 1, 1]

model:
  type: PPLiteSeg
  backbone:
    type: STDC1  # [x2 x4 x8 x16 x32]
  cm_out_ch: 128
  backbone_indices: [1, 2, 3, 4]
  arm_out_chs: [4, 16, 32, 64]
  seg_head_inter_chs: [4, 16, 32, 64]

But the reported mIoU is 0.41%! Why is that? What am I doing wrong?
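
For readers who land here: two things in the config above differ from the pp_liteseg configs shipped with PaddleSeg and are worth checking when mIoU stays near zero. The backbone loads no pretrained weights, and 1000 iterations is a very short schedule. A minimal sketch of the backbone stanza with pretrained initialization (the path is a placeholder; the official configs point pretrained: at ImageNet-pretrained STDC weights hosted by PaddleSeg):

model:
  type: PPLiteSeg
  backbone:
    type: STDC1
    # Placeholder path: the shipped pp_liteseg configs set this to a
    # URL of ImageNet-pretrained STDC1 weights; without it, the
    # backbone starts from random initialization.
    pretrained: path/to/stdc1_imagenet.pdparams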