purse1996 commented 3 years ago

I want to train GFFNet. I wrote a training script similar to the resnet50-deeplabv3 one:

    #!/usr/bin/env bash

    now=$(date +"%Y%m%d_%H%M%S")
    EXP_DIR=./body_edge/DeepWV3PlusGFFNet
    mkdir -p ${EXP_DIR}

    # Example on Cityscapes by resnet50-deeplabv3+ as baseline
    nohup python -m torch.distributed.launch --nproc_per_node=3 --master_port=23456 train.py \
      --dataset cityscapes \
      --cv 0 \
      --arch network.gffnets.DeepWV3PlusGFFNet \
      --class_uniform_pct 0.0 \
      --class_uniform_tile 1024 \
      --max_cu_epoch 150 \
      --lr 0.01 \
      --lr_schedule poly \
      --poly_exp 1.0 \
      --repoly 1.5 \
      --rescale 1.0 \
      --snapshot wider_resnet38.pth.tar \
      --syncbn \
      --sgd \
      --crop_size 832 \
      --scale_min 0.5 \
      --scale_max 2.0 \
      --color_aug 0.25 \
      --gblur \
      --max_epoch 100 \
      --wt_bound 1.0 \
      --bs_mult 1 \
      --apex \
      --exp cityscapes_ft \
      --ckpt ${EXP_DIR}/ \
      --tb_path ${EXP_DIR}/ \
      --seed 345 \
      --cuda_index '4,5,6' \
      > ${EXP_DIR}/log_${now}.txt 2>&1 &

But the training result is only 0.77848, which is much lower than the result reported in the paper. Can you offer some suggestions?

lxtGH commented 3 years ago

@purse1996 GFFNet needs two-stage training, which means you should restore from the pretrained model and train with a lower learning rate. Also, your batch size (bs) is too small.
According to my experiments on Cityscapes, the batch size must be >= 8 to achieve the baseline results.
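
As a rough check of the batch-size point, here is a minimal sketch. It assumes that `--bs_mult` is the per-GPU batch size, so the total batch size is `nproc_per_node * bs_mult`; that reading should be confirmed against the argument help in `train.py`. The helper names `effective_bs` and `min_bs_mult` are hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Assumption: --bs_mult is the per-GPU batch size, so the total batch size
# is (number of GPUs) * bs_mult. Both helpers below are illustrative only.

# Total batch size across all processes.
effective_bs() {
  local gpus=$1 bs_mult=$2
  echo $(( gpus * bs_mult ))
}

# Smallest per-GPU batch size that reaches a target total (ceiling division).
min_bs_mult() {
  local gpus=$1 target=$2
  echo $(( (target + gpus - 1) / gpus ))
}

effective_bs 3 1   # settings from the script above: prints 3
min_bs_mult 3 8    # per-GPU batch size needed on 3 GPUs for a total of 8: prints 3
```

Under that assumption, the posted script trains with a total batch size of 3, well below the suggested minimum of 8. Whether a larger per-GPU batch fits in memory at `--crop_size 832` depends on the hardware.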

purse1996 commented 3 years ago

Thank you for your kind reply. Could you explain the procedure in more detail, or provide the training yml file? Additionally, the code predicts a weight for every channel in each layer, while the paper predicts a single gate map shared by all the channels in the same layer. Could you explain this difference?

lxtGH commented 3 years ago

I will provide the training config files within one month.

purse1996 commented 3 years ago

I now have 8 RTX 3090 GPUs available. Could you describe the training procedure in detail? I will try it on my server.

lxtGH / DecoupleSegNets

GFFNet #17
