fudan-zvg / SeaFormer

[ICLR 2023] SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation

cannot find spatial branch in model code #10

Open inderpreet-adapdix opened 1 year ago

inderpreet-adapdix commented 1 year ago

Hi,

I was looking into the model implementation and found that Fusion_block in the seaformer.py file does not return anything in its forward pass: https://github.com/fudan-zvg/SeaFormer/blob/db38fe7df1adec41ebf8125efe49184ad7396097/seaformer-cls/seaformer.py#L338-L368

There also seems to be an error in its implementation.

While debugging the forward pass of the seaformer-small model, I could not find any use of the Fusion block, and the weights for the Fusion block are also not present in the checkpoint. It seems the model implementation only has the shared stem and the context branch.

Can you please help me with these issues, in case I'm missing something?

Thanks

wwqq commented 1 year ago

Hi @inderpreet-adapdix, Fusion_block (the spatial branch) is not used in the classification task, so we've deleted this class there. It is used in the segmentation code: https://github.com/fudan-zvg/SeaFormer/blob/f64a6b5a5bfc72d91fbeae9ac4f54cccd72e5fe4/seaformer-seg/mmseg/models/decode_heads/light_head.py#L69
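
For reference, a minimal sketch of how such a fusion block typically combines the high-resolution spatial feature with the low-resolution context feature. The class and layer names below are illustrative only, not the repository's exact code; the key point is that the forward pass must return the fused feature.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionBlockSketch(nn.Module):
    """Sketch of a SeaFormer-style fusion: the spatial feature is embedded with a
    1x1 conv, the context feature is turned into a sigmoid gate, upsampled to the
    spatial resolution, and used to modulate the spatial feature."""

    def __init__(self, spatial_ch, context_ch, embed_ch):
        super().__init__()
        self.embed_spatial = nn.Sequential(
            nn.Conv2d(spatial_ch, embed_ch, 1, bias=False),
            nn.BatchNorm2d(embed_ch),
        )
        self.embed_context = nn.Sequential(
            nn.Conv2d(context_ch, embed_ch, 1, bias=False),
            nn.BatchNorm2d(embed_ch),
        )

    def forward(self, x_spatial, x_context):
        feat = self.embed_spatial(x_spatial)
        gate = torch.sigmoid(self.embed_context(x_context))
        gate = F.interpolate(gate, size=feat.shape[2:],
                             mode='bilinear', align_corners=False)
        return feat * gate  # the forward pass must return the fused feature
```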

inderpreetsingh01 commented 1 year ago

Thanks for the reply @wwqq. I am actually using a CPU and was able to run inference for the classification model on it, but I am not sure how to do the same for the segmentation model. Can you help me with it? I want to evaluate seaformer-small for segmentation on CPU.

wwqq commented 1 year ago

The conversion script has been uploaded: https://github.com/fudan-zvg/SeaFormer/blob/main/convert2onnx.py. Run `python3 convert2onnx.py <config-file> --input-img <img-dir> --shape 512 512 --checkpoint <model-ckpt>` to convert the model to ONNX. To test inference speed on mobile devices, please refer to TopFormer.
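
Once the ONNX file is exported, a minimal CPU inference sketch with onnxruntime could look like the following. The file name, input shape, and output layout are assumptions based on the conversion command above, not guaranteed by the script.

```python
import numpy as np
import onnxruntime as ort

# Load the exported model on CPU (the path is an example, not the script's fixed output name).
sess = ort.InferenceSession("seaformer_small.onnx", providers=["CPUExecutionProvider"])

# Dummy 512x512 RGB input in NCHW float32; a real image should use the same
# normalization as the training config.
img = np.random.rand(1, 3, 512, 512).astype(np.float32)

input_name = sess.get_inputs()[0].name
logits = sess.run(None, {input_name: img})[0]  # assumed shape: (1, num_classes, H, W)
pred = logits.argmax(axis=1)                   # per-pixel class ids
print(pred.shape)
```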

inderpreetsingh01 commented 1 year ago

Hi @wwqq, I followed the link you mentioned. I am running inference for the SeaFormer-Small segmentation model in a Jupyter notebook, using the SeaFormer-S_512x512_4x8_160k checkpoint. I initialized the model with the config in seaformer_small.py and loaded the weights. I pass it an image from the ADE20K dataset resized to (512, 512), and the result I get is of size (64, 64). I applied bilinear interpolation to bring it to (512, 512) and mapped each class to an RGB color, but the output image does not look good; the image does not appear to be segmented at all. I also updated Conv2d_BN to use nn.BatchNorm2d instead of build_layer_norm. Here is the link to the notebook: https://github.com/inderpreetsingh01/models/blob/master/Seaformer_segmentation_inference.ipynb. It would be helpful if you could tell me whether I am doing something wrong here.

Thanks

wwqq commented 1 year ago

Hi @inderpreetsingh01,

1. Logits need to be upsampled first and then argmax'd.

2. Label_colors needs to generate 150 classes, not 125. You can import it from the checkpoint directly.

3. Same here, nc=150. And rgb_image should be transposed, not reshaped.

Final vis result:

(screenshot of the segmentation visualization)
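
Putting the three fixes together, a minimal post-processing sketch might look like the code below. The logits shape, the random placeholder tensor, and the random palette are stand-ins; in practice the palette should come from the checkpoint or the ADE20K dataset metadata.

```python
import numpy as np
import torch
import torch.nn.functional as F

nc = 150  # ADE20K has 150 classes

# Placeholder for the raw model output, e.g. shape (1, 150, 64, 64).
logits = torch.randn(1, nc, 64, 64)

# 1. Upsample the logits to the input resolution first, then take the argmax.
logits = F.interpolate(logits, size=(512, 512), mode='bilinear', align_corners=False)
pred = logits.argmax(dim=1)[0].numpy()  # (512, 512) class ids

# 2. Build a 150-entry color palette (here random; use the dataset/checkpoint palette instead).
palette = np.random.randint(0, 255, size=(nc, 3), dtype=np.uint8)

# 3. Index the palette with the class map; the result is already (H, W, 3),
#    so transpose only if your array is channel-first, and do not reshape.
rgb_image = palette[pred]  # (512, 512, 3) uint8
```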
inderpreetsingh01 commented 1 year ago

Thanks a lot @wwqq for the clear explanation. I have one general question: why does the output seem a bit noisy, and is there any way to improve it with the same checkpoint?

speedinghzl commented 1 year ago

Hi @inderpreetsingh01, thanks for your interest in our work. I see you use a simple transform for image preprocessing, which does not match the preprocessing used in the SeaFormer training step. The segmentation result could be better if you align the preprocessing, including normalization and the RGB channel order (not sure).
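
For reference, aligning the preprocessing would look roughly like the sketch below, assuming the standard mmseg ADE20K normalization values (mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], applied to RGB pixels in the 0-255 range); the function name and resize choice are illustrative.

```python
import numpy as np
import torch
from PIL import Image

# Normalization values assumed from the standard mmseg ADE20K config.
MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
STD = np.array([58.395, 57.12, 57.375], dtype=np.float32)

def preprocess(path, size=(512, 512)):
    # Load as RGB, resize, then normalize per channel.
    img = Image.open(path).convert('RGB').resize(size, Image.BILINEAR)
    arr = (np.asarray(img, dtype=np.float32) - MEAN) / STD
    # HWC -> NCHW float tensor.
    return torch.from_numpy(arr).permute(2, 0, 1).unsqueeze(0)

# Usage sketch:
# x = preprocess('ADE_val_00001001.jpg')
# with torch.no_grad():
#     logits = model(x)  # model should be in eval() mode
```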

inderpreetsingh01 commented 1 year ago

Hi @speedinghzl, thanks for the reply. I have now normalized the image using mean=[123.675, 116.28, 103.53] and std=[58.395, 57.12, 57.375]. I was also using the model in training mode; I switched it to eval mode, and below are the outputs I got.

ADE_val_00001001.jpg (result image)

ADE_val_00001005.jpg (result image)

They look better and less noisy than the previous ones. Can you please confirm whether this is the expected output of the SeaFormer-Small model?

wwqq commented 1 year ago

Hi @inderpreetsingh01, yes, they are the expected outputs.