XuJiacong / PIDNet


Question about changing image size #72

Open KKaiYH opened 1 year ago

KKaiYH commented 1 year ago

The problem I encountered is this: I changed the training image size to 512 in the yaml file to train the model, and I also changed the test image size, but when I actually ran testing, the test images were not resized according to the settings in the yaml file. Is there an error in this part of the code?

As shown below, the test image size is only applied inside the resize step of multi_scale_aug, but since multi_scale=False for test_dataset, no resizing is done at all?

[Screenshots of the relevant dataset code omitted.]
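A simplified sketch of the gating behaviour being asked about (this is illustrative, not the repo's exact code; names like `SketchDataset` and `gen_sample` are placeholders modelled on the HRNet-style dataset classes): the resize lives inside `multi_scale_aug`, which is only reached when `multi_scale` is True, so a test dataset built with `multi_scale=False` returns full-size images regardless of the test image size in the yaml.

```python
import cv2
import numpy as np

class SketchDataset:
    """Illustrates how multi_scale gates resizing in PIDNet-style datasets."""

    def __init__(self, base_size=2048, multi_scale=False):
        self.base_size = base_size      # long-side target from the yaml
        self.multi_scale = multi_scale  # False for the test split

    def multi_scale_aug(self, image, rand_scale=1.0):
        # The only place a resize happens.
        long_size = int(self.base_size * rand_scale + 0.5)
        h, w = image.shape[:2]
        if h > w:
            new_h, new_w = long_size, int(w * long_size / h + 0.5)
        else:
            new_h, new_w = int(h * long_size / w + 0.5), long_size
        return cv2.resize(image, (new_w, new_h),
                          interpolation=cv2.INTER_LINEAR)

    def gen_sample(self, image):
        if self.multi_scale:  # never True at test time -> no resize at all
            rand_scale = 0.5 + np.random.randint(0, 16) / 10.0
            image = self.multi_scale_aug(image, rand_scale=rand_scale)
        return image
```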

lpetflo commented 1 year ago

That's correct, PIDNet only performs resizing if multi_scale is True. However, I personally wouldn't call this an issue, since I want to test my predictions at full size anyway. I'm currently training at a reduced scale, but I'm guessing the model generalizes correctly to the original size thanks to training with multiple scales. If you really wanted to test on downscaled versions of your dataset, you would have to change the code.
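For example, one minimal way to make that change (a hypothetical helper, not part of PIDNet) would be to resize both image and label before inference; labels need nearest-neighbour interpolation so class ids are never blended:

```python
import cv2

def downscale_sample(image, label, scale=0.5):
    """Resize an image/label pair before inference (hypothetical helper)."""
    h, w = image.shape[:2]
    new_size = (int(w * scale), int(h * scale))  # cv2 wants (width, height)
    image = cv2.resize(image, new_size, interpolation=cv2.INTER_LINEAR)
    # Nearest-neighbour keeps label values as valid class ids.
    label = cv2.resize(label, new_size, interpolation=cv2.INTER_NEAREST)
    return image, label
```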

KKaiYH commented 12 months ago

Thank you for your reply! About the training you did at the reduced size: did you also test at that same reduced size? I'm currently seeing a significant drop in accuracy when I test my predictions at the reduced size, so I'm providing some of my experimental data in hopes of further discussion with you. If you have any related experiments, I hope you can share them. Thank you!

[Screenshot of experimental results omitted.]

lpetflo commented 12 months ago

Yes, I actually did some testing on 50% predictions as well as using upscaling for full-scale predictions. Using PIDNet on my highly imbalanced binary dataset, I got 71.92% mIoU when evaluating 50% downscaled predictions from a model trained at that 50% size (using the multi-scale option, however). Using the authors' native full-scale approach, I could only get 70.48%, which was in line with my expectations. Using TDNet as a segmentation model (which doesn't use native full-scale predictions), I saw the same 1-2% degradation when using full-scale predictions instead of the 50% predictions I trained on. Due to my limited resources I couldn't try full-scale training, so your findings are still quite interesting for me!
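For reference, a hedged sketch of the upscaling evaluation described above, assuming a generic PyTorch segmentation model that returns a single logits tensor (`model`, shapes, and the function name are placeholders, not PIDNet's actual interface): run inference on the downscaled input, then bilinearly upsample the logits to the ground-truth resolution before taking the argmax and computing mIoU.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def predict_full_scale(model, image_half, full_hw):
    """Run inference at 50% scale, then upsample logits to full size.

    image_half: (N, 3, H/2, W/2) tensor, already downscaled.
    full_hw:    (H, W) of the full-resolution ground truth.
    """
    logits = model(image_half)                       # (N, C, h, w)
    logits = F.interpolate(logits, size=full_hw,
                           mode='bilinear', align_corners=False)
    return logits.argmax(dim=1)                      # (N, H, W) class map
```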