plemeri / InSPyReNet

Official PyTorch implementation of Revisiting Image Pyramid Structure for High Resolution Salient Object Detection (ACCV 2022)
MIT License

Train with another shape #31

Closed chauthehan closed 11 months ago

chauthehan commented 1 year ago

Thank you for your great work. I want to train this on my custom dataset, but my images are much wider than they are tall, e.g. 1300*48. Is it okay to use the original config on my dataset, given that the original config is for square images?

plemeri commented 1 year ago

Hello, you can change the shape as needed. Set Model.base_size and Train.Dataset.transforms.static_resize.size to something like $1024 \times 32$, since the network architecture favors powers of two. Also, for the inference stage, I recommend using static_resize rather than dynamic_resize, since I'm not sure how well it performs on images with such an extreme aspect ratio. Try using the same resize method as in training.
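
As a rough illustration of how 1300*48 can map to $1024 \times 32$, here is a minimal sketch that rounds each dimension down to the largest power of two that fits. Rounding down is an assumption made for this example; the reply only gives $1024 \times 32$ as one possibility, and rounding up (for instance to 64 on the short side) may work just as well. The helper names `floor_power_of_two` and `suggest_size` are hypothetical and not part of InSPyReNet.

```python
from typing import Tuple


def floor_power_of_two(n: int) -> int:
    """Return the largest power of two that is <= n (n must be >= 1)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # bit_length() is the position of the highest set bit, so shifting 1
    # by (bit_length - 1) keeps only that bit.
    return 1 << (n.bit_length() - 1)


def suggest_size(a: int, b: int) -> Tuple[int, int]:
    """Round both dimensions down to powers of two (hypothetical helper).

    The values are returned in the order they were given; use whatever
    order the config expects, which this thread does not state.
    """
    return floor_power_of_two(a), floor_power_of_two(b)


print(suggest_size(1300, 48))  # (1024, 32)
```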


Model:
    name: "InSPyReNet_SwinB"
    depth: 64
    pretrained: True
    base_size: [384, 384] --> [1024, 32]
    threshold: 512

Train:
    Dataset:
        type: "RGB_Dataset"
        root: "data/Train_Dataset"
        sets: ['DUTS-TR'] --> [YOUR_DATASET_FOLDER_NAME]
        transforms:
            static_resize: 
                size: [384, 384] --> [1024, 32]
....

Test:
    Dataset:
        type: "RGB_Dataset"
        root: "data/Test_Dataset"
        sets:  ['DUTS-TE', 'DUT-OMRON', 'ECSSD', 'HKU-IS', 'PASCAL-S', 'DAVIS-S', 'HRSOD-TE', 'UHRSD-TE'] --> [YOUR_DATASET_FOLDER_NAME]
        transforms:
            dynamic_resize: --> static_resize:
                L: 1280     --> size: [1024, 32]

Thanks.
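
The `-->` arrows in the config above mark edits and are not valid YAML themselves. The following sketch applies the same edits to a nested dict, assuming the YAML file has already been loaded into plain Python dicts by some parser. Only the keys shown in the thread are included; the function `adapt_config` and the dataset name `MyWideDataset` are hypothetical placeholders, not part of the repository.

```python
import copy

# Subset of the original config from the thread, as a plain nested dict.
original = {
    "Model": {"base_size": [384, 384]},
    "Train": {"Dataset": {
        "sets": ["DUTS-TR"],
        "transforms": {"static_resize": {"size": [384, 384]}},
    }},
    "Test": {"Dataset": {
        "sets": ["DUTS-TE"],
        "transforms": {"dynamic_resize": {"L": 1280}},
    }},
}


def adapt_config(cfg, size, dataset_name):
    """Return a copy of cfg with the edits suggested in the thread applied."""
    cfg = copy.deepcopy(cfg)  # leave the caller's config untouched
    size = list(size)

    cfg["Model"]["base_size"] = size

    train = cfg["Train"]["Dataset"]
    train["sets"] = [dataset_name]
    train["transforms"]["static_resize"]["size"] = size

    test = cfg["Test"]["Dataset"]
    test["sets"] = [dataset_name]
    # Swap dynamic_resize for static_resize while keeping its position
    # among any other transforms.
    new_transforms = {}
    for name, params in test["transforms"].items():
        if name == "dynamic_resize":
            new_transforms["static_resize"] = {"size": size}
        else:
            new_transforms[name] = params
    test["transforms"] = new_transforms
    return cfg


adapted = adapt_config(original, (1024, 32), "MyWideDataset")
print(adapted["Test"]["Dataset"]["transforms"])
```

Using the same `static_resize` size for both training and testing follows the advice in the reply to keep the resize method consistent between the two stages.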