Hi @TNA8, thank you for your kind words. I hope this helps.
Thanks for your kind reply.
I have a dataset with one class: pairs of an input image (containing many objects of one class) and a labeled mask.
To apply your approach, I have to create `SegmentationClassAug` and `saliency_unsupervised_model` from the original masks. Then, to train the model, what specific configuration is needed for this one-class problem?
Could you help me?
Best regards,
Firstly, to create `SegmentationClassAug`, you should put all the labels under a directory with this name.

To create `saliency_unsupervised_model`:

1. Download BASNet.
2. From the following link, you can download the model that we trained with the previously mentioned method: BASNet pretrained.
3. As explained in the BASNet repository, create a `saved_models/basnet_bsi/` directory and copy `basnet.pth` there.
4. Put your images under `test_data/test_images/` and run BASNet with the command `python basnet_test.py` to get the unsupervised saliency estimations for your images under the directory `test_data/test_results/`.
5. Finally, put these results under a directory named `saliency_unsupervised_model`.

For the masks under both `SegmentationClassAug` and `saliency_unsupervised_model`, the names should match the names of your images except for the extension (i.e. `JPEGImages/2008_000008.jpg`, `SegmentationClassAug/2008_000008.png`, and `saliency_unsupervised_model/2008_000008.png`).
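As a quick sanity check for this naming convention, a minimal sketch like the following could verify that every image has a matching mask and saliency map (my own sketch, not part of the repo; `data/pascal` is a hypothetical root):

```python
from pathlib import Path

root = Path("data/pascal")  # hypothetical data root; adjust to your setup

for img in sorted((root / "JPEGImages").glob("*.jpg")):
    # Masks and saliency maps share the image's stem, with a .png extension.
    mask = root / "SegmentationClassAug" / f"{img.stem}.png"
    saliency = root / "saliency_unsupervised_model" / f"{img.stem}.png"
    for f in (mask, saliency):
        if not f.exists():
            print(f"missing: {f}")
```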
For the configuration, here is a sample:
```yaml
DATA:
  train_split: 0
  sup_aug: True
  query_aug: True
  image_size: 400
  use_all_classes: False
  use_split_coco: False
  train_name: pascal
  test_name: default
  test_split: default
  train_list: lists/pascal/train_masksplit.txt  # change to your own training list
  data_root:                                    # do not forget to add the path to your data root
  val_list: lists/pascal/val.txt                # change to your own validation list
  workers: 4
  vcrop_range: [-40, 40]
  vcrop_ignore_support: True
  alternate: True
  vsplit_prob: 1.0
  hsplit_prob: 0.0
  hsplit: False
  vsplit: True
  num_classes_val: 1
TRAIN:
  ckpt_path: checkpoints/
  batch_size: 16
  epochs: 100
  strategy: "unsupervised_fbf"
EVALUATION:
  shot: 1
  visualize: False
  ckpt_used: ""
MODEL:
  arch: resnet
  layers: 101
  pretrained: True  # means the backbone has been pre-trained
  model_name: Masksplit
DISTRIBUTED:
  gpus: 0
```
In this configuration, change `train_list`, `val_list` and `data_root`. Moreover, in `src/dataset/dataset.py`, replace line 68 with `class_list = [1]`. Similarly, in the same file, replace lines 269-273 with the line `class_list = [1]`.
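For reference, the override at both locations is a single line (shown here in isolation; the surrounding code stays as it is in the repo):

```python
# src/dataset/dataset.py, line 68 (and again in place of lines 269-273):
# restrict the dataset to a single foreground class with label id 1.
class_list = [1]
```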
One final thing: I think there has been a recent update to PyTorch Lightning, so please use the version given in the requirements.
I hope this helps. Edit: `num_classes_val` should also be set to 1 in the config file.
Thanks a lot for your detailed explanation. I will try it and will be back with the result. :)
One question: this approach is based on self-supervised learning, a.k.a. unsupervised learning. To my knowledge, self-supervised learning trains a model on unlabeled data by creating pseudo-labels. After that, we can use the pretrained model for a downstream task using labeled data.
Why do we need labeled data for training?
Thanks.
The ground-truth masks are only used for validation and testing purposes. The `copy_paste_loader` function provides the dataloader that we use for training, and if you check the code, you can see it only provides saliency maps for training.
Moreover, our approach does not exactly create a pretrained model that can be used for a downstream task; it directly creates a few-shot segmentation model. Given an image, we first divide the saliency estimation in half, using a line with a slope. Then we apply two sets of augmentations to obtain a task on which we can train our model. Basically, the goal here is, given one half of the saliency mask, to predict the other half. Our experiments show that applying different augmentations to support and query, combined with using different halves of the saliency estimation for support and query, enables us to generalize to the few-shot task.
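For intuition, here is a minimal sketch of the splitting step described above (my own illustration, not the repo's implementation): a binary saliency mask is divided into two halves along a sloped line.

```python
import numpy as np

def split_saliency(mask: np.ndarray, slope: float):
    """Split a non-empty (H, W) binary mask into two halves along a
    sloped line through the centroid of its salient pixels."""
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()                        # centroid of the saliency
    yy, xx = np.mgrid[0:mask.shape[0], 0:mask.shape[1]]
    side = (yy - cy) - slope * (xx - cx)                 # signed side of the line
    return mask * (side <= 0), mask * (side > 0)

# Example: one half serves as the support mask, the other as the query target.
mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:6, 2:6] = 1
support_half, query_half = split_saliency(mask, slope=0.3)
```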
I tried to explain our approach as clearly as possible. For further details, you can also check our paper, which is available at https://arxiv.org/abs/2110.12207.
Thanks. For training, pairs of an input image and its saliency map are enough, so we don't need to have masks for all the images. Some masks for testing and validation are enough. Am I right?
Exactly. However, while creating the image list files, we added the saliency names after the image and ground-truth mask paths, so the parser expects some string in the middle (`JPEGImages/2008_000008.jpg SegmentationClassAug/2008_000008.png saliency_unsupervised_model/2008_000008.png`). Since this is not necessary, the code can be updated so that it only uses a text file with the image path and the saliency path. Another solution could be to provide a simple placeholder string, since it is not used. That is, a line in your `train_list.txt` file could be `JPEGImages/2008_000008.jpg abc saliency_unsupervised_model/2008_000008.png`.
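For convenience, a small sketch like the following (my own; the directory layout and the `abc` placeholder follow the convention above) could generate such a training list:

```python
from pathlib import Path

root = Path("data/pascal")  # hypothetical data root; adjust to your setup
lines = []
for img in sorted((root / "JPEGImages").glob("*.jpg")):
    # The middle field is a dummy, since ground-truth masks are unused in training.
    lines.append(f"JPEGImages/{img.name} abc saliency_unsupervised_model/{img.stem}.png")

Path("lists/pascal").mkdir(parents=True, exist_ok=True)
Path("lists/pascal/train_masksplit.txt").write_text("\n".join(lines) + "\n")
```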
Thanks for your excellent project.
I have a question about the PASCAL dataset: how should `SegmentationClassAug` and `saliency_unsupervised_model` be prepared? `SegmentationClassAug` contains only the outlines of the classes. How can we label the pixels inside a class? For example, in the image I attached, only the outline of the class is marked. Thanks in advance.