voldemortX / DST-CBC

Implementation of our Pattern Recognition paper "DMT: Dynamic Mutual Training for Semi-Supervised Learning"
BSD 3-Clause "New" or "Revised" License

When I run the semi-supervised section, the program reports an error, could you help? #3

Closed userhr2333 closed 2 years ago

userhr2333 commented 3 years ago

FileNotFoundError: [Errno 2] No such file or directory: '../dataset/cityscapes/gtFinePseudo/train/krefeld/krefeld_000000_021814_gtFine_labelIds.npy'

I want to know how I can obtain these .npy files.

voldemortX commented 3 years ago

@useunreal The .npy files should be the pseudo labels (they contain float values to represent model confidence). I guess the problem is that you did not generate pseudo labels before semi-supervised training with them. Concretely, DMT starts with fully-supervised training of 2 baselines, then alternates: pseudo labeling, training, pseudo labeling ... Please refer to this file for detailed usage on Cityscapes.
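Roughly, the alternation looks like this (an illustrative Python sketch; these helper names are hypothetical, not this repo's actual functions; the linked script is the real entry point):

```python
# Illustrative sketch of the DMT procedure described above.
# train_supervised, generate_pseudo_labels and train_with_pseudo_labels
# are hypothetical placeholders, not this repo's API.

def train_supervised(model, labeled_data):
    ...  # fully-supervised training on the labeled subset

def generate_pseudo_labels(model, unlabeled_data):
    ...  # writes .npy soft labels (float confidences) and returns their paths

def train_with_pseudo_labels(model, labeled_data, pseudo_labels):
    ...  # semi-supervised training on labeled + pseudo-labeled data

def dmt(model_a, model_b, labeled_data, unlabeled_data, num_iterations):
    # Stage 0: two fully-supervised baselines
    train_supervised(model_a, labeled_data)
    train_supervised(model_b, labeled_data)
    for _ in range(num_iterations):
        # Each model pseudo-labels the unlabeled set for the other
        pseudo_a = generate_pseudo_labels(model_a, unlabeled_data)
        pseudo_b = generate_pseudo_labels(model_b, unlabeled_data)
        # Mutual training: each model learns from the other's pseudo labels
        train_with_pseudo_labels(model_a, labeled_data, pseudo_b)
        train_with_pseudo_labels(model_b, labeled_data, pseudo_a)
```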

userhr2333 commented 3 years ago

The excitement in my heart cannot be expressed in words. Thank you very, very much, I have solved this problem now.

voldemortX commented 3 years ago

> The excitement in my heart cannot be expressed in words. Thank you very, very much, I have solved this problem now.

You're welcome

userhr2333 commented 3 years ago

Now I want to train on my own dataset, could you give me some hints?

voldemortX commented 3 years ago

@useunreal I've not used my code on customized datasets, but I'd recommend matching your own dataset to an already provided dataset, which would be the easiest way and you could even avoid writing much code. For instance, match your dataset's folder names and annotation format to the Cityscapes dataset.

Or if you want full support for your own dataset, you could try writing your own Dataset class like this one.

And don't forget to modify the dataset settings here.

As long as you provide the same loaded images & labels to the training functions, I think you can support your own dataset quite easily.

EDIT: One possible problem with matching Cityscapes: Cityscapes labels need a mapping function to become 0~18; if your own labels are already contiguous class ids, you should get rid of the LabelMap in transforms.
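For reference, a minimal Dataset sketch along those lines (illustrative only: the folder layout is an assumption, and it assumes your labels are already 0~num_classes-1 so no LabelMap is needed):

```python
import os
from PIL import Image
from torch.utils.data import Dataset

class MySegmentationDataset(Dataset):
    # Minimal sketch: assumes <root>/images/<split>/*.png and
    # <root>/labels/<split>/*.png with matching file names, and labels
    # stored as single-channel class-id images (no LabelMap needed).
    def __init__(self, root, image_set='train', transforms=None):
        self.transforms = transforms
        image_dir = os.path.join(root, 'images', image_set)
        label_dir = os.path.join(root, 'labels', image_set)
        names = sorted(os.listdir(image_dir))
        self.images = [os.path.join(image_dir, n) for n in names]
        self.labels = [os.path.join(label_dir, n) for n in names]

    def __getitem__(self, index):
        img = Image.open(self.images[index]).convert('RGB')
        target = Image.open(self.labels[index])  # class ids per pixel
        if self.transforms is not None:
            img, target = self.transforms(img, target)
        return img, target

    def __len__(self):
        return len(self.images)
```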

userhr2333 commented 3 years ago

Thank you, I will follow your tips and hope for a good result.

voldemortX commented 3 years ago

Good luck!

userhr2333 commented 3 years ago

@voldemortX Sorry to disturb you again. I found a strange phenomenon in the training process: the mIoU in the middle of each epoch's training will be very low, but the mIoU in the last round of each epoch will be very high. For example, this is the result in the middle of the epoch:

```
global correct: 81.75
average row correct: ['17.86', '8.52', '81.79', '9.36', '78.80', '71.58', '77.55', '64.03', '50.85', '92.75']
IoU: ['2.84', '7.81', '60.03', '8.76', '66.57', '53.51', '59.87', '52.39', '39.29', '84.12']
mean IoU: 43.52
```

And this is the last result of the epoch:

```
global correct: 92.61
average row correct: ['70.83', '69.37', '92.89', '65.99', '91.53', '90.67', '89.14', '60.24', '68.60', '96.26']
IoU: ['66.16', '51.98', '87.27', '51.69', '85.93', '84.43', '78.98', '54.54', '61.19', '92.00']
mean IoU: 71.42
Epoch time: 579.15s
```

Why are they so different?

voldemortX commented 3 years ago

@useunreal If you mean the middle and the end of each mutual training iteration, it is possible, since with a high learning rate the models first start to diverge (and that's a good thing, since you want them to be different so they can learn from each other).

I remember a mIoU gap of up to 10-20 depending on the dataset. Your result seems a bit extreme; if it is your own dataset, then maybe a lower learning rate can bring better performance.

But if it happens within each epoch, it does seem weird. BTW, is it train mIoU or val mIoU? What I said above is mainly about val mIoU in mutual training.

userhr2333 commented 3 years ago

I am using my own dataset, and this result is val mIoU. I just visualized the final results. I trained two networks, one pre-trained on the COCO dataset and the other not pre-trained. The visualized results show that the pre-trained model performs very poorly, almost all wrong, while without pre-training the first stage (fully supervised) already looks very good.

BTW, I imitated your program and converted the original COCO pre-trained model into a model suitable for my program. The code is as follows (my dataset is divided into 10 categories):

```python
import torch
# deeplab_v2 and save_checkpoint come from this repo's code

hung_coco_filename = 'resnet101COCO-41f33a49.pth'
coco = torch.load(hung_coco_filename)
seg_net = deeplab_v2(num_classes=10)

my_seg = seg_net.state_dict().copy()

seg_shape_not_match = 0
seg_shape_match = 0

for key in coco:
    # Map the COCO checkpoint's layer names to this repo's naming scheme
    if 'layer5' in key:
        my_key = 'classifier.0.convs' + key.split('conv2d_list')[1]
    else:
        my_key = 'backbone.' + key

    # Only copy weights whose shapes match (e.g. skip the classifier head)
    if my_seg[my_key].shape == coco[key].shape:
        seg_shape_match += 1
        my_seg[my_key] = coco[key]
    else:
        seg_shape_not_match += 1

print(str(seg_shape_match) + ' seg shapes matched!')
print(str(seg_shape_not_match) + ' seg shapes are not a match.')

print('Saving models...')

seg_net.load_state_dict(my_seg)

save_checkpoint(net=seg_net, optimizer=None, lr_scheduler=None,
                is_mixed_precision=False,
                filename='seg_coco_resnet101.pt')
print('Complete.')
```

voldemortX commented 3 years ago

COCO pre-training can be harmful for some datasets. If your baseline without pre-training is already good, you can just use 2 randomly initialized models.

I'll inspect your weight conversion code later.

voldemortX commented 3 years ago

@useunreal I think your COCO weight conversion code is correct. Did it print the same number of matched weights as for Cityscapes?

Before diving into SSL, perhaps you should first tune for the best learning rate with fully supervised baselines under different init schemes (random, ImageNet, COCO) to make sure pre-training actually helps your dataset. In my code, ImageNet pre-training is used by default, similar to torchvision. Refer to segmentation/models/segmentation/segmentation.py around line 21.

In my experience, ImageNet pre-training almost always helps segmentation.
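For illustration, the torchvision-style default means the backbone loads ImageNet weights unless told otherwise, something like the following (a sketch, not this repo's exact code; see the file referenced above for the real version):

```python
import torchvision

# ImageNet-pretrained backbone: the default init scheme in this style of API
backbone = torchvision.models.resnet101(pretrained=True)
```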

userhr2333 commented 3 years ago

@voldemortX The result with COCO pre-training is very high, mIoU can reach more than 70%, but the visualized result is very poor. For the model without pre-training, the result is about 68%, but the visualization looks very good. Here I use a learning rate of 0.001, but during training this message occasionally appears:

Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 32768.0

Next, I'll follow your tips and test without pre-training.

voldemortX commented 3 years ago

Your mIoU results seem very normal. I guess it could be a problem with your visualization process; did you use --coco both in training and vis?

The loss being scaled to very large values in mixed precision is perfectly normal.
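For context, that message is dynamic loss scaling at work: when gradients overflow, the optimizer step is skipped and the scale is reduced. A minimal sketch of the same mechanism using torch.cuda.amp (an assumption for illustration; this repo's mixed precision setup may differ):

```python
import torch

# GradScaler keeps a running loss scale; on overflow it skips the
# optimizer step and reduces the scale, which is exactly what the
# "Gradient overflow. Skipping step ..." message reports.
scaler = torch.cuda.amp.GradScaler(init_scale=65536.0)

# Inside a training loop it would be used like:
#   with torch.cuda.amp.autocast():
#       loss = criterion(model(images), targets)
#   scaler.scale(loss).backward()
#   scaler.step(optimizer)  # skipped when gradients contain inf/nan
#   scaler.update()         # reduces the scale after an overflow
```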

userhr2333 commented 3 years ago

Great. As you said, I used --coco in training, but I didn't use --coco in testing, resulting in wrong results. I just added --coco in testing, and the visualization is normal now.

voldemortX commented 3 years ago

Then I guess you don't really need to re-tune the learning rate, although a slight tuning might yield better results.

userhr2333 commented 3 years ago

Thank you for your suggestion. I adjusted the learning rate because with the initial 0.004 my dataset would often show: Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to xxxxx. When I changed it to 0.001, things got better.

voldemortX commented 2 years ago

It seems this issue is resolved, I'll close for now. Feel free to reopen.