siyuhuang / TOD

PyTorch Implementation of Temporal Output Discrepancy for Active Learning, ICCV 2021
MIT License

Code for generating Figure 1 #2

Closed QiushiYang closed 3 years ago

QiushiYang commented 3 years ago

Thanks a lot for sharing the code for your interesting work. I wonder how Figure 1 is calculated and generated. Would you mind sharing the code for it? Besides, I have two other questions: (1) According to Corollary 2, the factors η and C (a constant) need to be taken into account when estimating the losses, but there seem to be no related parameters in the code. Does this have any influence on performance? (2) Did you test or apply TOD on general semi-supervised learning tasks? Many thanks.

siyuhuang commented 3 years ago

It is very easy to get Fig. 1. The following is the code for computing the output gradient norm of a batch of samples.

import torch
import torch.nn.functional as F

scores = model(inputs)
scores = F.softmax(scores, dim=1)
scores = torch.sum(scores) / scores.size(0)   # mean softmax output over the batch
scores.backward()

total_norm = 0.0
for p in model.parameters():
    if p.grad is not None:
        param_norm = p.grad.data.norm(2)
        total_norm += param_norm.item() ** 2
total_norm = total_norm ** 0.5   # L2 norm over all parameter gradients
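
For reference, here is a minimal sketch that wraps the snippet above into a reusable helper and collects one norm per batch. The helper name, the model.zero_grad() call, and the unlabeled_loader in the usage line are my own assumptions, not part of the original reply.

import torch
import torch.nn.functional as F

def batch_output_grad_norm(model, inputs):
    # Hypothetical helper wrapping the snippet above.
    model.zero_grad()                                # clear gradients from any previous batch
    scores = model(inputs)
    scores = F.softmax(scores, dim=1)
    scores = torch.sum(scores) / scores.size(0)      # mean softmax output over the batch
    scores.backward()
    total_norm = 0.0
    for p in model.parameters():
        if p.grad is not None:
            total_norm += p.grad.data.norm(2).item() ** 2
    return total_norm ** 0.5

# Hypothetical usage: one gradient-norm value per batch of unlabeled samples.
# grad_norms = [batch_output_grad_norm(model, x) for x, _ in unlabeled_loader]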

Q(1): η is the learning rate. Both η and C are constant during training and testing, so they can be ignored when computing TOD.
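
One way to see why the constants drop out: in active learning, samples are selected by ranking their TOD scores, and multiplying every score by the same positive constant (such as η or C) leaves that ranking unchanged. A minimal sketch of this, with made-up tensor values for illustration:

import torch

tod_scores = torch.tensor([0.3, 1.2, 0.7, 0.05])   # made-up per-sample TOD values
eta = 0.1                                          # a positive constant, e.g. the learning rate

rank_raw = torch.argsort(tod_scores, descending=True)
rank_scaled = torch.argsort(eta * tod_scores, descending=True)
assert torch.equal(rank_raw, rank_scaled)          # positive scaling preserves the selection order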

Q(2): We only tested TOD-based semi-supervised training in active learning settings. For general semi-supervised learning tasks, please refer to Mean Teacher [1], which is similar to the semi-supervised part of TOD.

[1] Antti Tarvainen and Harri Valpola. Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. In NIPS, pages 1195–1204, 2017.
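
For readers unfamiliar with Mean Teacher [1], here is a minimal sketch of its core idea (an EMA teacher plus a consistency loss on unlabeled data). The placeholder model, the alpha value, and the function names are illustrative assumptions, not code from this repository.

import copy
import torch
import torch.nn.functional as F

student = torch.nn.Linear(10, 2)          # placeholder model for illustration
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)               # the teacher is only updated via EMA

def ema_update(student, teacher, alpha=0.99):
    # Teacher weights are an exponential moving average of student weights.
    with torch.no_grad():
        for ps, pt in zip(student.parameters(), teacher.parameters()):
            pt.mul_(alpha).add_(ps, alpha=1 - alpha)

def consistency_loss(x_unlabeled):
    # Unsupervised consistency: student and teacher should agree on unlabeled inputs.
    return F.mse_loss(student(x_unlabeled), teacher(x_unlabeled))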

QiushiYang commented 3 years ago

Great! Thanks a lot for your kind help!