tztztztztz / eql.detectron2

The official implementation of Equalization Loss for Long-Tailed Object Recognition (CVPR 2020) based on Detectron2. https://arxiv.org/abs/2003.05176
Apache License 2.0

Gradient analysis #3

Closed Lilyo closed 4 years ago

Lilyo commented 4 years ago

Hi, @tztztztztz! How do you collect the average L2 norm of the weight gradients? Could you provide the detailed steps for generating Fig. 1 in the paper? (It would be even better with source code!)

Thanks a lot!

tztztztztz commented 4 years ago

You can use another weight term to filter out the gradients you don't want to collect.

For example, suppose the number of samples is 5 and the number of classes is 4, and the ground-truth labels are `[0, 1, 2, 3, 3]`. If you want to collect gradients from only the positive samples, the weight would be:

    1 0 0 0
    0 1 0 0
    0 0 1 0
    0 0 0 1
    0 0 0 1

Then you can collect the L2 norm of the weight gradients after each backward pass using a hook.

Similarly, if you want to collect gradients from only the negative samples, the weight would be:

    0 1 1 1
    1 0 1 1
    1 1 0 1
    1 1 1 0
    1 1 1 0
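Both masks above can be built directly from the labels with a one-hot encoding. A minimal PyTorch sketch (tensor names are illustrative, not from the repo):

```python
import torch

# 5 samples, 4 classes, gt_label = [0, 1, 2, 3, 3], as in the example above
num_classes = 4
gt_label = torch.tensor([0, 1, 2, 3, 3])
num_samples = len(gt_label)

# Positive mask: 1 at each sample's ground-truth class, 0 elsewhere
pos_weight = torch.zeros(num_samples, num_classes)
pos_weight[torch.arange(num_samples), gt_label] = 1.0

# Negative mask: the complement selects all non-ground-truth entries
neg_weight = 1.0 - pos_weight
```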

So, the detailed step is:

  1. Determine what type of gradient you want to collect and build the corresponding weight.
  2. Resume the model from a checkpoint.
  3. Run the model for several epochs and collect the L2 norm after each backward pass (you may want to keep the model from updating by setting the learning rate to 0).

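The steps above might look roughly like this PyTorch sketch; `classifier` is a stand-in for the model's last fully connected layer, and the loss is a placeholder for the weighted BCE/EQL loss:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
num_classes, feat_dim = 4, 8
classifier = nn.Linear(feat_dim, num_classes, bias=False)
w0 = classifier.weight.detach().clone()  # snapshot to confirm weights stay fixed

grad_norms = []  # one per-class norm vector per backward pass

def collect(grad):
    # grad has shape (num_classes, feat_dim); take one L2 norm per class row
    grad_norms.append(grad.norm(p=2, dim=1).detach().clone())

handle = classifier.weight.register_hook(collect)

# lr = 0 so stepping the optimizer never changes the model
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.0)

for _ in range(3):  # stands in for "run for several epochs"
    logits = classifier(torch.randn(5, feat_dim))
    loss = logits.sum()  # placeholder; the real loss is the weighted BCE/EQL
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()  # no-op because lr = 0

handle.remove()
avg_norm = torch.stack(grad_norms).mean(dim=0)  # average L2 norm per class
```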
Lilyo commented 4 years ago

It's clearer now, but I'm not sure my understanding is correct. Based on your description, I feed data into a model with fixed weights, then collect the gradients of the (positive/negative) samples in the last classifier layer by using the labels as a class-wise weight mask. Is that right? Thank you for your reply!

tztztztztz commented 4 years ago

I'm not sure what "using the labels as a class-wise weight mask" means. You should first calculate the element-wise binary cross-entropy loss (or EQL loss), multiply it by the weight term I mentioned above, then backpropagate the loss and collect the L2 norm of the gradients for each class.
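A minimal sketch of that masking step (illustrative tensors, not the repo's code): compute the per-element BCE with `reduction="none"`, multiply by the weight mask so only the chosen entries contribute, then sum before calling backward:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
gt_label = torch.tensor([0, 1, 2, 3, 3])
num_classes = 4

target = F.one_hot(gt_label, num_classes).float()
pos_mask = target.clone()  # here: keep only positive entries

logits = torch.randn(5, num_classes, requires_grad=True)
loss = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
loss = (loss * pos_mask).sum()  # masked-out entries contribute nothing
loss.backward()

# logits.grad is now nonzero only at the positive entries; the same
# backward pass delivers the correspondingly masked gradient to the
# classifier weights, where the hook collects the per-class L2 norms.
```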

Lilyo commented 4 years ago

I will try to reproduce the results, thx.