idstcv / ZenNAS


Two questions about the computation of the zen score #28

Closed. carabnuu closed this issue 1 year ago.

carabnuu commented 1 year ago

Hi, the work is excellent! I'm curious and have two questions about the computation of the zen score.

  1. In the original paper you compute the Frobenius norm, while in the code in 'compute_zen_score.py' the zen score is computed with 'torch.abs(output - mixup_output)'.
  2. I also don't understand why we take the differential in this step (step 4 in Algorithm 1). The original paper says "This step replaces the gradient of x with finite differential ∆ to avoid backward-propagation." Could you give more explanation? Thanks a lot!!

MingLin-home commented 1 year ago

Hi carabnuu, thanks for the feedback!

  1. The gradient norm can be approximated by a numerical differential, so we can use norm(f(x1) - f(x2)) / norm(x1 - x2) in place of the gradient norm.
  2. You can actually use any norm function. In our code we use the L1 norm (abs) because it is faster to compute. Feel free to use the L2 norm as in the paper, or any Lp norm you like.
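To make the approximation concrete, here is a minimal sketch of the finite-difference idea. This is only an illustration, not the repository's actual compute_zen_score.py; the names `finite_difference_score` and `mixup_gamma` are hypothetical.

```python
import torch
import torch.nn as nn

def finite_difference_score(model: nn.Module,
                            input_shape=(1, 3, 32, 32),
                            mixup_gamma: float = 1e-2) -> float:
    """Approximate the gradient norm of `model` w.r.t. its input by a
    finite difference, avoiding any backward pass."""
    model.eval()
    with torch.no_grad():
        x1 = torch.randn(*input_shape)
        # Perturb the input by a small random Gaussian: x2 = x1 + gamma * eps.
        x2 = x1 + mixup_gamma * torch.randn(*input_shape)
        out1, out2 = model(x1), model(x2)
        # L1 norm of the output difference. Dividing by norm(x1 - x2) is
        # omitted because that denominator is nearly constant (see below).
        # Swap in an L2 or any Lp norm here if you prefer; the choice only
        # rescales the score.
        return torch.abs(out1 - out2).sum().item()

# Usage: score a random untrained network.
net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
print(finite_difference_score(net))
```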

carabnuu commented 1 year ago

Thanks for your answer! That really helps a lot! But I still have a question about your answer 1, regarding step 4 in Algorithm 1. This step uses a numerical differential to replace the gradient with respect to the input x. But in the expression in the paper, and in 'torch.abs(output - mixup_output)' in the code, I only see Lp-norm(f(x1) - f(x2)); the denominator norm(x1 - x2) is missing, and I don't know why. I might be stuck on a simple question. Thanks again!

MingLin-home commented 1 year ago

norm(x1 - x2) is nearly a constant, because x2 = x1 + epsilon where epsilon is a small random Gaussian perturbation. In high-dimensional space, the norm of a random Gaussian vector concentrates around a constant (roughly sqrt(d) for a standard Gaussian in d dimensions), so dropping the denominator only rescales the score by a constant factor.
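For intuition, here is a quick numerical check of that concentration claim (again an illustrative sketch, not code from the repository): as the dimension d grows, the spread of ||epsilon|| relative to its mean shrinks, so norm(x1 - x2) is nearly the same for every random draw.

```python
import torch

# For eps ~ N(0, I_d), ||eps||_2 concentrates around sqrt(d) as d grows.
for d in (10, 1_000, 10_000):
    norms = torch.randn(2_000, d).norm(dim=1)  # 2,000 random Gaussian vectors
    print(f"d={d:>5}  mean||eps||={norms.mean():.2f}  "
          f"relative std={norms.std() / norms.mean():.4f}  sqrt(d)={d ** 0.5:.2f}")
```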

carabnuu commented 1 year ago

Thank you for your detailed reply!