D-X-Y / AutoDL-Projects

Automated deep learning algorithms implemented in PyTorch.

The details of "samples one sub-graph at one training iteration" #12

Closed: kongxz closed this issue 5 years ago

kongxz commented 5 years ago

Can you talk about the details of "samples one sub-graph at one training iteration"?

As far as I know, the result of Gumbel-Softmax may not be a one-hot vector. It may be a vector like [0.96, 0.01, 0.01, 0.01, 0.01].

When you sample one sub-graph during training, do you just drop all the connections with small weights such as 0.01?

Thanks.

D-X-Y commented 5 years ago

Sure, we use the hard mode and thus it is a one-hot vector. Something like this in PyTorch:

import torch
import torch.nn.functional as F
y_soft = F.gumbel_softmax(logits, tau=tau)                    # soft probabilities (logits: architecture parameters, tau: temperature)
index = y_soft.argmax(dim=-1, keepdim=True)
y_hard = torch.zeros_like(y_soft).scatter_(-1, index, 1.0)    # one-hot vector of the sampled op
y_hard = y_hard - y_soft.detach() + y_soft                    # straight-through: one-hot value, gradient of y_soft
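
For reference, assuming a PyTorch version that provides torch.nn.functional.gumbel_softmax with the hard option, the same hard sample with a straight-through gradient can be obtained in a single call:

y_hard = F.gumbel_softmax(logits, tau=tau, hard=True)         # one-hot in the forward, soft gradient in the backward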

During the forward, you could use:

cals = []
for i, w in enumerate(y_hard):    # w is 1 for the sampled op and ~0 for the others
  if w.item() == 1:
    cals.append( op[i](x) * w )   # only the sampled operation is executed
  else:
    cals.append( x )              # un-sampled paths just pass the input through
return sum(cals)
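
Multiplying by w is what keeps the architecture parameters in the autograd graph: w equals 1 in the forward pass, but its gradient is that of the soft Gumbel-Softmax sample, so the logits still receive a gradient even though only one operation was executed.
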
coolKeen commented 5 years ago

Sure, we use the hard mode and thus it is a one-hot vector. Something like this in PyTorch:

y_soft = F.gumbel_softmax(logits, tau=tau)
index = y_soft.argmax(dim=-1, keepdim=True)
y_hard = torch.zeros_like(y_soft).scatter_(-1, index, 1.0)
y_hard = y_hard - y_soft.detach() + y_soft

How do you implement the backward process?

D-X-Y commented 5 years ago

If you implement the forward pass in the above style, PyTorch's autograd handles the backward pass automatically.
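
As an example, here is a minimal self-contained sketch (toy shapes and torch.nn.Linear candidates chosen for illustration, not the code used in this repository) that checks the gradient does reach the architecture logits:

import torch
import torch.nn.functional as F

logits = torch.randn(5, requires_grad=True)                   # architecture parameters for 5 candidate ops
y_soft = F.gumbel_softmax(logits, tau=1.0)                    # soft sample
index = y_soft.argmax(dim=-1, keepdim=True)
y_hard = torch.zeros_like(y_soft).scatter_(-1, index, 1.0)
y_hard = y_hard - y_soft.detach() + y_soft                    # straight-through estimator

x = torch.randn(3)
ops = [torch.nn.Linear(3, 3) for _ in range(5)]               # toy candidate operations
cals = []
for i, w in enumerate(y_hard):
  if i == index.item():
    cals.append(ops[i](x) * w)                                # only the sampled op runs
  else:
    cals.append(x)                                            # other paths pass the input through, as above
out = sum(cals)
out.sum().backward()
print(logits.grad)                                            # non-zero: the gradient reaches the logits via y_soft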

brdav commented 4 years ago

Sure, we use the hard mode and thus it is a one-hot vector. Something like this in PyTorch:

y_soft = F.gumbel_softmax(logits, tau=tau)
index = y_soft.argmax(dim=-1, keepdim=True)
y_hard = torch.zeros_like(y_soft).scatter_(-1, index, 1.0)
y_hard = y_hard - y_soft.detach() + y_soft

During the forward, you could use:

cals = []
for i, w in enumerate(y_hard):
  if w.item() == 1:
    cals.append( op[i](x) * w )
  else:
    cals.append( x )
return sum(cals)

Thanks for the code snippet! Why would you append x for the paths with weight 0? Shouldn't there be no forward propagation?