a76yyyy closed 2 years ago
Who can answer my questions?
The Flipout paper uses the Rademacher distribution, which is core to the derivation of the algorithm. E.g., in Observation 1:
Let E be a random sign matrix that is independent of ΔŴ. Then ΔW = ΔŴ ◦ E is identically distributed to ΔŴ.
This would not work with a uniform distribution.
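A quick numerical check of this point, as a rough sketch only: it assumes a standard normal base perturbation and uses hypothetical variable names, none of which come from the paper's code. Multiplying by random ±1 signs leaves the variance unchanged, while multiplying by continuous uniform values on [-1, 1) shrinks it to about one third, so the result is no longer distributed like the original perturbation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Base perturbation, symmetric about zero (standard normal is an assumption).
delta_w_hat = rng.standard_normal(n)

# Rademacher signs: -1 or +1, each with probability 1/2.
e_sign = rng.choice([-1.0, 1.0], size=n)

# Continuous uniform multipliers on [-1, 1), for comparison.
e_unif = rng.uniform(-1.0, 1.0, size=n)

var_base = delta_w_hat.var()
var_sign = (delta_w_hat * e_sign).var()   # stays close to var_base
var_unif = (delta_w_hat * e_unif).var()   # close to var_base / 3

print(f"base: {var_base:.3f}  sign-flipped: {var_sign:.3f}  uniform-scaled: {var_unif:.3f}")
```

Note that sampling uniformly from the two-point set {-1, +1}, as in `e_sign` above, is exactly a Rademacher draw; only the continuous uniform case breaks the observation.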
In the paper, the original text is: "r and s are random vectors whose entries are sampled uniformly from ±1".
I think this refers to a uniform distribution rather than the Rademacher distribution.
Am I misunderstanding something? If so, which article explains the effectiveness of the Rademacher distribution?