Tensor-Reloaded / Convergence

Artificial Neural Network training convergence boosting by smart data ordering
Academic Free License v3.0

Experiment Relation to Lottery Ticket #6

Open simi2525 opened 4 years ago

simi2525 commented 4 years ago

Attempt the classic Lottery Ticket method: train a large model, prune it, and then retrain the subnet forward from a previous state.

Try to train the subnet from a random initialization but with an "ideal" training order (found by brute force).

The current thinking around tickets is that the initialization of a given subnet is the defining feature of the ticket; the best-known way of training the subnet is currently IMP with the rewinding technique.

Perhaps an optimal training order would minimize the effect of the subnet's initialization.

Let's see whether the difference in accuracy between (random subnet initialization + perfect order) and IMP with rewinding is reduced.
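As a concrete illustration of the brute-force part of the proposal, here is a minimal sketch (not the repo's actual code) of searching over every permutation of a tiny training set and keeping the ordering that yields the best post-training accuracy. The perceptron, the toy data, and all names are hypothetical stand-ins for the real model and dataset:

```python
import itertools
import numpy as np

# Hypothetical toy data: 4 linearly separable points, labeled by sign(x1 - x2).
X = np.array([[2.0, 0.0], [0.0, 2.0], [3.0, 1.0], [1.0, 3.0]])
y = np.array([1, -1, 1, -1])

def train_in_order(order, lr=0.5):
    """One online perceptron pass over the examples in the given order."""
    w = np.zeros(2)  # the initialization is fixed across all orderings
    for i in order:
        pred = np.sign(w @ X[i]) or 1.0  # break the sign(0) tie toward +1
        if pred != y[i]:
            w += lr * y[i] * X[i]        # perceptron update on a mistake
    return w

def accuracy(w):
    preds = np.where(X @ w >= 0, 1, -1)
    return (preds == y).mean()

# Brute force: evaluate every permutation of the training order and keep
# the one with the best final accuracy -- the "ideal" order for this run.
best_order, best_acc = None, -1.0
for order in itertools.permutations(range(len(X))):
    acc = accuracy(train_in_order(order))
    if acc > best_acc:
        best_order, best_acc = order, acc

print(best_order, best_acc)
```

For n examples this search costs n! training runs, which is exactly why the issue calls it brute force; any real experiment would need far fewer examples or a heuristic search over orderings.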

simi2525 commented 4 years ago

In the 2019 Frankle et al. paper, the researchers presented a new technique called Iterative Magnitude Pruning (IMP) with rewinding.

Instead of looking at the neurons' weights at initialization (rewinding to iteration zero), it looks at their weights after several training iterations (rewinding to iteration k).

The rationale behind the technique is that in many architectures, some neurons already have a “winning” weight at initialization, while others only reach a “winning” weight after some training.

The rationale for using a perfect ordering is the same: it should increase a given ticket's chance of reaching a "winning" state after some training.
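The IMP-with-rewinding loop described above can be sketched on a toy model. The following is a minimal NumPy illustration, not the paper's implementation: a logistic-regression "network" is trained, its smallest-magnitude weights are pruned, and the surviving weights are rewound to their iteration-k values before retraining. The data, the pruning fraction, and the step counts are all made-up assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 10 features, but only the first 3 carry signal.
X = rng.normal(size=(200, 10))
true_w = np.zeros(10)
true_w[:3] = [3.0, -2.0, 1.5]
y = (X @ true_w > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(w, mask, steps, lr=0.1):
    """Gradient descent; pruned weights (mask == 0) stay frozen at zero."""
    w = w * mask
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ w) - y) / len(X)
        w -= lr * grad * mask
    return w

w_init = rng.normal(scale=0.1, size=10)
mask = np.ones(10)

# Warm-up: train for k iterations and save the weights to rewind to.
k = 20
w_k = train(w_init.copy(), mask, steps=k)

# IMP with rewinding: repeat (train, prune smallest-magnitude survivors,
# rewind the survivors to their iteration-k values).
prune_frac = 0.3
for _ in range(4):
    w_final = train(w_k.copy(), mask, steps=300)
    alive = np.flatnonzero(mask)
    n_prune = max(1, int(prune_frac * len(alive)))
    prune_idx = alive[np.argsort(np.abs(w_final[alive]))[:n_prune]]
    mask[prune_idx] = 0.0   # prune the smallest-magnitude surviving weights
    w_k = w_k * mask        # rewind survivors to iteration k, zero the rest

# Retrain the final ticket from the rewound weights and measure accuracy.
w_ticket = train(w_k.copy(), mask, steps=300)
acc = ((sigmoid(X @ w_ticket) > 0.5) == y).mean()
print(int(mask.sum()), acc)
```

The key difference from the original lottery-ticket procedure is the `w_k = w_k * mask` rewind step: survivors restart from their iteration-k values rather than from `w_init`, matching the rationale that some weights only become "winning" after some training.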