GilesStrong / tomopt

TomOpt: Differential Muon Tomography Optimisation
GNU Affero General Public License v3.0

PanelOptConfig & CostCoefWarmup don't account for momentum or optimiser history #142

Closed GilesStrong closed 1 year ago

GilesStrong commented 1 year ago

Current state

CostCoefWarmup and PanelOptConfig both act by 'freezing' detector optimisation for a few warm-up epochs in order to set suitable aspects of the loss and optimisers based on the initial state of the detector. They currently implement freezing by setting the optimiser learning rates to zero.

Potential problems

Setting the LR to zero works fine for SGD without momentum. However, if the optimiser has momentum, it still accumulates momentum during the frozen epochs, and can then implement changes to the detectors through updates based on this stale momentum. Additionally, some optimisers, e.g. Adam & RMSProp, track a history of gradients which is used to adapt the effective learning rate and momentum coefficients used in future updates, so even setting the momentum coefficient to zero won't help (nor would setting grads to zero).
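To make the momentum problem concrete, here is a minimal pure-Python sketch of the SGD-with-momentum update rule (buffer = momentum * buffer + grad, then param -= lr * buffer). The function name `sgd_momentum_step` and the constant gradient of 1.0 are illustrative assumptions, not code from tomopt. It shows that a zero LR leaves the parameter untouched during warm-up but still builds up the momentum buffer, so the first real update is much larger than it would be from a fresh optimiser.

```python
def sgd_momentum_step(param, grad, buf, lr, momentum=0.9):
    """One SGD-with-momentum update (illustrative helper, not a tomopt API)."""
    buf = momentum * buf + grad   # the buffer is updated regardless of lr
    param = param - lr * buf      # lr = 0 only cancels this line
    return param, buf


# "Freeze" for five warm-up steps by setting lr = 0 (assumed constant gradient)
param, buf = 1.0, 0.0
for _ in range(5):
    param, buf = sgd_momentum_step(param, grad=1.0, buf=buf, lr=0.0)
frozen_param, warm_buf = param, buf  # parameter unchanged, buffer is not

# First real step after warm-up, starting from the stale buffer
stale_param, _ = sgd_momentum_step(frozen_param, grad=1.0, buf=warm_buf, lr=0.1)

# Same step from an optimiser that was never stepped during warm-up
fresh_param, _ = sgd_momentum_step(1.0, grad=1.0, buf=0.0, lr=0.1)

stale_step = frozen_param - stale_param  # size of the update with stale momentum
fresh_step = 1.0 - fresh_param           # size of the update with no history
```

Here `stale_step` is several times larger than `fresh_step`, purely because of the buffer built up while the LR was zero. Adam and RMSProp behave analogously: their running gradient statistics are updated on every step call, whatever the LR.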

Proposed solution

CostCoefWarmup and PanelOptConfig should not alter the optimiser hyper-parameters; instead they should set a flag in the volume wrapper's fit_params that tells the wrapper to skip the optimiser step calls.
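A rough sketch of how such a flag could work is below. All names here (`FitParams`, `skip_opt_step`, `WarmupCallback`, `MomentumSGD`, `fit`) are hypothetical stand-ins chosen for illustration and do not reflect the actual tomopt classes or attribute names. The point is only that, when the step call is skipped, the optimiser state is never touched during warm-up.

```python
from dataclasses import dataclass


@dataclass
class FitParams:
    """Hypothetical stand-in for the volume wrapper's fit_params."""
    skip_opt_step: bool = False  # hypothetical flag name


class MomentumSGD:
    """Toy optimiser that records how often step() was called."""
    def __init__(self, lr, momentum=0.9):
        self.lr, self.momentum = lr, momentum
        self.buf, self.n_steps = 0.0, 0

    def step(self, param, grad):
        self.buf = self.momentum * self.buf + grad
        self.n_steps += 1
        return param - self.lr * self.buf


class WarmupCallback:
    """Hypothetical callback: requests that optimiser steps be skipped during warm-up."""
    def __init__(self, n_warmup):
        self.n_warmup = n_warmup

    def on_epoch_begin(self, epoch, fit_params):
        fit_params.skip_opt_step = epoch < self.n_warmup


def fit(n_epochs, callback, opt, param=1.0):
    """Toy fit loop: the wrapper, not the callback, decides whether to step."""
    fit_params = FitParams()
    for epoch in range(n_epochs):
        callback.on_epoch_begin(epoch, fit_params)
        grad = 1.0  # placeholder; a real wrapper would backpropagate the loss
        if fit_params.skip_opt_step:
            continue  # no step call, so momentum and history stay untouched
        param = opt.step(param, grad)
    return param


opt = MomentumSGD(lr=0.1)
final_param = fit(n_epochs=4, callback=WarmupCallback(n_warmup=3), opt=opt)
```

With three warm-up epochs out of four, `opt.n_steps` ends at 1 and the single update starts from an empty momentum buffer, which is the behaviour the proposal is after.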