Heuristically provide an initial step size for the conjugate gradient steps that will automatically scale with problem size and other parameters. Previously, the initial step length for the backtracking line search was fixed to 1. This causes many backtracks or an insufficiently aggressive updates.
Approach
The initial step size is estimated to be 2 * |F*Fm| / |m| where F is the forward operator, F* is the adjoint operator, m are the inputs to the operator that will be updated by this step, and || signifies the complex L2 norm.
Initially, I was trying to make all of the operator in tike 'normalized' operators, but 'normalized operators' adjoint and forward operators must span the same space. This means that operators like the radon transform cannot be 'normalized' for arbitrary geometries. I also considered making all the operators intrinsically 'scaled' so that |F*Fm| / |m| == 1 is always true. However, this seems to be a bad/difficult approach because it is incompatible with certain properties of the operators. For example, the forward radon transform should be equivalent to line integrals across the field of view, but this property is not compatible with the aforementioned scaling.
Pre-Merge Checklists
Submitter
[x] Write a helpfully descriptive pull request title.
[x] Organize changes into logically grouped commits with descriptive commit messages.
[x] Document all new functions.
[x] Write tests for new functions or explain why they are not needed.
Purpose
Heuristically provide an initial step size for the conjugate gradient steps that will automatically scale with problem size and other parameters. Previously, the initial step length for the backtracking line search was fixed to 1. This causes many backtracks or an insufficiently aggressive updates.
Approach
The initial step size is estimated to be
2 * |F*Fm| / |m|
whereF
is the forward operator,F*
is the adjoint operator,m
are the inputs to the operator that will be updated by this step, and||
signifies the complex L2 norm.Initially, I was trying to make all of the operator in tike 'normalized' operators, but 'normalized operators' adjoint and forward operators must span the same space. This means that operators like the radon transform cannot be 'normalized' for arbitrary geometries. I also considered making all the operators intrinsically 'scaled' so that
|F*Fm| / |m| == 1
is always true. However, this seems to be a bad/difficult approach because it is incompatible with certain properties of the operators. For example, the forward radon transform should be equivalent to line integrals across the field of view, but this property is not compatible with the aforementioned scaling.Pre-Merge Checklists
Submitter
yapf
to format python code.Reviewer