Fix scale to work as documented

clpetix commented 5 months ago

See issue #251

clpetix commented 5 months ago

@mphoward I've fixed this bug and added a unit test to catch it in the future! I think the tests are failing related to numpy 2.0, so this should be good for you to look at when you get a chance!

mphoward commented 5 months ago

Thanks! This should fix the issue.

While investigating this and looking at the manuscript, I realized that the LineSearch is using unscaled variables. This is consistent with what is written in the documentation, but I think is inconsistent with the SteepestDescent using scaled variables to take steps. Could you look at this to confirm and see if you think we should add scale as an argument of find? We could default it to scale=1.0 for backward compatibility.

If you agree, I would try to fix that in this PR since it is a related issue.

clpetix commented 5 months ago

Thanks! This should fix the issue.

While investigating this and looking at the manuscript, I realized that the LineSearch is using unscaled variables. This is consistent with what is written in the documentation, but I think is inconsistent with the SteepestDescent using scaled variables to take steps. Could you look at this to confirm and see if you think we should add scale as an argument of find? We could default it to scale=1.0 for backward compatibility.

If you agree, I would try to fix that in this PR since it is a related issue.

So currently SteepestDescent is being updated using the scaled variables but LineSearch is checking based on unscaled variables? I think we would just need to add in scale=1.0 as suggested and turn: https://github.com/mphowardlab/relentless/blob/09505911377030e1db8786d6d8d412bbe785589e/src/relentless/optimize/method.py#L192 to x.value = start.variables[x] + scale**2 * new_step * d[x]. I think we need the scale**2 because the gradient doesn't have the extra factor from using scaled variables like it does in steepest descent here: https://github.com/mphowardlab/relentless/blob/09505911377030e1db8786d6d8d412bbe785589e/src/relentless/optimize/method.py#L400

Does this seem right to you, @mphoward ?

mphoward commented 5 months ago

I think we need to apply one scale factor to any gradient that is dotted against d, which turns it into the gradient with respect to the scaled parameters. See Eq. (18) of the draft manuscript.

I also think when we calculate d in the first place, we should be dividing by scale. That would calculate the descent direction in scaled coordinates. See Eqs. (17) and (18) of the draft manuscript.

If you scale d when you compute it, then I think you will need to multiply by one factor of scale in line 192 to make the update apply to the unscaled parameters.

clpetix commented 5 months ago

@mphoward that all makes sense! I think I have those changes implemented as you noted!

mphoward commented 5 months ago

Thanks, here are a few things I noticed:

The multiplication of the gradient by scale needs to be done on the gradient directly (not outside the dot) in case scale is a KeyedArray.
I think you may have missed some instances of dotting in find, so please search for those.
You should copy the logic from SteepestDescent for converting the scale (as given) into a KeyedArray with the right keys, in case. Alternatively, we need to document that scale either needs to be just a float or a properly keyed KeyedArray.

clpetix commented 5 months ago

@mphoward here are those changes!

mphowardlab / relentless

Fix scale to work as documented #252