nomuramasahir0 / crfmnes

(CEC2022) Fast Moving Natural Evolution Strategy for High-Dimensional Problems
https://arxiv.org/abs/2201.11422
MIT License

A simple fix for h_inv overflow with large dimensions #4

Closed nhansendev closed 1 year ago

nhansendev commented 1 year ago

I noticed that as the number of dimensions grows in CRFMNES(dim, ...), the values calculated by "f" during the get_h_inv(dim) calculation reach magnitudes of roughly 10^300, eventually causing an overflow for high enough dim values. It also makes the calculation take many iterations to converge: it first "explodes", then very gradually approaches the 1e-10 target value.

A simple fix I've found is to change the initial h_inv value to anything between 2 and 10 (6 seems like a good value). The loop then converges rapidly to the same final values without exploding.

I suggest implementing this change unless there are drawbacks that I've missed.
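
For anyone who wants to reproduce the effect of the starting value, here is a minimal sketch. The forms of f and f_prime below are assumptions made for illustration (they are not quoted in this thread), and newton_h_inv is a hypothetical helper, not a function from the repository. It runs a half-step Newton iteration from a given start and reports how many iterations were needed, or that math.exp overflowed.

```python
import math


def newton_h_inv(dim, h0, tol=1e-10, max_iter=10_000):
    """Damped Newton iteration from h0; returns (h_inv, iterations used)."""

    # ASSUMPTION: illustrative forms of f and f_prime, not quoted in the thread.
    def f(a):
        return (1.0 + a * a) * math.exp(a * a / 2.0) / 0.24 - 10.0 - dim

    def f_prime(a):
        return (1.0 / 0.24) * a * math.exp(a * a / 2.0) * (3.0 + a * a)

    h = h0
    for i in range(max_iter):
        if abs(f(h)) <= tol:
            return h, i
        h -= 0.5 * f(h) / f_prime(h)  # half-length Newton step
    return h, max_iter  # tolerance not reached within max_iter


for dim in (1000, 5000):
    for start in (1.0, 6.0):
        try:
            h, iters = newton_h_inv(dim, start)
            print(f"dim={dim} start={start}: h_inv={h:.6f} after {iters} iterations")
        except OverflowError:
            # math.exp raises OverflowError once a*a/2 exceeds roughly 709
            print(f"dim={dim} start={start}: OverflowError")
```

Under these assumptions, the first step from a start of 1 scales roughly linearly with dim, which is what pushes the exponent out of range for large dim. A start of 6 begins above the root and walks down toward it instead.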

nhansendev commented 1 year ago

Also, for some "dim" values abs(f(h_inv, dim)) will never reach the 1e-10 target (e.g. dim = 800000 stalls at ~2.5e-9). I suggest either adding an iteration limit that raises an error, or tracking the last value of h_inv and exiting early once it stops changing.

# Inside the while loop of get_h_inv:
last = h_inv
h_inv = ...
if abs(h_inv - last) < 1e-16:
    # Exit early since no further improvements are happening
    break
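
Putting both suggestions together, a fuller version might look like the sketch below. It is only an illustration: get_h_inv_guarded, its parameters, and the forms of f and f_prime are assumptions and are not taken from the repository. The stagnation check returns the current value when the update no longer changes h_inv, and the iteration limit raises an error instead of looping forever.

```python
import math


def get_h_inv_guarded(dim, h0=6.0, tol=1e-10, step_tol=1e-16, max_iter=10_000):
    """Half-step Newton solve with a stagnation exit and an iteration limit."""

    # ASSUMPTION: illustrative forms of f and f_prime, not quoted in the thread.
    def f(a):
        return (1.0 + a * a) * math.exp(a * a / 2.0) / 0.24 - 10.0 - dim

    def f_prime(a):
        return (1.0 / 0.24) * a * math.exp(a * a / 2.0) * (3.0 + a * a)

    h_inv = h0
    for _ in range(max_iter):
        if abs(f(h_inv)) <= tol:
            return h_inv  # residual target reached
        last = h_inv
        h_inv = h_inv - 0.5 * f(h_inv) / f_prime(h_inv)
        if abs(h_inv - last) < step_tol:
            return h_inv  # update no longer changes h_inv; accept it
    raise RuntimeError(f"h_inv did not settle within {max_iter} iterations")


print(get_h_inv_guarded(800000))
```

Returning on stagnation treats "cannot improve further in double precision" as good enough, while the RuntimeError covers the case where the iteration never settles at all.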
nomuramasahir0 commented 1 year ago

@Obliman Thank you for reporting this important problem! In fact, we have only run experiments up to 2000 dimensions and have not investigated the behavior in higher dimensions. At how many dimensions have you seen the problematic behavior? We may need to investigate further before making this improvement.

nhansendev commented 1 year ago

I've run some quick trials tracking the value of h_inv during the while loop in get_h_inv. Note that I am using the early-exit check suggested above to approximate convergence and avoid extreme numbers of iterations.

The legend shows pairs of (initial value of h_inv, dim):

[Figure: h_ind]

For very small dim values it may converge faster when h_inv starts at 1 rather than 5, but this quickly changes as dim grows. Any run where an OverflowError occurs is omitted, which includes dim >= ~2048 when h_inv starts at 1. For comparison, you can see how consistent the runs are when h_inv starts at 5 (dashed lines), which allows much higher values for dim.

Here you can see how f(h_inv, dim) explodes, then re-converges over time when h_inv starts at 1 (f_prime(h_inv) has similar behavior). Note that the y-axis is log-scale:

[Figure: h_ind_numerator]
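
A trace like the one plotted can be produced with a few lines. As before, this is a sketch under assumptions: trace_abs_f is a hypothetical helper, and the forms of f and f_prime are illustrative rather than quoted from the repository. It records abs(f) at each iteration so the jump and the slow decline can be inspected or plotted on a log axis.

```python
import math


def trace_abs_f(dim, h0, steps):
    """Return abs(f(h_inv)) for each of the first `steps` half-step Newton iterations."""

    # ASSUMPTION: illustrative forms of f and f_prime, not quoted in the thread.
    def f(a):
        return (1.0 + a * a) * math.exp(a * a / 2.0) / 0.24 - 10.0 - dim

    def f_prime(a):
        return (1.0 / 0.24) * a * math.exp(a * a / 2.0) * (3.0 + a * a)

    h = h0
    trace = []
    for _ in range(steps):
        value = f(h)
        trace.append(abs(value))
        h -= 0.5 * value / f_prime(h)
    return trace


# Print the first few magnitudes for a start of 1 and a start of 6.
for start in (1.0, 6.0):
    magnitudes = trace_abs_f(1000, start, 5)
    print(start, [f"{m:.3e}" for m in magnitudes])
```

With a start of 1 the second entry is many orders of magnitude larger than the first, while a start of 6 gives a sequence that shrinks from the first step onward.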

nomuramasahir0 commented 1 year ago

Fixed. Thanks again, @Obliman! If you run into any other problems, I would be very happy if you could report them in an issue or open a PR.

Best,