nnaisense / evotorch

Advanced evolutionary computation library built directly on top of PyTorch, created at NNAISENSE.
https://evotorch.ai
Apache License 2.0

solving complex nonlinear optimization with evotorch #17

Closed. richinex closed this issue 2 years ago.

richinex commented 2 years ago

I have this complex nonlinear optimization problem which I would like to approach with evotorch. I tried to get some clues from the examples but my use case was slightly different.

My first question concerns the bounds: how do I specify bounds for each parameter, given that certain parameters must not exceed certain limits? Second, how do I parallelize the runs? About 5000 iterations takes too long. Finally, which solver would be best for my problem? I currently use SNES.

My example is below. Thanks.

import numpy as np
import torch
from evotorch import Problem
from evotorch.algorithms import SNES
from evotorch.logging import StdOutLogger

# Load in data
F = torch.tensor([1.0000e-01, 1.3000e-01, 1.6900e-01, 2.2000e-01, 2.8600e-01, 3.7100e-01,
        4.8300e-01, 6.2700e-01, 8.1600e-01, 1.0600e+00, 1.3800e+00, 1.7900e+00,
        2.3300e+00, 3.0300e+00, 3.9400e+00, 5.1200e+00, 6.6500e+00, 8.6500e+00,
        1.1200e+01, 1.4600e+01, 1.9000e+01, 2.4700e+01, 3.2100e+01, 4.1800e+01,
        5.4300e+01, 7.0600e+01, 9.1700e+01, 1.1900e+02, 1.5500e+02, 2.0200e+02,
        2.6200e+02, 3.4100e+02, 4.4300e+02, 5.7600e+02, 7.4800e+02, 9.7300e+02,
        1.2600e+03, 1.6400e+03, 2.1400e+03, 2.7800e+03, 3.6100e+03, 4.7000e+03,
        6.1000e+03, 7.9400e+03, 1.0300e+04, 1.3400e+04, 1.7400e+04, 2.2700e+04,
        2.9500e+04, 3.8300e+04, 4.9800e+04, 6.4700e+04, 8.4200e+04, 1.0900e+05],
       dtype=torch.float64)

Z = torch.tensor([45.1000-1.0900j, 47.5000-1.4300j, 46.8000-1.7700j, 46.2000-2.2900j,
        46.2000-2.9700j, 47.2000-3.8000j, 47.0000-4.8500j, 45.1000-5.9900j,
        45.8000-7.3300j, 42.3000-9.0500j, 42.6000-10.2000j, 36.5000-10.8000j,
        34.5000-11.2000j, 32.1000-10.2000j, 30.0000-9.1800j, 29.4000-8.0000j,
        27.3000-6.6400j, 26.7000-5.1800j, 25.3000-4.1200j, 25.4000-3.2600j,
        25.2000-2.5100j, 24.9000-1.9400j, 24.9000-1.6400j, 25.4000-1.3500j,
        25.5000-1.2400j, 24.8000-1.1000j, 24.7000-1.0300j, 23.9000-1.0400j,
        25.2000-1.1000j, 24.9000-1.2700j, 25.0000-1.4600j, 25.4000-1.6500j,
        24.4000-1.9800j, 24.5000-2.3400j, 24.5000-2.9100j, 23.8000-3.4700j,
        22.9000-4.1300j, 22.3000-4.9100j, 20.9000-5.6600j, 20.3000-6.0300j,
        18.4000-6.9600j, 17.6000-7.2400j, 16.5000-7.7400j, 14.3000-7.4200j,
        12.7000-7.1700j, 11.2000-6.7600j,  9.8500-5.8900j,  8.6800-5.3800j,
         7.9200-4.5300j,  7.2000-3.8300j,  6.8100-3.2000j,  6.6500-2.6700j,
         6.1100-2.1600j,  5.8600-1.7700j], dtype=torch.complex128)

sigma = torch.tensor([45.1132, 47.5215, 46.8335, 46.2567, 46.2954, 47.3527, 47.2496, 45.4960,
        46.3829, 43.2573, 43.8041, 38.0643, 36.2724, 33.6816, 31.3731, 30.4690,
        28.0959, 27.1978, 25.6333, 25.6084, 25.3247, 24.9755, 24.9539, 25.4359,
        25.5301, 24.8244, 24.7215, 23.9226, 25.2240, 24.9324, 25.0426, 25.4535,
        24.4802, 24.6115, 24.6722, 24.0516, 23.2694, 22.8341, 21.6528, 21.1767,
        19.6724, 19.0310, 18.2252, 16.1104, 14.5842, 13.0820, 11.4767, 10.2121,
         9.1240,  8.1553,  7.5244,  7.1660,  6.4806,  6.1215],
       dtype=torch.float64) 

p0 = torch.tensor([5, 0.000103, 1, 20, 0.001, 20])

# Define bounds such that the parameter at index 2 is between 0.1 and 1, while parameters 1 and 4 are between 1e-9 and 1e-1
lb = np.full((len(p0),),1e-15)
lb[1] = 1e-9
lb[2] = 0.1
lb[4] = 1e-9
ub = np.full((len(p0),),1e15)
ub[1] = 1e-1
ub[2] = 1.01
ub[4] = 1e-1

# Define model
def fun(p, x):
    w = 2*torch.pi*x
    s = 1j*w
    Rs = p[0]
    Qh = p[1]
    nh = p[2]
    Rct = p[3]
    C1 = p[4]
    R1 = p[5]
    Y1 = s*C1 + 1/R1
    Z1 = 1/Y1
    Zct = Rct + Z1
    Ydl = (s**nh)*Qh
    Yin = Ydl + 1/Zct
    Zin = 1/Yin
    Z = Rs + Zin
    return torch.concat((Z.real, Z.imag),dim = 0)

# Define loss function
def obj_fun(p, x, y, yerr):
    ndata = len(x)
    dof = (2*ndata-(len(p)))
    y_concat = torch.concat([y.real, y.imag], dim = 0)
    sigma = torch.concat([yerr,yerr], dim = 0)
    y_model = fun(p, x)
    chi_sqr = (1/dof)*(torch.sum(torch.abs((1/sigma**2) * (y_concat - y_model)**2)))
    return (chi_sqr)

# Define problem
problem = Problem(
  "min",
  lambda p0: obj_fun(p0, F, Z, sigma),
  initial_bounds=[lb, ub],
  solution_length=len(p0),
)

searcher = SNES(problem, popsize=1000, stdev_init=2.0)
_ = StdOutLogger(searcher, interval=50)

searcher.run(num_generations=5000)

# popt expected  = 5.2658e+00, 7.4640e-06, 8.2708e-01, 1.9907e+01, 3.4077e-03, 2.1928e+01

# Results
# There seems to be no change in the loss
#          iter : 50
#     mean_eval : 3.1693422474173666e+25
# pop_best_eval : 3.169341555664464e+25
#   median_eval : 3.169341555664464e+25
#     best_eval : 3.169341555664464e+25
#    worst_eval : 7.474293084492361e+26

#          iter : 100
#     mean_eval : 3.1693422474173666e+25
# pop_best_eval : 3.169341555664464e+25
#   median_eval : 3.169341555664464e+25
#     best_eval : 3.169341555664464e+25
#    worst_eval : 7.474293084492361e+26
engintoklu commented 2 years ago

Hello @richinex,

Thanks for trying out EvoTorch!

Regarding the optimization problem, we noticed that the scales of the decision variables differ greatly. Indeed, according to the lower and upper bounds (lb and ub in the code), while the scale for some variables can go up to 1e+15, for some others it is as low as around 1e-1. In such situations, we recommend re-formulating the problem in such a way that the variables are all within the same scale (e.g. [0.0 ... 1.0]).
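
For illustration, a minimal sketch of this kind of rescaling, assuming `lb` and `ub` have been converted to float64 torch tensors (the `decode` helper in the full example below follows the same idea):

import torch

def to_unit_scale(p: torch.Tensor, lb: torch.Tensor, ub: torch.Tensor) -> torch.Tensor:
    # Map a parameter vector from its original ranges into [0, 1]^n.
    return (p - lb) / (ub - lb)

def from_unit_scale(x: torch.Tensor, lb: torch.Tensor, ub: torch.Tensor) -> torch.Tensor:
    # Map a solution searched in [0, 1]^n back into the original parameter ranges.
    return lb + x * (ub - lb)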

While we think that SNES is a powerful algorithm, along with XNES and PGPE, it was proposed mainly with unbounded optimization in mind. Although you could still handle strict boundaries with these algorithms using tricks such as clipping and/or penalizing, we recommend SteadyStateGA, which has built-in support for strict boundaries.
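
For completeness, here is a rough sketch of the clipping-plus-penalty trick mentioned above, in case you prefer to stay with SNES. This is an illustration only: it assumes `lb` and `ub` are torch tensors, reuses your `obj_fun`, and uses a made-up penalty weight that would need tuning:

import torch

penalty_weight = 1e3  # made-up weight; would need tuning for this problem

def penalized_obj_fun(p: torch.Tensor) -> torch.Tensor:
    # Measure how far p falls outside the box [lb, ub].
    violation = torch.clamp(lb - p, min=0.0) + torch.clamp(p - ub, min=0.0)
    # Evaluate the original objective on the clipped parameters.
    p_clipped = torch.clamp(p, lb, ub)
    # Add a penalty proportional to the total amount of violation.
    return obj_fun(p_clipped, F, Z, sigma) + penalty_weight * violation.sum()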

We also noticed that the variables lb and ub and some of the problem constants are defined with dtype float64. You can also set the problem's dtype as torch.float64 so that the solutions and their fitnesses will have the same level of accuracy.

For getting a performance boost, you might want to provide the keyword argument num_actors=N while instantiating the problem object, where N can be an integer specifying the number of CPUs you would like to utilize in parallel. Alternatively, you could specify num_actors="max" for utilizing all CPUs of your computer.

Below is a modified copy of your code with our suggestions incorporated. Would you like to give this a try? I observed that this modified code brings the fitness down to around 0.0349; even with the popsize reduced from 1000 to 100, the fitness went down to around 0.035 within 50 generations.

Finally, in this modified code, you might want to tune the hyperparameters of the genetic algorithm operators: the tournament size, the eta value of the simulated binary cross-over, and the standard deviation of the mutation operator. Perhaps with some fine-tuning, the fitness might be reduced further.


import numpy as np
import torch
from evotorch import Problem
from evotorch.algorithms import SteadyStateGA
from evotorch.operators import SimulatedBinaryCrossOver, GaussianMutation
from evotorch.logging import StdOutLogger

# Load in data
F = torch.tensor([1.0000e-01, 1.3000e-01, 1.6900e-01, 2.2000e-01, 2.8600e-01, 3.7100e-01,
        4.8300e-01, 6.2700e-01, 8.1600e-01, 1.0600e+00, 1.3800e+00, 1.7900e+00,
        2.3300e+00, 3.0300e+00, 3.9400e+00, 5.1200e+00, 6.6500e+00, 8.6500e+00,
        1.1200e+01, 1.4600e+01, 1.9000e+01, 2.4700e+01, 3.2100e+01, 4.1800e+01,
        5.4300e+01, 7.0600e+01, 9.1700e+01, 1.1900e+02, 1.5500e+02, 2.0200e+02,
        2.6200e+02, 3.4100e+02, 4.4300e+02, 5.7600e+02, 7.4800e+02, 9.7300e+02,
        1.2600e+03, 1.6400e+03, 2.1400e+03, 2.7800e+03, 3.6100e+03, 4.7000e+03,
        6.1000e+03, 7.9400e+03, 1.0300e+04, 1.3400e+04, 1.7400e+04, 2.2700e+04,
        2.9500e+04, 3.8300e+04, 4.9800e+04, 6.4700e+04, 8.4200e+04, 1.0900e+05],
       dtype=torch.float64)

Z = torch.tensor([45.1000-1.0900j, 47.5000-1.4300j, 46.8000-1.7700j, 46.2000-2.2900j,
        46.2000-2.9700j, 47.2000-3.8000j, 47.0000-4.8500j, 45.1000-5.9900j,
        45.8000-7.3300j, 42.3000-9.0500j, 42.6000-10.2000j, 36.5000-10.8000j,
        34.5000-11.2000j, 32.1000-10.2000j, 30.0000-9.1800j, 29.4000-8.0000j,
        27.3000-6.6400j, 26.7000-5.1800j, 25.3000-4.1200j, 25.4000-3.2600j,
        25.2000-2.5100j, 24.9000-1.9400j, 24.9000-1.6400j, 25.4000-1.3500j,
        25.5000-1.2400j, 24.8000-1.1000j, 24.7000-1.0300j, 23.9000-1.0400j,
        25.2000-1.1000j, 24.9000-1.2700j, 25.0000-1.4600j, 25.4000-1.6500j,
        24.4000-1.9800j, 24.5000-2.3400j, 24.5000-2.9100j, 23.8000-3.4700j,
        22.9000-4.1300j, 22.3000-4.9100j, 20.9000-5.6600j, 20.3000-6.0300j,
        18.4000-6.9600j, 17.6000-7.2400j, 16.5000-7.7400j, 14.3000-7.4200j,
        12.7000-7.1700j, 11.2000-6.7600j,  9.8500-5.8900j,  8.6800-5.3800j,
         7.9200-4.5300j,  7.2000-3.8300j,  6.8100-3.2000j,  6.6500-2.6700j,
         6.1100-2.1600j,  5.8600-1.7700j], dtype=torch.complex128)

sigma = torch.tensor([45.1132, 47.5215, 46.8335, 46.2567, 46.2954, 47.3527, 47.2496, 45.4960,
        46.3829, 43.2573, 43.8041, 38.0643, 36.2724, 33.6816, 31.3731, 30.4690,
        28.0959, 27.1978, 25.6333, 25.6084, 25.3247, 24.9755, 24.9539, 25.4359,
        25.5301, 24.8244, 24.7215, 23.9226, 25.2240, 24.9324, 25.0426, 25.4535,
        24.4802, 24.6115, 24.6722, 24.0516, 23.2694, 22.8341, 21.6528, 21.1767,
        19.6724, 19.0310, 18.2252, 16.1104, 14.5842, 13.0820, 11.4767, 10.2121,
         9.1240,  8.1553,  7.5244,  7.1660,  6.4806,  6.1215],
       dtype=torch.float64) 

p0 = torch.tensor([5, 0.000103, 1, 20, 0.001, 20])

# Define bounds such that the parameter at index 2 is between 0.1 and 1, while parameters 1 and 4 are between 1e-9 and 1e-1
lb = np.full((len(p0),),1e-15)
lb[1] = 1e-9
lb[2] = 0.1
lb[4] = 1e-9
ub = np.full((len(p0),),1e15)
ub[1] = 1e-1
ub[2] = 1.01
ub[4] = 1e-1

lb = torch.as_tensor(lb, dtype=torch.float64)
ub = torch.as_tensor(ub, dtype=torch.float64)
gap = ub - lb

def decode(x: torch.Tensor) -> torch.Tensor:
    # This function rescales the solution x (which is initially between 0 and 1).
    # After the decoding, the new scale will be according to lb and ub.
    #x = torch.clamp(x, 0.0, 1.0)  # This clamping is needed when using SNES
    x = (x * gap) + lb
    return x

# Define model
def fun(p, x):
    w = 2*torch.pi*x
    s = 1j*w
    Rs = p[0]
    Qh = p[1]
    nh = p[2]
    Rct = p[3]
    C1 = p[4]
    R1 = p[5]
    Y1 = s*C1 + 1/R1
    Z1 = 1/Y1
    Zct = Rct + Z1
    Ydl = (s**nh)*Qh
    Yin = Ydl + 1/Zct
    Zin = 1/Yin
    Z = Rs + Zin
    return torch.concat((Z.real, Z.imag),dim = 0)

# Define loss function
def obj_fun(p, x, y, yerr):
    p = decode(p)

    ndata = len(x)
    dof = (2*ndata-(len(p)))
    y_concat = torch.concat([y.real, y.imag], dim = 0)
    sigma = torch.concat([yerr,yerr], dim = 0)
    y_model = fun(p, x)
    chi_sqr = (1/dof)*(torch.sum(torch.abs((1/sigma**2) * (y_concat - y_model)**2)))
    return (chi_sqr)

# Define problem.
problem = Problem(
    "min",
    lambda p0: obj_fun(p0, F, Z, sigma),
    bounds=[0.0, 1.0],  # NOTE: Not initial_bounds. These are strict boundaries.
    solution_length=len(p0),
    dtype=torch.float64,
    #num_actors=4,  # If you wish to use 4 CPUs in parallel
    #num_actors="max",  # If you wish to use all your CPUs
)

# Here we are instantiating a SteadyStateGA searcher.
# SteadyStateGA is a fully elitist genetic algorithm. Therefore, some solutions can stay
# and survive in the population for many generations. By setting `re_evaluate` as False,
# we tell our genetic algorithm that it is not necessary to re-evaluate the same
# solutions (because it seems that the fitness function in this case is deterministic).
# If the fitness function was stochastic, we would instead recommend to leave
# `re_evaluate` in its default state, which is True.
searcher = SteadyStateGA(problem, popsize=1000, re_evaluate=False)

# The following GA operators respect strict boundaries.
searcher.use(
    SimulatedBinaryCrossOver(
        problem,
        tournament_size=8,
        eta=20,
    )
)

searcher.use(
    GaussianMutation(
        problem,
        stdev=0.1,
    )
)

_ = StdOutLogger(searcher, interval=1)
searcher.run(num_generations=50)

best_solution = searcher.status["best"].values.clone()

print("Best solution (before decoding)", best_solution)
print("Best solution (after decoding)", decode(best_solution))
richinex commented 2 years ago

Hi Engintoklu,

Thank you for the detailed response. I tinkered around a little bit and finally used SNES with initial bounds and got the optimal parameters. However, I rescaled the parameters by converting from external to internal parameters and back, similar to the approach you used. Finally, I would like to ask whether there are any examples that show how to tune the hyperparameters.

Here is the final code:

import torch
from evotorch import Problem
from evotorch.algorithms import SNES
from evotorch.logging import StdOutLogger

# Load in data
F = torch.tensor([1.0000e-01, 1.3000e-01, 1.6900e-01, 2.2000e-01, 2.8600e-01, 3.7100e-01,
        4.8300e-01, 6.2700e-01, 8.1600e-01, 1.0600e+00, 1.3800e+00, 1.7900e+00,
        2.3300e+00, 3.0300e+00, 3.9400e+00, 5.1200e+00, 6.6500e+00, 8.6500e+00,
        1.1200e+01, 1.4600e+01, 1.9000e+01, 2.4700e+01, 3.2100e+01, 4.1800e+01,
        5.4300e+01, 7.0600e+01, 9.1700e+01, 1.1900e+02, 1.5500e+02, 2.0200e+02,
        2.6200e+02, 3.4100e+02, 4.4300e+02, 5.7600e+02, 7.4800e+02, 9.7300e+02,
        1.2600e+03, 1.6400e+03, 2.1400e+03, 2.7800e+03, 3.6100e+03, 4.7000e+03,
        6.1000e+03, 7.9400e+03, 1.0300e+04, 1.3400e+04, 1.7400e+04, 2.2700e+04,
        2.9500e+04, 3.8300e+04, 4.9800e+04, 6.4700e+04, 8.4200e+04, 1.0900e+05],
       dtype=torch.float64)

Z = torch.tensor([45.1000-1.0900j, 47.5000-1.4300j, 46.8000-1.7700j, 46.2000-2.2900j,
        46.2000-2.9700j, 47.2000-3.8000j, 47.0000-4.8500j, 45.1000-5.9900j,
        45.8000-7.3300j, 42.3000-9.0500j, 42.6000-10.2000j, 36.5000-10.8000j,
        34.5000-11.2000j, 32.1000-10.2000j, 30.0000-9.1800j, 29.4000-8.0000j,
        27.3000-6.6400j, 26.7000-5.1800j, 25.3000-4.1200j, 25.4000-3.2600j,
        25.2000-2.5100j, 24.9000-1.9400j, 24.9000-1.6400j, 25.4000-1.3500j,
        25.5000-1.2400j, 24.8000-1.1000j, 24.7000-1.0300j, 23.9000-1.0400j,
        25.2000-1.1000j, 24.9000-1.2700j, 25.0000-1.4600j, 25.4000-1.6500j,
        24.4000-1.9800j, 24.5000-2.3400j, 24.5000-2.9100j, 23.8000-3.4700j,
        22.9000-4.1300j, 22.3000-4.9100j, 20.9000-5.6600j, 20.3000-6.0300j,
        18.4000-6.9600j, 17.6000-7.2400j, 16.5000-7.7400j, 14.3000-7.4200j,
        12.7000-7.1700j, 11.2000-6.7600j,  9.8500-5.8900j,  8.6800-5.3800j,
         7.9200-4.5300j,  7.2000-3.8300j,  6.8100-3.2000j,  6.6500-2.6700j,
         6.1100-2.1600j,  5.8600-1.7700j], dtype=torch.complex128)

sigma = torch.tensor([45.1132, 47.5215, 46.8335, 46.2567, 46.2954, 47.3527, 47.2496, 45.4960,
        46.3829, 43.2573, 43.8041, 38.0643, 36.2724, 33.6816, 31.3731, 30.4690,
        28.0959, 27.1978, 25.6333, 25.6084, 25.3247, 24.9755, 24.9539, 25.4359,
        25.5301, 24.8244, 24.7215, 23.9226, 25.2240, 24.9324, 25.0426, 25.4535,
        24.4802, 24.6115, 24.6722, 24.0516, 23.2694, 22.8341, 21.6528, 21.1767,
        19.6724, 19.0310, 18.2252, 16.1104, 14.5842, 13.0820, 11.4767, 10.2121,
         9.1240,  8.1553,  7.5244,  7.1660,  6.4806,  6.1215],
       dtype=torch.float64) 

# Define the model
def fun(p, x):
    w = 2*torch.pi*x
    s = 1j*w
    Rs = p[0]
    Qh = p[1]
    nh = p[2]
    Rct = p[3]
    C1 = p[4]
    R1 = p[5]
    Y1 = s*C1 + 1/R1
    Z1 = 1/Y1
    Zct = Rct + Z1
    Ydl = (s**nh)*Qh
    Yin = Ydl + 1/Zct
    Zin = 1/Yin
    Z = Rs + Zin
    return torch.concat((Z.real, Z.imag),dim = 0)

# Define loss function
def obj_fun(p, x, y, yerr, lb, ub):
    ndata = len(x)
    dof = (2*ndata-(len(p)))
    p_norm =  (lb + 10**(p)) / (1 + 10**(p) / ub)
    y_concat = torch.concat([y.real, y.imag], dim = 0)
    sigma = torch.concat([yerr,yerr], dim = 0)
    y_model = fun(p_norm, x)
    chi_sqr = (1/dof)*(torch.sum(torch.abs((1/sigma**2) * (y_concat - y_model)**2)))
    return (chi_sqr)

def encode(x: torch.Tensor, lb, ub) -> torch.Tensor:
    # Convert to internal parameters
    x =  torch.log10((x - lb) / (1 - x / ub))
    return x

def decode(x: torch.Tensor, lb, ub) -> torch.Tensor:
    # Convert to external parameters
    x =  (lb + 10**(x)) / (1 + 10**(x) / ub)
    return x

p0 = torch.tensor([5, 0.000103, 0.75, 20, 0.001, 20])
# lb = np.zeros((2*order+1))
lb = torch.full((len(p0),), 1e-15)
lb[2] = 0.1
ub = torch.full((len(p0),), 1e15)
ub[2] = 1.01

p_log = encode(p0, lb, ub)

problem = Problem(
  "min",
  lambda p_log: obj_fun(p_log, F, Z, sigma, lb, ub),
  initial_bounds=[-1, 1],
  solution_length=len(p0),
  # vectorized=True,
  # device="cuda",  # Enable for GPU support
)

searcher = SNES(problem, popsize=100, stdev_init=0.1)
_ = StdOutLogger(searcher, interval=500)

searcher.run(num_generations=500)

#         iter : 500
#     mean_eval : 0.00020434820908121765
# pop_best_eval : 0.0002043374115601182
#   median_eval : 0.00020434764155652374
#     best_eval : 0.0002043374115601182
#    worst_eval : 0.8582084774971008
# popt_log = searcher.status['center']
# popt = decode(popt_log, lb, ub)

# Exactly what I wanted
# popt = 5.2866e+00, 7.2562e-06, 8.2976e-01, 1.9868e+01, 3.4013e-03, 2.1942e+01
flukeskywalker commented 2 years ago

Hi @richinex. Great that you got it to work! For tuning the hyperparameters of the algorithms, we suggest you look at the examples of dedicated tuning libraries such as ray.tune or Optuna. Now that you can evaluate any hyperparameter configuration by running the searcher, it should be possible to integrate with them easily.
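
For example, a rough, untested sketch of how such an integration with Optuna might look for the SteadyStateGA hyperparameters from the suggested code above. The search ranges here are arbitrary, and `problem` is assumed to be the Problem instance constructed in that code:

import optuna
from evotorch.algorithms import SteadyStateGA
from evotorch.operators import SimulatedBinaryCrossOver, GaussianMutation

def trial_objective(trial: optuna.Trial) -> float:
    # Sample a hyperparameter configuration (ranges chosen arbitrarily here).
    tournament_size = trial.suggest_int("tournament_size", 2, 16)
    eta = trial.suggest_float("eta", 5.0, 40.0)
    mutation_stdev = trial.suggest_float("mutation_stdev", 0.01, 0.5, log=True)

    # Build and run a fresh searcher with this configuration.
    searcher = SteadyStateGA(problem, popsize=100, re_evaluate=False)
    searcher.use(SimulatedBinaryCrossOver(problem, tournament_size=tournament_size, eta=eta))
    searcher.use(GaussianMutation(problem, stdev=mutation_stdev))
    searcher.run(num_generations=50)

    # Report the best fitness reached by this run.
    return float(searcher.status["best_eval"])

study = optuna.create_study(direction="minimize")
study.optimize(trial_objective, n_trials=25)
print(study.best_params)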

richinex commented 2 years ago

Ok cool. Thanks.