SimonBlanke / Gradient-Free-Optimizers

Simple and reliable optimization with local, global, population-based and sequential techniques in numerical discrete search spaces.
https://simonblanke.github.io/gradient-free-optimizers-documentation
MIT License

search_space cannot be larger than 32? #21

Closed. beckhamwjc closed this issue 2 years ago.

beckhamwjc commented 2 years ago

Describe the bug I'm using the ParticleSwarmOptimizer to find the parameters (not hyper-parameters) of a NN. I first defined a simple NN model with PyTorch and explicitly set every weight parameter of this NN as a dimension in the search_space of GFO. The search space has the form:

```python
search_space = {
    "00": np.arange(-100, 100).tolist(),
    "01": np.arange(-100, 100).tolist(),
    ...
    "48": np.arange(-100, 100).tolist(),
}
```

and then start the GFO search. The keys "00" to "48" are the weights of the NN.

Code to reproduce the behavior

Error message from command line The error turns up as: "ValueError: maximum supported dimension for an ndarray is 32, found 49"

System information:

Additional context Is there any way to optimize more than 32 variables at the same time?
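
The 32 in the message appears to be NumPy's own cap on how many axes a single ndarray may have, not a GFO setting. Below is a minimal sketch (independent of GFO) that reproduces the same ValueError; that GFO internally builds such a high-dimensional array is only my assumption:

```python
import numpy as np

# NumPy itself refuses to create an array with more than 32 axes; presumably
# GFO's initialization tries to build one axis per search-space dimension,
# so 49 dimensions trigger the same error as above.
try:
    np.zeros((1,) * 49)
except ValueError as err:
    print(err)  # maximum supported dimension for an ndarray is 32, found 49
```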

SimonBlanke commented 2 years ago

Hello @beckhamwjc,

this seems like a duplicate of SimonBlanke/Gradient-Free-Optimizers#19. The problem here is probably the same as in SimonBlanke/Gradient-Free-Optimizers#1. But I am not sure, because I cannot reproduce the error without a full example from you.

beckhamwjc commented 2 years ago

> Hello @beckhamwjc,
>
> this seems like a duplicate of #19. The problem here is probably the same as in #1. But I am not sure, because I cannot reproduce the error without a full example from you.

Yes, I think the problem I encountered is the same as #19. I put together a demonstration of the issue:

```python
import torch
import torch.nn as nn
import numpy as np
from hyperactive import Hyperactive
from hyperactive.optimizers import ParticleSwarmOptimizer

optimizer = ParticleSwarmOptimizer(
    inertia=0.3,
    cognitive_weight=0.5,
    social_weight=0.5,
    temp_weight=0.3,
    rand_rest_p=0.05,
    population=12,
)

def model(opt):
    demo = nn.Linear(5, 10, bias=False)
    demo_para = demo.state_dict()
    for i in range(8):
        for j in range(5):
            pos = "{}".format(i) + "{}".format(j)
            demo_para['Linear.weight'][i, j] = torch.Tensor([opt[pos]])

    demo.load_state_dict(demo_para)
    loss = torch.Tensor([1])

    return loss.detach().numpy()

# keys "00" to "74": one dimension per weight, 40 in total,
# all with the same value list
weight_values = np.divide(
    np.concatenate((np.arange(-100, 0), np.arange(1, 101))), 100
).tolist()
search_space = {
    "{}{}".format(i, j): weight_values for i in range(8) for j in range(5)
}

demo_hyper = Hyperactive()
demo_hyper.add_search(model, search_space, optimizer=optimizer, n_iter=1)
demo_hyper.run()
```

(GitHub's code-block formatting turned my code into a mess ...)

SimonBlanke commented 2 years ago

Hello @beckhamwjc,

From SimonBlanke/Gradient-Free-Optimizers#1 we know that initialization via vertices and grid in a high-dimensional search space causes this error. It can be avoided by initializing via random positions:

```python
optimizer = ParticleSwarmOptimizer(
    inertia=0.3,
    cognitive_weight=0.5,
    social_weight=0.5,
    temp_weight=0.3,
    rand_rest_p=0.05,
    population=12,
    initialize={"random": 12},
)
```

Also: make sure the number of initial positions is equal to or greater than the population size.
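
A small illustration of that constraint; the assumption here is just that the integer counts in the initialize dict add up to the number of starting positions (the key names mirror the documented initialize options):

```python
population = 12

# assumption: each integer count contributes that many starting positions.
# 12 random starting positions, one per particle; random initialization
# avoids the grid/vertex arrays that run into NumPy's dimension limit.
initialize = {"random": 12}

# this would also give 12 positions in total, but grid/vertices
# initialization is exactly what triggers the 32-dimension error here:
# initialize = {"grid": 4, "vertices": 4, "random": 4}

assert sum(initialize.values()) >= population
```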

beckhamwjc commented 2 years ago

> initialize={"random": 12},

Thanks. I added `initialize={"random": 12}` to my ParticleSwarmOptimizer, but got a `TypeError: gradient_free...ParticleSwarmOptimizer() got multiple values for keyword argument 'initialize'`

SimonBlanke commented 2 years ago

Hello @beckhamwjc,

I need a small code example that produces the TypeError to help you with this.

beckhamwjc commented 2 years ago

> Hello @beckhamwjc,
>
> I need a small code example that produces the TypeError to help you with this.

```python
import torch
import torch.nn as nn
import numpy as np
from hyperactive import Hyperactive
from hyperactive.optimizers import ParticleSwarmOptimizer

optimizer = ParticleSwarmOptimizer(
    inertia=0.3,
    cognitive_weight=0.5,
    social_weight=0.5,
    temp_weight=0.3,
    rand_rest_p=0.05,
    population=12,
    initialize={"random": 12},
)

def model(opt):
    demo = nn.Linear(5, 10, bias=False)
    demo_para = demo.state_dict()
    for i in range(8):
        for j in range(5):
            pos = "{}".format(i) + "{}".format(j)
            demo_para['Linear.weight'][i, j] = torch.Tensor([opt[pos]])

    demo.load_state_dict(demo_para)
    loss = torch.Tensor([1])

    return loss.detach().numpy()

# same 40-dimensional search space as above: keys "00" to "74",
# each with the same value list
weight_values = np.divide(
    np.concatenate((np.arange(-100, 0), np.arange(1, 101))), 100
).tolist()
search_space = {
    "{}{}".format(i, j): weight_values for i in range(8) for j in range(5)
}

demo_hyper = Hyperactive()
demo_hyper.add_search(model, search_space, optimizer=optimizer, n_iter=1)
demo_hyper.run()
```

will result in: `TypeError: gradient_free...ParticleSwarmOptimizer() got multiple values for keyword argument 'initialize'`

SimonBlanke commented 2 years ago

Hello @beckhamwjc,

thank you for the example code! Without indentation I can only guess what the loops in the objective function look like.

The code you provided is not from Gradient-Free-Optimizers, but from Hyperactive. In Hyperactive you pass the initialize parameter to add_search(...). The optimizer classes in Hyperactive only accept the optimizer-specific parameters listed in the optimization tutorial. Here is a (shorter) version of your code:

```python
import torch
import torch.nn as nn
import numpy as np
from hyperactive import Hyperactive
from hyperactive.optimizers import ParticleSwarmOptimizer

optimizer = ParticleSwarmOptimizer(
    inertia=0.3,
    cognitive_weight=0.5,
    social_weight=0.5,
    temp_weight=0.3,
    rand_rest_p=0.05,
    population=12,
)

def model(opt):
    demo = nn.Linear(5, 10, bias=False)
    demo_para = demo.state_dict()
    for i in range(8):
        for j in range(5):
            pos = "{}".format(i) + "{}".format(j)
            demo_para["Linear.weight"][i, j] = torch.Tensor([opt[pos]])

    demo.load_state_dict(demo_para)
    loss = torch.Tensor([1])

    return loss.detach().numpy()

search_space = {
    "00": np.divide(
        np.concatenate((np.arange(-100, 0), np.arange(1, 101))), 100
    ).tolist(),
    "01": np.divide(
        np.concatenate((np.arange(-100, 0), np.arange(1, 101))), 100
    ).tolist(),
    "02": np.divide(
        np.concatenate((np.arange(-100, 0), np.arange(1, 101))), 100
    ).tolist(),
}

demo_hyper = Hyperactive()
demo_hyper.add_search(
    model,
    search_space,
    optimizer=optimizer,
    n_iter=1,
    initialize={"random": 12},
)
demo_hyper.run()
```

After running that piece of code I noticed a bug in the particle class of the particle swarm optimizer. I fixed it in version 1.0.4.

Now I get the error:

```
demo_para["Linear.weight"][i, j] = torch.Tensor([opt[pos]])
KeyError: 'Linear.weight'
```

But this problem does not seem to originate in Gradient-Free-Optimizers or Hyperactive.
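
In case it helps: the state_dict of a bare nn.Linear uses the key "weight" (plus "bias" when bias=True), not "Linear.weight", so I would guess the objective function needs something along these lines (an untested sketch, not verified against your full setup):

```python
def model(opt):
    demo = nn.Linear(5, 10, bias=False)
    demo_para = demo.state_dict()  # contains only the key "weight" here
    for i in range(8):
        for j in range(5):
            pos = "{}{}".format(i, j)
            # untested sketch: write each sampled value into the weight matrix
            demo_para["weight"][i, j] = torch.tensor(opt[pos])

    demo.load_state_dict(demo_para)
    loss = torch.Tensor([1])
    return loss.detach().numpy()
```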

Please update GFO to v1.0.4 and try to fix the code. That way we can verify if there is more to fix on my end.