robertfeldt / BlackBoxOptim.jl

Black-box optimization for Julia

Support for additional SearchRange types #136

Open CrashBurnRepeat opened 5 years ago

CrashBurnRepeat commented 5 years ago

I've been looking for a Julia package for multi-objective optimization, and BlackBoxOptim.jl looks like a great option; I'm looking forward to trying out BorgMOEA in particular. However, the problems I have in mind for the package use a variety of design variable types. For instance, using an automotive analogy, I'd like to be able to have designs with a continuous mass, a discrete number of transmission gear ratios, and categorical choices of body materials.

The main additional types I can think of would be:

Integer-valued (discrete) variables, e.g. the number of transmission gear ratios
Categorical (unordered) variables, e.g. the choice of body material

Indeed, it's possible all these types could rely on the Set{T} type for dispatch, in addition to the currently used Tuple{Float,Float}. My rationale for Set instead of a regular array is its guarantee of unique values. Tuple is out because it would lead to ambiguous situations, and I did not foresee an appetite to change the existing API that dramatically. I'm not completely sure that knowing only that a SearchRange is a Set with element type T is sufficient to properly handle all cases, but it seems like a good start. It also has the advantage of not being overly complicated.

For clarity's sake, this could look like:

res = bboptimize(rosenbrock2d; SearchRange = Set([-5:5...]), NumDimensions = 2)

or

res = bboptimize(favoritecolors; SearchRange = Set(["red", "blue", "green"]), NumDimensions = 2)

Admittedly, I'm coming to this from a GA perspective, so I may well be missing complications in other algorithms. From what I understand of DE, constructing new designs seems to be complicated by adding non-continuous variables, but I will defer to someone with more experience in such algorithms.

From what I've seen of the encapsulating code as well, there may be complications with random design generation and the SearchSpace logic, particularly around mutation functions.

Perhaps as an initial effort, only the algorithms/mutation schemes that can "obviously" handle discrete variables should be supported, while the rest throw "not supported" errors? An example of "obvious" support would be uniform mutation, as rand can work directly on Set. An example of "non-obvious" could be either DE or polynomial mutation.
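For instance, uniform mutation over a Set-valued dimension is just a fresh uniform draw from the set. A rough sketch of what I mean (the mutate_uniform name is made up, none of this is existing BBO code):

mutate_uniform(range::Set) = rand(range)  # categorical/discrete: redraw uniformly from the set
mutate_uniform(range::Tuple{Float64,Float64}) = range[1] + rand() * (range[2] - range[1])  # continuous

mutate_uniform(Set(["red", "blue", "green"]))  # e.g. "blue"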

@robertfeldt, you mentioned that there might be some recent PRs that enable at least integers, but I haven't seen anything in the pull requests that seems to fit. Did I miss it? It still seems to be not supported in master.

I was considering putting together a branch to test some implementations of these types, but I wanted to start the discussion here to see if there are things I've missed, better ideas that you all may have, or other people interested in such an effort. All are welcome, so please let me know. :)

robertfeldt commented 5 years ago

I'm pretty sure @alyst added the option of having Ints as decision variables some time back, but I haven't tried it myself. Maybe he can comment.

I think a big challenge here will be that most/many optimization methods cannot easily work with non-continuous decision variables. Take DE as one example: it needs to do vector arithmetic on the decision vector, and it's not at all clear how this can or should work for non-reals. You mention DE; can you give examples of how it can be extended to handle discrete vars? I'm sure there is a way, but it's not clear the solutions will generalize.
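For reference, the classic DE/rand/1 mutation is pure vector arithmetic over reals, which is exactly where discrete variables become awkward (a minimal illustration, not BBO's internal implementation):

# given three distinct population members a, b, c and scale factor F
de_mutant(a::Vector{Float64}, b::Vector{Float64}, c::Vector{Float64}; F = 0.8) = a .+ F .* (b .- c)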

OTOH, I do agree it would be great to be able to optimize over more interesting and complex search spaces.

For more specific comments on your proposal: it is not clear to me how the Set-based representation would be used when there is a mix of different ranges/specifications for different variables. Just a unique set for each one?

Overall, I'm positive but would like to hear more "voices" on how we can/should do this. Would be great if you can spearhead this effort and I'll try to help.

robertfeldt commented 5 years ago

Here's a link to the "mapping" approach, i.e. still using Floats in the background while mapping them into the search space of choice:

https://github.com/robertfeldt/BlackBoxOptim.jl/issues/108

Problem is that mapping non-ordered, discrete types is non-trivial. There are many ways of doing it in the literature, but the risk is that one introduces an artificial order and thus makes it harder to take certain steps.
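To make the caveat concrete, the simplest mapping just bins a background float into one of the categories, which silently imposes an ordering on them (an illustrative sketch only, not part of the package):

# map a float in [lo, hi) to one of n categorical choices by binning;
# note that "red" < "green" < "blue" is an artificial order introduced by the mapping
function map_to_category(x::Float64, choices; lo = 0.0, hi = 1.0)
    n = length(choices)
    idx = clamp(floor(Int, (x - lo) / (hi - lo) * n) + 1, 1, n)
    return choices[idx]
end

map_to_category(0.7, ["red", "green", "blue"])  # -> "blue"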

alyst commented 5 years ago

There's some support for mixing continuous and discrete variables implemented in #111 (with a bit of a discussion of why it was implemented this way): when creating a RectSearchSpace you can specify the dimdigits parameter, which defines the "precision" in each dimension. I had one multi-objective problem with ~1000 weight-like dimensions, and used the discreteness to reduce the redundancy of the Pareto frontier solutions (i.e. w[i] = 0.100 and w[i] = 0.101 are essentially the same, and having both of them on the frontier just slows down the evolution). That's one example where discreteness is beneficial from the computational perspective; I believe in many other cases the support for extended types is just a matter of convenience.
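From memory, constructing such a space looks roughly like this (treat the exact signature and keyword name as an assumption; #111 is the authoritative reference):

# two dimensions in [0, 1], each rounded to 3 decimal digits
ss = RectSearchSpace([(0.0, 1.0), (0.0, 1.0)], dimdigits = 3)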

In general, it would be nice if one could use any struct MyType specific to a given problem and implement fitness or problem-specific mutation/embedding/crossover operators using that structure. However, implementing generic operators that can deal with arbitrary structs is quite a complex task (probably still doable with extensive use of macros). As a workaround I typically implement pack!(v::AbstractVector, x::MyType) and unpack!(x::MyType, v::AbstractVector) operations that do the conversion, so that e.g. fitness(v::AbstractVector) unpacks v into a temporary x::MyType and then calls fitness(x). It adds some overhead, but AFAIR it's never above 5%.
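As an illustration of that pattern (the struct and names are made up for this example, and it uses a simplified non-mutating unpack instead of unpack!):

struct CarDesign
    mass::Float64        # continuous
    ngears::Int          # discrete count
    material::String     # categorical, picked from a fixed list
end

const MATERIALS = ["steel", "aluminium", "carbon"]

# decode the plain float vector the optimizer works on into the problem-specific struct
function unpack(v::AbstractVector{<:Real})
    CarDesign(v[1],
              round(Int, v[2]),
              MATERIALS[clamp(round(Int, v[3]), 1, length(MATERIALS))])
end

# fitness receives the float vector but is written against the struct
function fitness(v::AbstractVector{<:Real})
    x = unpack(v)
    x.mass + 0.1 * x.ngears   # placeholder objective
end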

robertfeldt commented 5 years ago

Thanks for the clarification @alyst !

In your unpack, have you ever had to (and if so, how do you) map the floats to discrete, non-ordered types (like enums etc.)? One way I can see as possibly effective is to use multiple floats to embed the discrete points and then map to the closest one. This way one can avoid the obvious problems of introducing artificial closeness between discrete values that are not really close (or even ordered).
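Roughly what I have in mind (a sketch only; the anchor points and names are made up):

using LinearAlgebra

const CATEGORIES = ["red", "green", "blue"]
# one fixed anchor point per category in 2-D; the optimizer sees 2 floats per categorical variable
const ANCHORS = [randn(2) for _ in CATEGORIES]

# decode by picking the category whose anchor is closest to the 2 floats
decode_category(v::AbstractVector{<:Real}) = CATEGORIES[argmin([norm(v .- a) for a in ANCHORS])]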

alyst commented 5 years ago

In your unpack, have you ever had to (and if so, how do you) map the floats to discrete, non-ordered types (like enums etc.)?

Mmm, I think so far I have had no optimization problems with non-ordered types. What I had was a problem with a variable number of sub-objects, each sub-object encoded by a vector of fixed length, and no good ordering of the sub-objects possible. So for operations like crossover between a and b I had to implement some way to match the sub-objects of a and b (I used the Gale-Shapley algorithm), but sometimes its results were counter-productive for the optimization convergence.

jpsamaroo commented 5 years ago

I came to this repo to file an issue, and noticed that this one exactly fit what I was looking for :)

I'm also interested in providing for a more complicated search space, where some dimensions are not specified as floating-point ranges but are some mix of Sets, integer ranges, and others I can't recall right now. My interest stems from a desire to use BorgMOEA or another suitable algorithm to provide the MLJ ecosystem with a hyperparameter tuning algorithm. Some of those hyperparameters will be things like integer ranges or unordered sets (with elements being Symbols, custom structs, etc.).

I'd be happy to do the work to implement this (for all of BBO, or even if just for Borg), and would appreciate pointers on what high-level approach I should take to do so.

CrashBurnRepeat commented 5 years ago

@robertfeldt my thought is that, for an initial effort, only methods that can support discrete operations in a straightforward way would be supported. So perhaps just the GAs with certain mutation operators to start, with additional methods added as good solutions are found for them, assuming such solutions exist. Mapping might also be a good "first cut" option, perhaps with caveats in the documentation that numerical oddities could crop up.

For more specific comments on your proposal: it is not clear to me how the Set-based representation would be used when there is a mix of different ranges/specifications for different variables. Just a unique set for each one?

At the API level, my thought is to extend the current SearchRange syntax with sets. For instance, if x[1] is continuous, x[2] is integer-valued, and x[3] is categorical, the syntax could read

SearchRange = [(1,5), Set([1:5...]), Set(["red", "green", "blue"])]

So continuous variables continue to use the same syntax, with only discrete ones using sets. Does this address your question, or were you asking about other implementation details?

I also need to look through the RectSearchSpace approach @alyst mentioned to understand that better, as well as the rationale for its implementation.

As for DE approaches that can handle discrete values without the numerical artifacts that might arise from conversion to/from floats, I'm reading through a paper that seems to have an interesting solution, but I haven't read it thoroughly enough yet to really judge it.

The arbitrary struct operations are really interesting, I'll have to give this some thought as well :)

robertfeldt commented 5 years ago

Thanks for your interest both @CrashBurnRepeat and @jpsamaroo ! Let's discuss some options/designs first and then have an overall "strategy of attack"; I'll try to support as best I can.

I think one concrete, overall solution might be to focus first on making this work for Borg MOEA, as you say, since that is a very general algorithm for MO and should be able to adapt well if search-space-specific genetic operators are added. If/when we make this happen we can then discuss ways to extend it for BBO overall. The alternative would be to go "design upfront" heavy, with the risk of getting stuck on finding very general solutions that cover many optimization algorithms, some of which are not suited to this task.

More specific questions to you:

@jpsamaroo will you need to support varying-length decision vectors, i.e. where the number of decision variables is determined by some subset of the decision variables themselves? This type of situation is considerably more complex IMHO. At the least it would be good if we can assume a known max size of the decision vector, but maybe there are situations where this would be very inefficient. Do you have some sense of what typical use cases on your side would look like?

@CrashBurnRepeat why use a Set construct for integer ranges like 1:5? Why not simply use the UnitRange{Int64} datum as is? To me it seems natural to use Tuple{Float64,Float64} for float ranges, UnitRange{Int64} for int ranges, and, possibly, StepRangeLen to cover @alyst's "discretized" float ranges, and then only use Set for enums / non-ordered discrete sets. Or have I missed something?
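Concretely, the mixed specification could then look something like this (just a sketch of the proposed syntax, none of which exists yet):

SearchRange = [
    (0.0, 2000.0),                        # continuous: Tuple{Float64,Float64}
    3:8,                                  # integer range: UnitRange{Int64}
    0.0:0.01:1.0,                         # "discretized" float range: StepRangeLen
    Set(["steel", "aluminium", "carbon"]) # non-ordered categorical: Set{String}
]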

For now, it would be interesting to support @alyst's simpler struct-mapping approach, with some simple way to map this more general search space specification to a float decision vector that can be optimized by the current/normal BBO algorithms. This would also let us know how "bad" this simpler approach is and use it as a benchmark for more advanced implementations.

While we do this I'd be happy to co-author a short technical note of the different approaches and designs we try. We can put it on arxiv once things settle, for example.

robertfeldt commented 5 years ago

This paper also proposes a way to extend DE to discrete decision variable problems: https://www.tandfonline.com/doi/full/10.1080/0305215X.2013.836637

alyst commented 5 years ago

In principle, BorgMOEA and the other methods use the same API for fitness calculation (fitness(), Evaluator) and the same mutation/embedding/crossover operators. So extending search spaces and allowed datatypes would affect all the methods.

The mapping-to-float-vector approach seems like the easiest. We can add some utility functions to efficiently encode and decode sets/integers/unordered types etc. into and from the float vector, as well as efficient mutation/crossover operations for these datatypes.

Allowing anything other than AbstractVector{Float64} as a function argument would be a very nice feature, but I think it would require deep refactoring of the package. When individuals are vectors, the whole population can be stored efficiently as a matrix, and many evolutionary operators rely on this fact for efficient implementation. One potential direction might be compound search spaces with elements of type Tuple{D1, D2, D3, ...}, where Di is one of the supported datatypes (Vector{<:Real}, Set{Int} etc). Then, in comparison to arbitrary structs, it would be easy to compose operators that work with these spaces. It's similar to the idea of @CrashBurnRepeat; the Tuple{} approach just allows type inference and generation of more efficient code.
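To illustrate what I mean by composing operators (an illustrative sketch only; none of these functions exist in BBO):

# an individual as a typed tuple of components, e.g. continuous block + integer + categorical
const Individual = Tuple{Vector{Float64}, Int, String}

# per-component mutation, dispatched on the component type
mutate(x::Vector{Float64}) = x .+ 0.1 .* randn(length(x))
mutate(x::Int) = x + rand(-1:1)
mutate(x::String) = rand(["steel", "aluminium", "carbon"])

# compound mutation just maps over the tuple's components
mutate(ind::Tuple) = map(mutate, ind)

mutate(([1.2, 0.4], 5, "steel"))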

robertfeldt commented 5 years ago

Yes, I think the mapping-to-float-vector approach makes sense, both as a simple intermediate solution and as a baseline for more advanced solutions.

@alyst can you expand on how any new "efficient mutation/crossover operations" would work, since they would presumably require knowledge about the datatypes from which the float vector they operate on was mapped? That seems to go against the overall mapping-to-float-vector idea, or what am I missing? Do you mean we could put together new operations from generically defined ones that are specific to the original SearchSpace from which the vector was mapped? Hmm, ok, that might work out quite nicely.

alyst commented 5 years ago

The Tuple idea was just one potential way to natively (without float vector mapping) support extended data types, if we are really keen to follow this direction.

I agree that starting with mapping-to-vector seems easier. At the moment I don't know whether it is possible to implement efficient automatic packing/unpacking on the BBO side. I think we can start with packing/unpacking done explicitly in the user code (the user-provided fitness() gets a float vector and calls unpack!() etc.), with some utility routines (e.g. packing of a set into a subrange of a vector, or set-specific mutation and crossover operations) implemented in BBO.

robertfeldt commented 5 years ago

Ok, good, then we are on the same page. I'm thinking we can start with a (large-ish) example showing the mapping idea on a concrete problem and then we can extract/generalise from that into something we put in the package proper.

CrashBurnRepeat commented 5 years ago

I agree, mapping/pack-unpacking sounds like a good place to start and a good basis of comparison for more complicated approaches. @alyst was this something you intended to do yourself, and if so, is there a way I can contribute?

robertfeldt commented 4 years ago

I added an example related to the mapping idea discussed above. To solve a TSP problem with a continuous optimization algorithm we need to map from a float vector to a permutation of integers. I tried a few mappings from the literature but wasn't too impressed, so I added a more natural one, and it seems to solve at least one (42-city) TSP problem to optimality repeatedly. See:

https://github.com/robertfeldt/BlackBoxOptim.jl/blob/master/examples/tsp/bboptimize_tsp.jl

For larger problems the "random key" type of mapping (rank-order based) seems better.
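For reference, random-key decoding essentially just takes the rank order of the float vector (a minimal sketch, not the exact code in the linked example):

# the sort order of the floats defines the tour (a permutation of 1:n)
decode_tour(v::AbstractVector{<:Real}) = sortperm(v)

decode_tour([0.42, 0.07, 0.93, 0.55])  # -> [2, 1, 4, 3]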

ludoro commented 3 years ago

Hey, I really like the library, amazing work. I am also interested in having a SearchRange with integer variables. Has there been any news on that front? Thanks a lot!

robertfeldt commented 3 years ago

No, the mapping idea exemplified by the TSP optimization linked above is still the most concrete example. We are considering whether to add this to the BBO core or to generalize the core into a new lib instead. Currently, it is not clear-cut.

ludoro commented 3 years ago

I see. What about #111, though? I see that something discrete has been added, but there's no example around to show how it can be used.

robertfeldt commented 3 years ago

Yes, if you check further up this comment thread you can see @alyst commenting about #111, so there is some support, as he explains. Are you mainly looking for an example of how it can be used?

ludoro commented 3 years ago

Yes, I have understood the trick of the mapping idea, but I would also love to see how to use #111; I haven't found it in the README.md. I'd be glad to open a PR myself once I have understood it.

robertfeldt commented 3 years ago

Ok, great. Maybe @alyst can comment also? I have very little time before Christmas but plan to work on BBO during the break, so hopefully I can clean a few things up.