Active Learning: search space becomes too large

Rykath commented 2 years ago

Configuring a moderately large number (5-10) of input variables for Active Learning will fail as the search space no longer fits into memory. Required by: MEPHIT (#174)

SimpleAL uses a meshgrid over all AL-inputs as its search space. The required memory scales as nsearch^ninputs, where nsearch is the number of points per dimension.
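To make the scaling concrete, here is a rough back-of-the-envelope sketch. It is not taken from the proFit code: the helper `grid_memory_bytes` is hypothetical, and it assumes the grid is stored as one float64 coordinate per input and per point.

```python
def grid_memory_bytes(nsearch: int, ninputs: int, itemsize: int = 8) -> int:
    """Bytes needed to hold every point of a full grid (assumed float64 storage)."""
    npoints = nsearch ** ninputs          # one point per combination of grid values
    return npoints * ninputs * itemsize   # ninputs coordinates per point


# nsearch=50 is the per-dimension default mentioned in this thread
for ninputs in (2, 3, 5, 8):
    gib = grid_memory_bytes(50, ninputs) / 2**30
    print(f"ninputs={ninputs}: {50**ninputs:.3e} points, {gib:.3e} GiB")
```

Under these assumptions the point count grows by a factor of nsearch with every additional input, which is why a handful of extra variables is enough to exhaust memory.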

Workaround:

Apply a dimensionality reduction to the input variables so that fewer of them enter the search space.

Possible Solutions

Acquisition functions use a loss or utility function and select the maximum/minimum based on the surrogate predictions for all points within the search space.
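The selection step described above can be sketched as follows. This is only an illustration, not the proFit implementation: `select_next_point` and `toy_variance` are hypothetical names, and the distance to the nearest training point stands in for a real surrogate's predictive variance.

```python
import numpy as np


def select_next_point(Xpred, utility):
    """Evaluate the utility on every candidate and return the maximizer."""
    scores = np.asarray(utility(Xpred))
    return Xpred[np.argmax(scores)]


def toy_variance(Xpred, Xtrain):
    """Stand-in for surrogate uncertainty: distance to the nearest training point."""
    dist = np.linalg.norm(Xpred[:, None, :] - Xtrain[None, :, :], axis=-1)
    return dist.min(axis=1)


Xtrain = np.array([[0.0], [0.2], [1.0]])           # points already evaluated
Xpred = np.linspace(0.0, 1.0, 11).reshape(-1, 1)   # candidate search space
x_next = select_next_point(Xpred, lambda X: toy_variance(X, Xtrain))
print(x_next)
```

The cost of this step is linear in the number of rows of Xpred, so the size of the search space, not the acquisition function itself, is what limits the number of inputs.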

- different algorithm to search for the utility maximum (conjugate gradient, simulated annealing, etc.)
  - implemented as a new component
  - implemented as an alternative to SimpleAL
- choose a large, but fixed number of points for the search space (e.g. space-filling with Halton)
  - easiest to implement as only Xpred has to be modified
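A minimal sketch of the second option, assuming SciPy's `scipy.stats.qmc` module is available. The function name `halton_search_space` and the bounds are made up for illustration; how Xpred is actually built inside SimpleAL is not shown here.

```python
import numpy as np
from scipy.stats import qmc


def halton_search_space(l_bounds, u_bounds, nsearch):
    """Return nsearch space-filling candidates inside the given box."""
    # scramble=False keeps the sequence deterministic; scrambling is an option
    sampler = qmc.Halton(d=len(l_bounds), scramble=False)
    unit = sampler.random(n=nsearch)            # points in the unit hypercube
    return qmc.scale(unit, l_bounds, u_bounds)  # rescale to the input ranges


# hypothetical bounds for three AL-inputs
lower = [-1.0, 0.0, 10.0]
upper = [1.0, 5.0, 20.0]
Xpred = halton_search_space(lower, upper, nsearch=1000)
print(Xpred.shape)
```

With this approach Xpred has nsearch rows regardless of the number of inputs, so memory grows linearly with ninputs instead of exponentially. The trade-off is that a fixed number of points covers a high-dimensional space more sparsely than a grid would.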

At this point the question also arises whether the structure of Active Learning / acquisition functions should be refactored to simplify the API. Which changes are necessary to solve this issue?

Rykath commented 2 years ago

While looking into this issue, I noticed that Active Learning ignores the distinction between regular Input variables and ActiveLearning variables during the learn step.

Rykath commented 1 year ago

Workaround (in progress): specifying searchtype: halton fixes the total number of points in the search space to nsearch, independent of the number of inputs:

active_learning:
    nwarmup: 5
    nsearch: 1000
    algorithm:
        class: simple
        searchtype: halton
        acquisition_function:
            class: simple_exploration

In comparison, the default searchtype: grid uses nsearch: 50 points per dimension:

active_learning:
    nwarmup: 5
    nsearch: 50
    algorithm:
        class: simple
        searchtype: grid
        acquisition_function:
            class: simple_exploration

redmod-team / profit

Active Learning: search space becomes too large #175
