campa-consortium / generator_standard

Standardization for generators used in optimas, Xopt, libEnsemble...

Multi-dimensional data #14

Open shuds13 opened 3 months ago

shuds13 commented 3 months ago

Should there be a convention for multi-dimensional data?

If our data structure is a list of dictionaries, each dictionary entry could be multi-dimensional.

If libEnsemble is to have a wrapper that converts numpy structured arrays to lists of dictionaries, what should a 2D x value be converted to? It could stay a numpy array, or it could become a native Python type such as a list or tuple.
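To make the two choices concrete, here is a minimal sketch of such a conversion. The helper name `structured_to_dicts` and its `as_lists` flag are hypothetical, not part of libEnsemble or of any agreed standard; the field names `x` and `f` are only for illustration.

```python
import numpy as np


def structured_to_dicts(arr, as_lists=True):
    """Convert a 1-D numpy structured array to a list of dicts.

    Hypothetical helper: as_lists=True turns multi-dimensional fields into
    native Python lists, as_lists=False keeps them as numpy arrays.
    """
    points = []
    for row in arr:
        point = {}
        for name in arr.dtype.names:
            value = row[name]
            if isinstance(value, np.ndarray):
                # Multi-dimensional field: native list or an independent array copy.
                point[name] = value.tolist() if as_lists else value.copy()
            else:
                # Scalar field: .item() returns a native Python scalar.
                point[name] = value.item()
        points.append(point)
    return points


# Two points, each with a 2-component "x" and a scalar "f".
H = np.zeros(2, dtype=[("x", float, (2,)), ("f", float)])
H["x"] = [[0.1, 0.2], [0.3, 0.4]]
H["f"] = [1.0, 2.0]

points = structured_to_dicts(H)
array_points = structured_to_dicts(H, as_lists=False)
```

With `as_lists=True` the result contains only native Python types, so it can be serialized without numpy; with `as_lists=False` the generator receives arrays it can use directly.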

shuds13 commented 3 months ago

Optimas seems to avoid this by making each dimension a separate entry.
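As a rough illustration of the "one entry per dimension" idea (this is not Optimas code, and `flatten_point` is a hypothetical helper), a multi-dimensional entry could be split into indexed scalar entries like this:

```python
def flatten_point(point):
    """Split list/tuple entries of a point into one scalar entry per component.

    Hypothetical helper: {"x": [a, b]} becomes {"x0": a, "x1": b}.
    """
    flat = {}
    for name, value in point.items():
        if isinstance(value, (list, tuple)):
            for i, component in enumerate(value):
                flat[f"{name}{i}"] = component
        else:
            # Scalar entries are passed through unchanged.
            flat[name] = value
    return flat


flat = flatten_point({"x": [0.1, 0.2], "f": 1.0})
```

The flat form needs no convention for array types, at the cost of many keys when variables have several components.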

jlnav commented 3 days ago

Basically as discussed, it would be great if multi-dimensional variables were supported.

We have a gen that has the following potential variables:


n_x = 5  # No. of x values
nparams = 4  # No. of theta params
ndims = 3  # No. of x coordinates.

variables = {}
for i in range(ndims):
    variables["x" + str(i)] = [0, 1]
for i in range(nparams):
    variables["theta" + str(i)] = [0, 1]
for i in range(n_x):
    variables["obs" + str(i)] = [0, 1]
    variables["obsvar" + str(i)] = [0, 1]

Producing:

ipdb> variables
{'x0': [0, 1], 'x1': [0, 1], 'x2': [0, 1], 'theta0': [0, 1], 'theta1': [0, 1], 'theta2': [0, 1], 'theta3': [0, 1], 'obs0': [0, 1], 'obsvar0': [0, 1], 'obs1': [0, 1], 'obsvar1': [0, 1], 'obs2': [0, 1], 'obsvar2': [0, 1], 'obs3': [0, 1], 'obsvar3': [0, 1], 'obs4': [0, 1], 'obsvar4': [0, 1]}

In my opinion there are potentially several more succinct ways of specifying these variables:


# 1:

variables = {
    "x": (3, [0, 1]),
    "theta": (4, [0, 1]),
    "obs": (5, [0, 1]),
    "obsvar": (5, [0, 1]),
}

# or #2:

variables = [
    {"name": "x", "size": 3, "bounds": [0, 1]},
    {"name": "theta", "size": 4, "bounds": [0, 1]},
    {"name": "obs", "size": 5, "bounds": [0, 1]},
    {"name": "obsvar", "size": 5, "bounds": [0, 1]},
]

# or #3:

variables = {
    "names": ["x", "theta", "obs", "obsvar"],
    "sizes": [3, 4, 5, 5],
    "bounds": [[0, 1], [0, 1], [0, 1], [0, 1]]
}

I like #2 the most. It is a list of dicts, like a list of points, and you can iterate over the entries directly instead of iterating over a dict's keys.
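For what it's worth, a list-of-dicts spec like #2 can be expanded back into the flat per-component form with a few lines. This is only a sketch: `expand_variables` is a hypothetical helper, and the sizes follow the loop-based example above (`ndims = 3`, `nparams = 4`, `n_x = 5`).

```python
def expand_variables(specs):
    """Expand size-aware variable specs into one flat entry per component.

    Hypothetical helper: each spec is a dict with "name", "size" and "bounds".
    """
    flat = {}
    for spec in specs:
        for i in range(spec["size"]):
            # Copy the bounds so the flat entries do not share one list object.
            flat[f"{spec['name']}{i}"] = list(spec["bounds"])
    return flat


variables = [
    {"name": "x", "size": 3, "bounds": [0, 1]},
    {"name": "theta", "size": 4, "bounds": [0, 1]},
    {"name": "obs", "size": 5, "bounds": [0, 1]},
    {"name": "obsvar", "size": 5, "bounds": [0, 1]},
]

flat_variables = expand_variables(variables)
```

The result has the same 17 keys as the loop-built dictionary, although the `obs` and `obsvar` keys come out grouped rather than interleaved.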