basnijholt opened this issue 6 years ago
Adaptive looks super cool! And yes, it fills one more step of automation that xyzpy still lacks -- actually choosing which set of combos to run... This occurred to me at one point, but I thought it was too involved, so I'm glad to see you are doing it! The dream of setting and forgetting a computer to intelligently harvest labelled data for you approaches ever closer.
A few questions (sorry if these are basic, I haven't had time to properly look through adaptive):
A final thought is that the completely dynamic nature of the coordinates might become inefficient for the 'gridded' nature of xarray for many dimensions. More suitable for a sparse/table representation maybe - or I guess the starting point for interpolation. Have you had ideas in this direction?
J
By the way, your field looks like it might be quite close to mine (see my other package quimb) thus the similar ideas maybe!
> How does the learning work? Does it need e.g. scalar/smooth/float output?
Depends on the learner's algorithm. Right now we mostly have sampling that prioritizes discontinuities in the data, but this could be controlled. I also imagine developing specialized algorithms; for example, we're now working on one for preferential band-structure sampling.
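To illustrate the idea (this is a hand-rolled sketch, not adaptive's actual implementation): a toy 1D sampler that always bisects the interval whose endpoint values differ the most, so evaluations pile up around jumps and steep regions:

```python
# Illustrative sketch (not adaptive's real code): a toy 1D sampler that
# keeps bisecting the interval whose endpoints differ the most, so points
# cluster around discontinuities in the data.

def sample_adaptively(f, a, b, n_points):
    xs = [a, b]
    ys = [f(a), f(b)]
    while len(xs) < n_points:
        # "loss" of each interval: |dy| (prioritizes jumps in the data)
        losses = [abs(ys[i + 1] - ys[i]) for i in range(len(xs) - 1)]
        i = max(range(len(losses)), key=losses.__getitem__)
        x_new = (xs[i] + xs[i + 1]) / 2
        xs.insert(i + 1, x_new)
        ys.insert(i + 1, f(x_new))
    return xs, ys

# A step function: points should concentrate near the jump at x = 0.
step = lambda x: 0.0 if x < 0 else 1.0
xs, ys = sample_adaptively(step, -1.0, 1.0, 20)
near_jump = sum(1 for x in xs if abs(x) < 0.25)
print(near_jump)  # -> 16, i.e. most points end up near the discontinuity
```

A uniform grid with the same budget would place only a handful of its 20 points that close to the jump.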
> Can it be batched (i.e. submit sets of points at once)?
Yes, if the learner supports it, and currently all the algos we implemented do that.
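A sketch of what batching could look like (illustrative only, not the real API): instead of one midpoint, pick the n highest-loss intervals at once, so all n evaluations can be dispatched to workers in parallel before any result comes back:

```python
# Sketch of batched point selection (illustrative, not adaptive's API):
# choose the n worst intervals in one go.

def ask_batch(xs, ys, n):
    """Return n new x-values, the midpoints of the n highest-loss intervals."""
    losses = [abs(ys[i + 1] - ys[i]) for i in range(len(xs) - 1)]
    worst = sorted(range(len(losses)), key=losses.__getitem__, reverse=True)[:n]
    return [(xs[i] + xs[i + 1]) / 2 for i in worst]

xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [0.0, 0.0, 1.0, 1.0, 2.0]
print(ask_batch(xs, ys, 2))  # -> [-0.25, 0.75], midpoints of the two steepest intervals
```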
> And is there a way to generalize to the n-D case?
Yes, definitely, although in higher dimensions (>3) local sampling will become bad because of the curse of dimensionality. There we'd need to think of alternative approaches.
> A final thought is that the completely dynamic nature of the coordinates might become inefficient for the 'gridded' nature of xarray for many dimensions. More suitable for a sparse/table representation maybe - or I guess the starting point for interpolation. Have you had ideas in this direction?
I cannot think of anything better than storing the interpolation object in that case.
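One way this could look (a sketch assuming NumPy and SciPy; the function and names here are made up): keep the scattered samples, carry the interpolation object around, and only evaluate it on a regular grid when a gridded view is actually needed:

```python
# Sketch: store an interpolation object for scattered 2D samples and
# grid it on demand (assumes SciPy is available; the data is made up).
import numpy as np
from scipy.interpolate import LinearNDInterpolator

rng = np.random.default_rng(0)
points = rng.uniform(-1, 1, size=(200, 2))       # scattered sample locations
values = points[:, 0] ** 2 + points[:, 1] ** 2   # f(x, y) = x^2 + y^2

interp = LinearNDInterpolator(points, values)    # this is what gets stored

# Evaluate on a regular grid only when a 'gridded' xarray-style view is needed.
x = np.linspace(-0.5, 0.5, 11)
y = np.linspace(-0.5, 0.5, 11)
X, Y = np.meshgrid(x, y)
Z = interp(X, Y)
print(Z.shape)  # (11, 11)
```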
Nice, thanks for those answers. Don't know if I will get round to this any time soon, but from my perspective a syntax like this might be cool:
```python
combos = {
    'A': [1, 2, 3],
    'B': ['foo', 'bar'],
    't': Adaptive(bounds=(-1, 2), loss=0.05, ...),
}
h.harvest_combos(combos)
```
or for the 2D case:
```python
combos = {
    'A': [1, 2, 3],
    'B': ['foo', 'bar'],
    ('t', 'x'): Adaptive(bounds=[(-1, 1), (-1, 1)], loss=0.05, ...),
}
h.harvest_combos(combos)
```
Though each set of adaptive results would have to be aligned/interpolated to go into the full dataset.
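That alignment step might look something like this (a sketch assuming xarray and SciPy; the data is made up): interpolate each run onto a shared coordinate before concatenating into the full dataset:

```python
# Sketch: two runs of the same quantity sampled at *different* t-values;
# put them on one shared t-axis by interpolation, then concatenate.
# Assumes xarray and SciPy are installed; the data is invented.
import numpy as np
import xarray as xr

t1 = np.array([0.0, 0.3, 1.0])
t2 = np.array([0.0, 0.6, 1.0])
run1 = xr.DataArray(t1 ** 2, coords={"t": t1}, dims="t")
run2 = xr.DataArray(2 * t2, coords={"t": t2}, dims="t")

t_common = np.linspace(0, 1, 5)
aligned = xr.concat(
    [run1.interp(t=t_common), run2.interp(t=t_common)],
    dim="A",  # new dimension distinguishing the two runs
)
print(aligned.shape)  # (2, 5)
```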
In the other direction, xyzpy has support (not so robustly tested) for 'case running', i.e.
```python
h.harvest_cases([{'A': 1, 'B': 'foo', 't': 0.243}, {'A': 2, 'B': 'bar', 't': 0.675}, ...])
# or if you have set xyz.Runner(..., fn_args=('A', 'B', 't'), ...)
h.harvest_cases([(1, 'foo', 0.243), (2, 'bar', 0.675), ...])
```
which would be the natural way for adaptive to call it currently, maybe with a hook to get the result back without extracting it from the dataset.
One more random snippet! This is one way to turn a 2D learner's data into an xarray.Dataset:
```python
import xyzpy as xyz
from xyzpy.gen.case_runner import _cases_to_ds

fn_args = ['x', 'y']
out_name = 'out'

_cases_to_ds(
    results=tuple(learner.data.values()),
    fn_args=fn_args,
    cases=tuple(learner.data.keys()),
    var_names=(out_name,),
    var_coords={},
    var_dims={out_name: []},
)
```
which for the first 2D example in the adaptive notebook produces:
```
<xarray.Dataset>
Dimensions:  (x: 886, y: 879)
Coordinates:
  * x        (x) float64 -1.0 -0.9698 -0.9643 -0.9159 -0.9122 -0.9095 ...
  * y        (y) float64 -1.0 -0.9506 -0.9193 -0.9175 -0.9168 -0.915 -0.9134 ...
Data variables:
    out      (x, y) float64 -1.0 nan nan nan nan nan nan nan nan nan nan nan ...
```
It's pretty inefficient though, due to the non-gridded problem.
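For comparison, here is a sketch of the sparse/table representation mentioned earlier (variable names are my own): a flat 'point' dimension with x and y as ordinary coordinates along it, which avoids padding the grid with NaNs:

```python
# Sketch of the sparse/table alternative: instead of an (x, y) grid that is
# mostly NaN, index the scattered samples by a single 'point' dimension.
# Assumes xarray is installed; the data dict is a stand-in for learner.data.
import numpy as np
import xarray as xr

data = {(0.0, 0.0): 1.0, (0.5, -0.2): 0.3, (-0.1, 0.9): 0.7}  # {(x, y): out}
xy = np.array(list(data.keys()))
ds = xr.Dataset(
    {"out": ("point", np.array(list(data.values())))},
    coords={"x": ("point", xy[:, 0]), "y": ("point", xy[:, 1])},
)
print(dict(ds.sizes))  # {'point': 3} -- no NaN padding
```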
That would be pretty cool indeed :)
Thanks for the suggestions on how to save the data, I've been thinking about a good way of saving (and restoring) the learners a bit lately.
So far I've just been experimenting with pickling the data, which seems to work just fine, but I would prefer a more general data format. I'll experiment with xarray a bit more (although I am pretty busy myself as well).
I am impressed with this package!
In my field we very often do these loops over multiple dimensions and generate many curves for different dimensions.
We (my colleagues and I) tried to tackle a very similar problem to the one xyzpy is trying to solve. We wrote adaptive, which does things similar to xyzpy; the biggest difference is that it can adaptively sample one (or two) of the dimensions.

As an example, I adapted your Basic Output Example to do the same but with adaptive. It creates "learners", which are essentially objects from which you can request new points and to which you can feed back the results.
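This ask/tell pattern can be sketched in plain Python (a toy stand-in, not adaptive's real Learner/Runner classes), including a minimal "runner" loop driving the learner through an executor:

```python
# Toy sketch of the ask/tell protocol (not adaptive's real classes): a
# learner hands out points, an executor evaluates them, results are fed back.
from concurrent.futures import ThreadPoolExecutor

class MidpointLearner:
    """Toy learner: always bisects the widest remaining interval."""
    def __init__(self, bounds):
        self.data = {b: None for b in bounds}  # {x: f(x) or None if pending}

    def ask(self, n):
        """Request n new points to evaluate."""
        pts = []
        for _ in range(n):
            xs = sorted(self.data)
            _, i = max((xs[j + 1] - xs[j], j) for j in range(len(xs) - 1))
            x = (xs[i] + xs[i + 1]) / 2
            self.data[x] = None  # mark as pending
            pts.append(x)
        return pts

    def tell(self, x, y):
        """Feed a result back to the learner."""
        self.data[x] = y

def f(x):
    return x * x

learner = MidpointLearner(bounds=(0.0, 1.0))
learner.tell(0.0, f(0.0))
learner.tell(1.0, f(1.0))

# Minimal "runner" loop: batches of work dispatched to an executor.
with ThreadPoolExecutor(max_workers=4) as ex:
    while len(learner.data) < 17:
        points = learner.ask(4)
        for x, y in zip(points, ex.map(f, points)):
            learner.tell(x, y)

print(len(learner.data))  # -> 18
```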
Then you "learn" the function by creating a Runner (this doesn't block the kernel and runs on all the cores; optionally you can provide an executor to run it on a cluster). Then you plot the data with:
As you can see, it is not nearly as short as your code, and neither do we provide functionality to save the data. Also, the interface we have is not really optimized to easily generate the combos, but this is where we can learn from xyzpy. On the other hand, I think there is probably something useful for you in adaptive too.

(P.S. this is not really an "issue", but more of a place to hopefully exchange some ideas)
EDIT: Inspired by your work, I've created this PR, after which one can just do: