Closed jbweston closed 5 years ago
originally posted by Anton Akhmerov (@anton-akhmerov) at 2017-10-27T13:30:56.002Z on GitLab
@pablopiskunow thanks for reporting the issue.
I think we have to be more specific here with what we consider failure of adaptive. The idea of adaptive is to prioritize sampling "interesting" regions over the boring ones. In your example this is exactly what happens. Further, eventually adaptive would also discover the other missing peak because it would sample the already existing peaks until they become "boring" and start to look elsewhere.
If the whole function is composed of randomly positioned sharp peaks of the same width, the optimal strategy for discovering all these peaks will always begin from sampling on a homogeneous grid that is sparse enough.
Any deviation from this strategy will discover the peaks slower.
From this point of view, `loss` is merely a statement of what the user prioritizes. Therefore I don't think your example is a failure; it is an illustration that sometimes a different loss function may be better than the default one.
Eventually we aim to make it easy to specify varied sampling strategies by tweaking a learner's loss function.
I am unsure whether implementing `min_area` and `max_area` is a sufficiently general way of implementing such a modification, and I see some problems with it. For example, it will assign the same loss to triangles with 1.01 × max_area and 100 × max_area.
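The ranking problem with a hard area cutoff can be illustrated with a minimal sketch; `capped_loss`, its signature, and the constants are hypothetical and not part of adaptive's API:

```python
import numpy as np

# Hypothetical loss with a hard area cutoff (not adaptive's actual API):
# every triangle larger than max_area receives the same infinite loss.
def capped_loss(area, dev, max_area=0.01):
    if area > max_area:
        return np.inf
    return dev * np.sqrt(area)  # default-style deviation-weighted loss

# A triangle of 1.01 * max_area and one of 100 * max_area are
# indistinguishable, so the learner cannot prioritize between them:
print(capped_loss(1.01 * 0.01, dev=0.5))  # inf
print(capped_loss(100 * 0.01, dev=0.5))   # inf
```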
I believe that in order to understand what kind of tweaking of loss we may need we will have to:
originally posted by Pablo Piskunow (@pablopiskunow) at 2017-10-27T14:04:16.023Z on GitLab
I think that for any fixed definition of `loss` there will always be some kind of function that fools it, since the space of functions is immense. Therefore, we could think about the "outcome" instead of the "input": if plotting is the main purpose, we can analyse the plots that adaptive will produce.
From the plotting point of view, it happens that we oversample some regions of the plot, and leave other regions unexplored.
If we use `loss` as a `goal`, and therefore wait until all the necessary points are calculated, there might still be some regions left unexplored.
On the other hand, if we set a `max_resolution`, we will not waste function evaluations in an already dense area (even if it is still not fully resolved, and therefore "interesting" for the learner). Having more points there yields no profit from the plotting point of view, although it might be profitable if we want to integrate the function around a pole/divergence.
Along with this, if we set a `min_resolution` that fixes a `max_area`, there will at first be a number of triangles with infinite loss. This stops being the case as we let the `runner` run: all the triangles with areas above `max_area` will first be broken down into smaller triangles, thereby ensuring a minimum sampling resolution of the plot and guaranteeing that all features wider than this minimum area are discovered.
originally posted by Pablo Piskunow (@pablopiskunow) at 2017-10-27T14:06:57.991Z on GitLab
edit: mixed min and max areas above
originally posted by Pablo Piskunow (@pablopiskunow) at 2017-10-27T15:37:58.584Z on GitLab
Thinking more about it, it is redundant to define a `goal=lambda l: l.loss() < tol` and a `max_resolution` at the same time, since the minimum loss[1] will be given by `np.sqrt(min_area)`[2]. Therefore setting `tol = (2 / max_resolution)**0.5` yields the same result as not specifying the `max_resolution` to the learner (tested).
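The claimed equivalence can be checked numerically. A small sketch, assuming `min_area = 2 / max_resolution` (which is the relation that makes the two expressions coincide; all names here are assumptions):

```python
import numpy as np

# Assumption for this sketch: min_area = 2 / max_resolution, which is what
# makes the tolerance and the minimum-loss expression coincide.
max_resolution = 50**2
min_area = 2 / max_resolution

tol = (2 / max_resolution) ** 0.5   # the tolerance proposed above
print(np.isclose(tol, np.sqrt(min_area)))  # True
```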
So the only relevant parameter would be `min_resolution`, which shall set a `max_area`. Adding this parameter provides a reasonable way of specifying the behaviour of the learner in its two stages: "exploring" and then "refining".
The "exploring" stage first ensures that all triangles are smaller than some defined area; then, once all the areas are below that threshold, the "refining" stage starts, using the `dev` value inside the loss function.
Adding this feature lets us guarantee something to the user about the results: all features bigger than a threshold will be resolved, and the degree of resolution is given by the `goal`.
[1] edit: the minimum loss for the worst triangle, "worst" meaning with `dev = 1`, where `dev` are the normalized deviations of the function from its interpolated values.
[2] edit2: for some reason, the areas are square-rooted in the current definition of loss. This choice doesn't seem to be justified.
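The two-stage "exploring then refining" behaviour described above can be sketched as follows; `resolution_loss`, its signature, and the constants are hypothetical illustrations, not adaptive's API:

```python
import numpy as np

min_resolution = 10**2           # hypothetical target plot resolution
max_area = 1 / min_resolution    # triangles above this must always be split

def resolution_loss(area, dev):
    # "Exploring" stage: any triangle above max_area gets infinite loss,
    # so all oversized triangles are split before any refinement happens.
    if area > max_area:
        return np.inf
    # "Refining" stage: the usual deviation-weighted loss takes over.
    return dev * np.sqrt(area)

print(resolution_loss(2 * max_area, dev=0.0))    # inf -> explored first
print(resolution_loss(0.5 * max_area, dev=1.0))  # finite -> refined by dev
```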
originally posted by Bas Nijholt (@basnijholt) at 2018-01-02T14:54:07.015Z on GitLab
@pablopiskunow do you think this issue is solved by what I write in https://gitlab.kwant-project.org/qt/adaptive/merge_requests/11#note_10696?
If so, could you close this issue?
originally posted by Pablo Piskunow (@pablopiskunow) at 2018-01-03T02:38:43.023Z on GitLab
In my opinion, it would be quite useful if adaptive could ensure (by default) a minimum size of features it can discover. The user will otherwise not be aware of the possibly missing features.
That said, with the fix an aware user is able to pass a custom loss function, so this issue can be closed.
originally posted by Bas Nijholt (@basnijholt) at 2018-02-13T14:04:26.670Z on GitLab
The following few lines of code do essentially the same without complicating the interface.
```python
import numpy as np
import adaptive
from collections import OrderedDict

def add_minimum_resolution(learner, minimum_resolution):
    # Seed the learner's internal stack with a homogeneous grid of points
    # in the learner's scaled coordinates.
    xs = np.linspace(-0.5, 0.5, minimum_resolution)
    learner._stack = OrderedDict({(x, y): np.inf for x in xs for y in xs})

# `ring` is the test function from the original issue.
learner = adaptive.Learner2D(ring, bounds=[(-1, 1), (-1, 1)])
add_minimum_resolution(learner, 40)
runner = adaptive.Runner(learner, adaptive.runner.SequentialExecutor(),
                         goal=lambda l: l.loss() < 0.01)
```
originally posted by Joseph Weston (@jbweston) at 2018-02-19T17:53:51.621Z on GitLab
fixed by gitlab:!49
(original issue on GitLab)
opened by Pablo Piskunow (@pablopiskunow) at 2017-10-27T13:04:19.803Z
In this test we see how `adaptive` fails to discover all the peaks of an unknown distribution of gaussians. This failure can be overcome by setting a minimum and maximum size of the triangles.

## how it fails
Grab the notebook presenting `adaptive` and provide the following function: a set of randomly positioned gaussians of width `sigma`. To avoid more randomness coming from parallelization, let's fix the setup, and run.
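The original function snippet did not survive this export. A function of the kind described, a sum of randomly positioned gaussians of width `sigma`, might look like the following sketch; the name `peaks`, the seed, and all values are hypothetical:

```python
import numpy as np

# Hypothetical reconstruction of the test function: a sum of randomly
# positioned 2D gaussians of width sigma. All names and values are assumed.
sigma = 0.01
rng = np.random.RandomState(0)           # fixed seed to avoid extra randomness
centers = rng.uniform(-1, 1, size=(10, 2))

def peaks(xy):
    x, y = xy
    return sum(np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / sigma ** 2)
               for cx, cy in centers)

print(peaks(tuple(centers[0])))  # ~1.0 at a peak center
```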
We get the following result, which looks like adaptive works well! But it's a lie... there are more points yet to discover.
## how to solve it
The `max_area` parameter ensures that we do not lose small features (hidden inside a big triangle), and the `min_area` parameter ensures that we do not oversample very high peaks beyond a defined plot resolution. Since plot resolution is a more intuitive way to understand the plots (rather than scaled areas of triangles), let's assign `max_resolution = 1 / min_area` and `min_resolution = 1 / max_area`.
In the current status of `adaptive` there are no bounds on the areas, which is like having `max_resolution = np.inf` and `min_resolution = 1`, i.e. the whole area of the 2D plot. Let's put more reasonable parameters: say we wish to see all the features that can be discovered by a 10x10 plot, `min_resolution = 10**2`, with the resolution that could be achieved with a 50x50 plot, `max_resolution = 50**2`. We add this and modify the loss function accordingly (edit: swap min, max).
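The snippet that modified the loss function is not preserved in this export; the triangle-area bounds implied by those resolution numbers can be sketched as follows (all names are assumptions, following the relations `max_resolution = 1 / min_area` and `min_resolution = 1 / max_area` stated above):

```python
# Translating the proposed plot resolutions into triangle-area bounds.
# Names and the 10x10 / 50x50 values come from the discussion above.
min_resolution = 10**2
max_resolution = 50**2

max_area = 1 / min_resolution   # triangles larger than this are always split
min_area = 1 / max_resolution   # refinement stops below this area

print(max_area, min_area)  # 0.01 0.0004
```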
With these parameters, we have the expected results, which compare well with the homogeneous sampling.