python-adaptive / adaptive

:chart_with_upwards_trend: Adaptive: parallel active learning of mathematical functions
http://adaptive.readthedocs.io/

Normalize variables #451

Open · bonh opened 2 months ago

bonh commented 2 months ago

I get the feeling that normalizing the variables to be of order 1 greatly improves the sampling. Specifically, it prevents the "point is inside the hull" error. Might be worth adding to the tutorial somewhere?
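
For illustration, a minimal sketch of the kind of rescaling meant here: the learner only ever sees coordinates of order 1, and a wrapper maps them back to physical units before evaluating the function. The function, bounds, and helper names are hypothetical placeholders, not code from this issue:

```python
import adaptive

# Hypothetical physical parameter ranges spanning very different magnitudes.
bounds_physical = [(0.0, 2.0), (0.0, 1e-6)]

def to_physical(u):
    """Map unit-interval coordinates back to the physical ranges."""
    return [lo + ui * (hi - lo) for ui, (lo, hi) in zip(u, bounds_physical)]

def f_physical(x):
    # Placeholder for the real (expensive) function.
    a, b = x
    return a * b * 1e6

def f_normalized(u):
    # The learner samples coordinates of order 1; the real function
    # still receives physical units.
    return f_physical(to_physical(u))

# Unit bounds, so every input the learner works with is around 1.
learner = adaptive.LearnerND(f_normalized, bounds=[(0.0, 1.0)] * len(bounds_physical))
```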

basnijholt commented 2 months ago

@bonh thanks for your post. Could you perhaps share some more details? Are you talking about the Learner2D? If so, the values should already be rescaled automatically.

It would be great if you could share some code!

bonh commented 2 months ago

I use LearnerND with 5 inputs ranging from $\mathcal{O}(1)$ to $\mathcal{O}(10^{-6})$, mapping to one output in the range of $\mathcal{O}(10^{-3})$.

The function is quite complex and I'm not able to share it (yet). I'll try to put together an MWE.

What I noticed was that in the unscaled problem, the values chosen by Adaptive only varied in the third or fourth digit after the decimal point from one iteration to the next (until it failed).
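
For context, a stand-in for this kind of setup could look like the following; the function itself is made up (the real one is not shared), and only the orders of magnitude of the bounds are meant to match the description above:

```python
import adaptive

# Stand-in for the 5-input function described above; only the orders of
# magnitude of inputs and output are meant to be representative.
def g(x):
    a, b, c, d, e = x
    return 1e-3 * a * (1.0 + 1e6 * e) / (1.0 + b + c + d)

# Bounds ranging from O(1) down to O(1e-6), without any rescaling.
bounds = [(0.0, 1.0), (0.0, 1.0), (0.0, 0.1), (0.0, 1e-3), (0.0, 1e-6)]

learner = adaptive.LearnerND(g, bounds=bounds)
adaptive.runner.simple(learner, goal=lambda l: l.npoints >= 200)
```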

basnijholt commented 2 months ago

At this exact moment I have no time to take a detailed look.

However, from a quick look I am led to believe that the problem is that we're not using the value_scale parameter in the loss functions: https://github.com/python-adaptive/adaptive/blob/d2c80418b5e6055b524f5fda1b36d0f8cde40868/adaptive/learner/learnerND.py#L110-L169

This should be a relatively easy fix.
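
Until that is fixed, a possible user-side workaround could be a custom loss_per_simplex that applies value_scale itself. This is only a sketch: it assumes the (simplex, values, value_scale) signature visible in the linked code, and whether multiplying or dividing by value_scale is correct should be checked against learnerND.py:

```python
import adaptive
from adaptive.learner.learnerND import default_loss

def scaled_loss(simplex, values, value_scale):
    # Hypothetical workaround: apply value_scale to the values ourselves
    # before calling the stock loss, in case the learner does not do it.
    # Whether to multiply or divide depends on how value_scale is defined
    # in the linked source; double-check against learnerND.py.
    scaled_values = [v * value_scale for v in values]
    return default_loss(simplex, scaled_values, value_scale)

def f(x):
    # Placeholder for the real function.
    a, b = x
    return 1e-3 * a * b

learner = adaptive.LearnerND(
    f, bounds=[(0.0, 1.0), (0.0, 1e-6)], loss_per_simplex=scaled_loss
)
```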

@bonh, unrelated to this issue, how is your experience with sampling a 5D space? Does Adaptive produce better results than random or uniform sampling? Personally, I have not even tried running real simulations beyond 3D, always thinking that "the curse of dimensionality" would bite me.

bonh commented 2 months ago

> we're not using the value_scale parameter in the loss functions

That'd explain my observations, thanks!

I just started sampling a 5D space; before that it was 3D. My goal is to train a surrogate approximating my complex, costly function. However, the function is not so costly that I cannot sample 4000 points in a reasonable time. My guess is that I would get similar results with different sampling procedures because I'm filling the parameter space very well. I didn't do a detailed analysis, but I think I need about 30% fewer samples with Adaptive compared to uniform sampling (this is for 3D) to get comparable predictive accuracy with the trained surrogates.