basis - Githubissues

marcrocasalonso commented 6 years ago

Dear Jonathan,

Suppose that I have a particular case with three random inputs with Gaussian behavior. Then, the appropiated orthogonal basis selected to these inputs are Hermite Polynomials. If I want to probe another basis as for example Legenedre orthogonal polynomial basis for the same example. Which function from Chaospy I have to implement? Because I have not found any function to generate Legenedre Polynomials to test it. I want to implement another basis because my QoI output is a bimodal PDF. When I implement regression Polynomial Chaos with Hermite basis can not capture this bimodality behavior even if I use high order in polynomials. My idea is test the same example with a Legendre Basis because the weigth function in this case is one, and maybe it could describe better this bimodality.

What do you think?

Thank you very much! :)

Marc

jonathf commented 6 years ago

(Probability) Hermite Polynomials is generated from three terms recursion of the standard Gaussian distribution:

polynomials = cp.orth_ttr(order, cp.Normal(0, 1))

Legendre polynomials is generated the same way, but for the uniform distribution on the (-1, 1) interval:

polynomials = cp.orth_ttr(order, cp.Uniform(-1, 1))

If you want to create a orthogonal expansion for a new custom distribution, you will first have to define a new distribution. This can be done as follows:

class BiModel(cp.Dist):

  def _cdf(x, **prm):
    # return cumulative distribution of values x

  def _bnd(**prm):
    # return feasable range of distribution (I use (-7.5, 7.5) for standard Gaussian)

  def _pdf(x, **prm):
    # return probability density of values x (optional, but helps a lot on accuracy)

distribution = BiModal(**prm)
polynomials = cp.orth_ttr(order, distribution)

jonathf commented 6 years ago

Oh, right. If you need multivariate distribution with tree variables, just do:

distribution = cp.Iid(BiModel(**prm), 3)

Same for Hermite and Legendre.

marcrocasalonso commented 6 years ago

Imagine that you have the same problem as above. Three Gaussian random inputs, with Hermite basis that give you a bimodal Output response:

Stochastic dimension=3 Behavior of the inputs=Gaussian Polynomial basis=Hermite Reponse=bimodal behavior

Now, I want to implement the same but with Legendre basis because the bimodality in the above case it does not capture well:

Stochastic dimension=3 Behavior of the inputs=Gaussian Polynomial basis=Legendre Reponse=? I hope to capture better the bimodality with legendre

jonathf commented 6 years ago

If you want to described Gaussian random input using Legedre polynomials (which may or may not help), you are getting into generalized polynomial chaos territory.

Perhaps something like this:

dist1 = cp.Iid(cp.Normal(0,1), 3)
dist2 = cp.Iid(cp.Uniform(-1, 1), 3)

samples1 = dist1.samples(size)
samples2 = dist2.inv(dist1.fwd(samples1))

polynomials = cp.orth_ttr(order, dist2)
evals = [model_evaluator(*sample) for sample in samples1.T]

approx_model = cp.fit_regression(polynomials, samples2, evals)

Advantage here is that you are trying to model underlying dependency structure using Legendre variables. If that assumption holds, it will increase performance. There are no guaranties though. It might even get worse as you are adding a level of complexity here.

marcrocasalonso commented 6 years ago

Thank you Jonhatan! I will try it and test if it get better or worse!:)

jonathf commented 6 years ago

The function "approx_model" is a fancy random variable, that is 100 % described as a function of dist1. The input varies on (-inf,inf), because Gaussian. If you change dist1, then you also change the behavior of approx_model.

But what you can do is to extend the dependency one step further. Instead of a approx_model(dist1), you can instead parameterize it as approx_model(dist1(dist2)). In other words, you do the modelling twice:

dist1 is modelled as a function of dist2: dist1(dist2)
approx_model is modelled as a function of dist2: approx_model(dist1) (as before).

The dist2.inv(dist1.fwd(samples)) maps samples in the space defined by dist1 and onto the space defined by dist2.

Why do we need to do this? Well, consider if dist2 was cp.Uniform(100000, 100001). It is well defined, but your samples generated from dist1 is not really relevant for the resulting orthogonal polynomial which is optimized for a completely different interval. In fact, even if the intervals are similar, as is in your case, distributing the samples correctly in the space, ensures the probability space is covered proportionally to the likelihood. E.g. Gaussian samples have more samples close to 0, where the probability is also higher.

That being said, Uniform(-1, 1) and Normal(0, 1) have their main weight around the same interval. Your code likely works just fine for your example. I always include the mapping perhaps as a force of habit that it is the correct way of generalizing distribution mapping.

marcrocasalonso commented 6 years ago

Perfect, much thanks! So clear your explanations. I have tested and it works worse when a Legendre basis are used for three gaussian inputs and a bimodal Output.

jonathf / chaospy

basis #114