fccoelho / bayesian-inference

Previously hosted on code.google.com/p/bayesian-inference

Latin Hypercube Sampling doesn't work when there are more parameters than samples #1

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Call lhs with params > size

What is the expected output? What do you see instead?

Expected: The output matrix of samples
See:...

File "/usr/local/lib/python2.6/dist-packages/BIP/Bayes/lhs.py", line 81, in lhs
    indices=rank_restr(nvars=len(dists), smp=siz, noCorrRestr=noCorrRestr, Corrmat=corrmat)
  File "/usr/local/lib/python2.6/dist-packages/BIP/Bayes/lhs.py", line 136, in rank_restr
    Q=cholesky(numpy.corrcoef(S))
  File "/usr/lib/python2.6/dist-packages/numpy/linalg/linalg.py", line 423, in cholesky
    Cholesky decomposition cannot be computed'
numpy.linalg.linalg.LinAlgError: Matrix is not positive definite -         
Cholesky decomposition cannot be computed

What version of the product are you using? On what operating system?
Python 2.6.5
NumPy 1.3
SciPy 0.7.0-2
Ubuntu 10.04
BIP 0.5.0

Please provide any additional information below.
I'm trying to create a sample of size 40, where each sample has 256 parameters. 
Should this be possible? I would imagine so. I am running a large-scale 
optimization problem.
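
A minimal NumPy-only illustration of the failure (the shapes mirror the report; the data is random and purely illustrative):

import numpy as np

nvars, nsamp = 256, 40
S = np.random.rand(nvars, nsamp)

# corrcoef(S) is 256x256 but has rank at most 40, so it is only positive
# semi-definite and the Cholesky factorization fails.
C = np.corrcoef(S)
np.linalg.cholesky(C)   # raises LinAlgError: Matrix is not positive definite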

Original issue reported on code.google.com by nick.man...@gmail.com on 25 Jul 2010 at 8:03

GoogleCodeExporter commented 9 years ago
Could you post the exact code you are trying to run? This way I can check and 
see what is going on.

thanks,

Flávio

Original comment by fccoelho on 27 Jul 2010 at 1:55

GoogleCodeExporter commented 9 years ago
import numpy as np
from scipy.stats import uniform
from BIP.Bayes.lhs import lhs

dim = len(func.maxs)
params = [(func.mins[i], func.maxs[i] - func.mins[i]) for i in xrange(dim)]
seeds = np.array(lhs([uniform]*dim, params, size, False, np.identity(dim))).T

dim is the number of parameters; params is a list of length dim of (min, max) tuples.

Original comment by nick.man...@gmail.com on 27 Jul 2010 at 2:52

GoogleCodeExporter commented 9 years ago
I am looking into this. On the bright side, it's not a bug in my code but a 
limitation of the method: you can't do a Cholesky decomposition on a matrix 
which is not positive definite.

However, that does not solve the problem that certain combinations of number of 
parameters and sample size seem to yield these kinds of matrices (not only 
when params > size).

Anyway, until I can figure out how to avoid this, assuming it is avoidable, I 
suggest trying some workarounds (a sketch of the first follows below):
1 - Generate larger samples, then throw away the samples you don't need.
2 - If uncorrelated variables are not strictly necessary, set noCorrRestr to 
True. 
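
A rough sketch of workaround 1, reusing the call from the snippet above (the 
exact amount of oversampling needed may vary):

import numpy as np
from scipy.stats import uniform
from BIP.Bayes.lhs import lhs

# Draw more samples than parameters so that corrcoef(S) can be full
# rank, then keep only the rows that are actually needed.
dim, want = 256, 40
params = [(0.0, 1.0)] * dim          # (min, range) pairs
oversize = dim + 1                   # at least dim + 1; more may be needed
big = np.array(lhs([uniform] * dim, params, oversize, False, np.identity(dim))).T
seeds = big[:want]                   # discard the extra samples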

By the way, for uniform distributions (scipy.stats.uniform) the parameters are 
(min, range), not (min, max). For example:
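
from scipy.stats import uniform

# scipy.stats.uniform(loc, scale) is supported on [loc, loc + scale]:
u = uniform(2.0, 3.0)   # min = 2, range = 3  ->  values in [2.0, 5.0]
assert u.ppf(0.0) == 2.0 and u.ppf(1.0) == 5.0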

If you have any suggestions on how to fix this, please let me know.

Original comment by fccoelho on 28 Jul 2010 at 3:30

GoogleCodeExporter commented 9 years ago
Thanks, I'll set noCorrRestr to True for now. Also, I realized that was a typo: 
my params list is (min, range); if you notice in my last post, the second 
element of each tuple is max - min.

Thanks again.

Original comment by nick.man...@gmail.com on 28 Jul 2010 at 4:17

GoogleCodeExporter commented 9 years ago
Can't find a solution to this. If someone has a solution and wants to 
contribute, please reopen this issue.

Original comment by fccoelho on 26 Aug 2010 at 5:40