
paulknysh / blackbox

A Python module for parallel optimization of expensive black-box functions

MIT License


New latin(n,d) #12

Closed lightash closed 5 years ago

lightash commented 5 years ago

Hello! I thought you might be interested in a direct way of generating the points. It uses a low-discrepancy quasirandom sequence (with a seed) to fill the hypercube. This is faster than minimizing the spread function and gives a better spread (24k vs 27k with n=300, d=22). From http://extremelearning.com.au/unreasonable-effectiveness-of-quasirandom-sequences/ Regards, Andrii

paulknysh commented 5 years ago

Hi,

This is awesome! I actually saw that article last week, but didn't think it could be related to this problem. I'm busy with a few things right now, but I'm going to review your commit asap.

Have you tested whether the coverage remains uniform for a low number of points (say, fewer than 10)?

Also, does the notation in the code (phi, alpha etc.) follow that article?

Thanks! Paul

lightash commented 5 years ago

No, I didn't test its coverage. You've made a very good point - I think it won't be uniform for a low number of points (and especially in high dimensions).

Yes, the notation is the same - I basically copy-pasted the code :) Except for the output: I changed the original 'z' to your 'lh'.

You're welcome, Andrii
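For readers following along, here is a minimal sketch of the R-sequence construction as described in the linked article, using the names mentioned in this thread (phi, alpha, lh). It is written in plain Python and is not the commit under review; the function name `rseq`, the default `seed=0.5`, and the iteration count are assumptions made for illustration.

```python
def phi(d, iterations=30):
    """Approximate the positive root of x**(d+1) = x + 1 by fixed-point iteration."""
    x = 2.0
    for _ in range(iterations):
        x = (1.0 + x) ** (1.0 / (d + 1))
    return x


def rseq(n, d, seed=0.5):
    """Return n points in the unit hypercube [0, 1)^d (hypothetical helper name)."""
    g = phi(d)
    # One increment per dimension: alpha_j = (1/g)^(j+1) mod 1.
    alpha = [(1.0 / g) ** (j + 1) % 1.0 for j in range(d)]
    # Point i is the seed shifted by (i+1) steps of alpha, wrapped into [0, 1).
    lh = [[(seed + a * (i + 1)) % 1.0 for a in alpha] for i in range(n)]
    return lh


if __name__ == "__main__":
    points = rseq(300, 22)
    print(len(points), len(points[0]))
```

For d = 1, `phi` converges to the golden ratio, so the sequence reduces to the familiar golden-ratio additive recurrence. No search or shuffling is involved, which is where the speed advantage comes from.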

paulknysh commented 5 years ago

So, I just tested the Latin hypercube (LH) against the R-sequence (in terms of spread value). The R-sequence does have issues when the number of points is low (fewer than 10) - uniformity is worse (sometimes significantly).

However, it performs the same or better as the number of points (and also the number of dimensions) grows. Also, the advantage of R is that it is much simpler and faster and doesn't require any parameters.

I'll need to think about it. I kind of want to replace LH with R now (and tell users that the number of initial samples should be at least 10 or so, which is reasonable). Another option is to add branching - use LH for a few samples and R for many - but I'm not a big fan of that.
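A comparison along these lines could be sketched as follows. The thread does not define the spread value, so this assumes it is the sum of inverse pairwise Euclidean distances (lower is better, consistent with "minimizing spread function" above). The Latin hypercube here is a plain random one without any spread minimization, so it is only a rough stand-in for the module's LH. All function names are hypothetical, and `math.dist` needs Python 3.8 or later.

```python
import math
import random


def spread(points):
    """Assumed metric: sum of 1/distance over all pairs of points."""
    total = 0.0
    for i in range(len(points)):
        for j in range(i):
            total += 1.0 / math.dist(points[i], points[j])
    return total


def random_latin(n, d, rng):
    """Unoptimized Latin hypercube (n >= 2): each column is a shuffled grid of n levels."""
    columns = []
    for _ in range(d):
        levels = [i / (n - 1.0) for i in range(n)]
        rng.shuffle(levels)
        columns.append(levels)
    return [[columns[k][i] for k in range(d)] for i in range(n)]


def rseq(n, d, seed=0.5):
    """R-sequence as in the linked article; seed=0.5 is an assumed default."""
    x = 2.0
    for _ in range(30):
        x = (1.0 + x) ** (1.0 / (d + 1))
    alpha = [(1.0 / x) ** (k + 1) % 1.0 for k in range(d)]
    return [[(seed + a * (i + 1)) % 1.0 for a in alpha] for i in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    d = 22
    for n in (5, 10, 50, 300):
        print(n, spread(random_latin(n, d, rng)), spread(rseq(n, d)))
```

Running it for a few values of n and d gives a quick feel for where each method does better. The numbers will not match the ones quoted in this thread, since the LH here is not optimized.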

paulknysh commented 5 years ago

I decided to go with the R-sequence. I simplified your code a bit. Thanks again for bringing this up.