vivekaxl / MOLearner

Multi-objective learning for configurations

How about higher dimensional datasets? #13

Open vivekaxl opened 7 years ago

vivekaxl commented 7 years ago

The algorithms do not scale well to higher-dimensional datasets. All of them are running as we speak, currently on 12 (additional) datasets. ePAL is excruciatingly slow and I am not sure how many days it will take to finish. I need to find ways to parallelize it; one option is sketched below.
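Since the repeats over datasets and epsilon values are independent, one low-effort way to parallelize is to farm each (dataset, epsilon) job out to worker processes. A minimal sketch, assuming a hypothetical `run_epal(dataset, epsilon)` entry point (not the actual driver function in this repo) and illustrative dataset names:

```python
# Sketch: run independent ePAL jobs in parallel on a single node.
from multiprocessing import Pool
from itertools import product

def run_epal(dataset, epsilon):
    # Placeholder: load the dataset, run one ePAL trial, return its result.
    return {"dataset": dataset, "epsilon": epsilon}

def run_job(args):
    dataset, epsilon = args
    return run_epal(*args)

if __name__ == "__main__":
    datasets = ["SS-A", "SS-B"]      # hypothetical dataset names
    epsilons = [0.01, 0.1, 0.3]
    jobs = list(product(datasets, epsilons))
    with Pool(processes=4) as pool:  # one node, 4 worker processes
        results = pool.map(run_job, jobs)
    print(results)
```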

Comments:

With higher-dimensional datasets, ePAL is painfully slow: a single run takes hours. For example, one run (for a single epsilon) has been going since 3:29 am and is still running (current time: 10:39 am). Which goes to show that your idea of trying higher-dimensional datasets was super (aka I bow before you). Meanwhile, the ALX method has already finished 20 runs on 4 of the 12 datasets. The slowest parts of ALX are the non-dominated sort and the pairwise comparison for cdom scores; AL3-inspired solutions could speed this up (see the sketch below).
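For reference, the pairwise cdom scoring is quadratic in the frontier size, which is why it dominates runtime as the data grows. Below is a minimal sketch of a Zitzler-style continuous-domination predicate and the naive all-pairs scoring loop; this is an illustration under the assumption of minimization with normalized objectives, not the exact code in this repository:

```python
import math

def cdom(x, y):
    """Continuous domination (Zitzler-style): True if x 'loses less' than y.
    x, y are lists of objective values to be minimized, ideally normalized to [0,1]."""
    def loss(a, b):
        n = len(a)
        return sum(-math.exp((ai - bi) / n) for ai, bi in zip(a, b)) / n
    return loss(x, y) < loss(y, x)

def cdom_scores(points):
    """Naive O(n^2) scoring: count how many other points each point dominates."""
    scores = [0] * len(points)
    for i, p in enumerate(points):
        for j, q in enumerate(points):
            if i != j and cdom(p, q):
                scores[i] += 1
    return scores

# Example: two objectives, both minimized.
pts = [[0.1, 0.2], [0.4, 0.4], [0.2, 0.1]]
print(cdom_scores(pts))
```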

vivekaxl commented 7 years ago

Bayesian Optimisation (BO) is a technique used in optimising a D-dimensional function which is typically expensive to evaluate. While there have been many successes for BO in low dimensions, scaling it to high dimensions has been notoriously difficult. ref

vivekaxl commented 7 years ago

To ensure that a global optimum is found, we require good coverage of X, but as the dimensionality increases, the number of evaluations needed to cover X increases exponentially. ref
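As a concrete illustration of that exponential blow-up: covering each dimension with k grid points requires k^D evaluations in total, which quickly becomes infeasible. A tiny sketch:

```python
# Evaluations needed for a grid with k points per dimension.
k = 10
for d in (2, 5, 10, 20):
    print(f"D={d:2d}: {k**d:.3e} evaluations")
```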

timm commented 7 years ago

Are you running on frank's GPU environment?

ePAL being slow is good news for us.

vivekaxl commented 7 years ago

No. I am not running the program on a GPU. I am running it on VCL (one node).