cnuernber opened 6 years ago
So you want to implement the approach from this publication, or are you open to any ideas which might improve solver gains?
Any ideas. Especially research into why Adam doesn't perform as well as SGD on some (important) problems.
I believe the choice of optimizer depends on the class of problem; it's not an across-the-board "this one is best", so this is not at all surprising. I am assuming this is to be done at the Cortex layer and not in CUDA or TensorFlow? Downloaded the paper to read.
It surprised a lot of very experienced practitioners in machine learning at NIPS; for a long time we were all trying to get rid of hyperparameters, and there is a large class of problems where Adam and friends provably converge faster; just not overparameterized machine learning problems. Here is the paper I found quite interesting: https://arxiv.org/abs/1710.09278

Oh, and if you can figure out concretely why this is and fix it for hyperparameter-free optimizers, then you have your Ph.D., I think :-); so if I were you I wouldn't worry about Cortex vs. TF.
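For anyone following the thread, here is a minimal NumPy sketch contrasting the two update rules under discussion on a toy overparameterized least-squares problem. This is illustrative only, not Cortex code; all names, constants, and the problem setup are my own assumptions, and the Adam updates are the textbook ones from Kingma & Ba (2014).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))   # 20 samples, 50 parameters -> overparameterized
y = rng.normal(size=20)

def grad(w):
    # gradient of the loss 0.5 * ||X w - y||^2
    return X.T @ (X @ w - y)

# --- SGD: a single hyperparameter, the learning rate ---
w = np.zeros(50)
lr = 1e-3
for _ in range(1000):
    w -= lr * grad(w)
print("SGD loss:", 0.5 * np.sum((X @ w - y) ** 2))

# --- Adam: per-parameter adaptive step sizes (Kingma & Ba 2014) ---
w = np.zeros(50)
m = np.zeros(50)                # first-moment (mean) estimate
v = np.zeros(50)                # second-moment (uncentered variance) estimate
lr, b1, b2, eps = 1e-3, 0.9, 0.999, 1e-8
for t in range(1, 1001):
    g = grad(w)
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)   # bias correction for the zero initialization
    v_hat = v / (1 - b2 ** t)
    w -= lr * m_hat / (np.sqrt(v_hat) + eps)
print("Adam loss:", 0.5 * np.sum((X @ w - y) ** 2))
```

The contrast is the point of the thread: the SGD loop has one knob, while Adam maintains per-parameter moment estimates that rescale each coordinate's step, which is exactly the adaptivity the paper above examines on overparameterized problems.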