vinhkhuc / lbfgs4j

Java version of liblbfgs: http://www.chokkan.org/software/liblbfgs/
MIT License

Understanding how to use LBFGS #1

Closed: juicyslew closed this issue 8 years ago

juicyslew commented 8 years ago

Dear Mr. Khuc,

I am a high school senior transitioning to college (I graduated yesterday). I am trying to use LBFGS in an application I am building. I understand the basic math behind neural networks and how to perform backpropagation to get the error for all the weights. I am working with a friend who is handling the Java and organization side of the application, while I handle the neural network and math side. I have been reading through this code, but since I don't know Java very well yet and don't understand the actual LBFGS algorithm, I am at a loss as to how to feed my original data, cost function, weights, and the backpropagation errors for those weights into your function so that it minimizes the cost function and produces a good predictor for what I want to predict. If you could shed some light on how to do so, I would be very grateful.

TL;DR: How do I use my original data, cost function, neural network weights, and the errors of those weights to optimize the neural network with your LBFGS function? Where do I pass in these parameters, and in what format?

A student, William Derksen

vinhkhuc commented 8 years ago

Hi William, to use lbfgs4j to minimize a function, you need to provide the function's value and its corresponding gradient, as shown in the usage example: https://github.com/vinhkhuc/lbfgs4j#usage-example.

For neural networks, the function you want to minimize is the cost function, whose value is calculated during the forward-propagation stage. The cost function's gradient with respect to each weight is calculated during the backpropagation stage.
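A minimal sketch of that pattern (the Function methods getDimension, valueAt, and gradientAt follow the usage example; the import paths shown here are indicative, so check the example for the exact ones):

```java
import com.github.lbfgs4j.LbfgsMinimizer;
import com.github.lbfgs4j.liblbfgs.Function;

public class QuadraticExample {
    public static void main(String[] args) {
        // f(x) = (x0 - 1)^2 + (x1 - 2)^2, which has its minimum at (1, 2).
        Function f = new Function() {
            public int getDimension() {
                return 2; // number of variables being optimized
            }

            public double valueAt(double[] x) {
                // For a neural network, this would run forward propagation
                // and return the cost.
                return Math.pow(x[0] - 1, 2) + Math.pow(x[1] - 2, 2);
            }

            public double[] gradientAt(double[] x) {
                // For a neural network, this would run backpropagation
                // and return the gradient for every weight.
                return new double[] { 2 * (x[0] - 1), 2 * (x[1] - 2) };
            }
        };

        LbfgsMinimizer minimizer = new LbfgsMinimizer();
        double[] xmin = minimizer.minimize(f);
        System.out.println(xmin[0] + ", " + xmin[1]); // should be close to 1, 2
    }
}
```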

However, I'm afraid that the current interface of lbfgs4j is not flexible enough to train neural networks, since it assumes there is only a single weight vector x.

A workaround is to concatenate the neural network's weight vectors into one long weight vector x, together with a marker vector that keeps track of where each weight vector starts (see the sketch below).
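A rough sketch of that packing, assuming each layer's weights are already stored as a flat double[] (these helpers are illustrative only, not part of lbfgs4j):

```java
public class WeightPacker {

    // Concatenate per-layer weight vectors into one flat vector x,
    // filling offsets[i] with the index where layer i starts.
    public static double[] flatten(double[][] layerWeights, int[] offsets) {
        int total = 0;
        for (double[] w : layerWeights) total += w.length;
        double[] x = new double[total];
        int pos = 0;
        for (int i = 0; i < layerWeights.length; i++) {
            offsets[i] = pos; // marker: start of layer i inside x
            System.arraycopy(layerWeights[i], 0, x, pos, layerWeights[i].length);
            pos += layerWeights[i].length;
        }
        return x;
    }

    // Copy layer i's section of x back into that layer's weight vector.
    public static void unflatten(double[] x, int[] offsets, double[][] layerWeights, int i) {
        System.arraycopy(x, offsets[i], layerWeights[i], 0, layerWeights[i].length);
    }
}
```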

Another problem with using LBFGS to train neural networks is that it computes the cost and gradient over the whole training data set at every iteration. Minibatch LBFGS is not supported in lbfgs4j.

Therefore, I would recommend trying to implement neural networks using SGD (stochastic gradient descent) first.
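The core SGD update is small enough to sketch here (illustrative only; the learning rate and the backpropagation that produces the gradient are up to your code):

```java
// One SGD step: w <- w - learningRate * gradient, where the gradient
// comes from backpropagation on a single example or a small minibatch
// rather than the whole training set.
static void sgdStep(double[] w, double[] gradient, double learningRate) {
    for (int i = 0; i < w.length; i++) {
        w[i] -= learningRate * gradient[i];
    }
}
```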

juicyslew commented 8 years ago

Oh lol! I skimmed over the usage example and went straight to trying to read the code directly. Thank you very much! That was a very helpful answer. Luckily, my use case won't require minibatch LBFGS, and I already have it working with plain gradient descent. Thank you so much!!!

vinhkhuc commented 8 years ago

You're welcome, William.

juicyslew commented 8 years ago

One more question: I'm currently using BlueJ to test my code, and it is working fine. Now I am trying to add this library. I know how to add a library once it is in .jar format, but I cannot figure out how to import the library when it is not a .jar. Do you have a version of this project as a .jar file?

Thank you very much!

William Derksen.

vinhkhuc commented 8 years ago

William, if you use Maven, you can add this to your pom.xml: https://github.com/vinhkhuc/lbfgs4j#maven-dependency.

If not, you can get the jar file from http://search.maven.org/#search%7Cga%7C1%7Clbfgs4j

juicyslew commented 8 years ago

Thank you very much! This will be a great help to my partner and me :)

juicyslew commented 8 years ago

One, hopefully last, question:

How do we change the parameters of LBFGS? Specifically, how can I change the max iterations?

vinhkhuc commented 8 years ago

Sorry for the late response; I've been traveling for the last couple of days. To change the max iterations, you can do the following:

```java
LBFGS_Param param = Lbfgs.defaultParams();
param.max_iterations = 100; // set max iterations
LbfgsMinimizer minimizer = new LbfgsMinimizer(param);
```
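The resulting minimizer is then used as before, e.g. calling minimizer.minimize(f) with a Function f defined as in the usage example.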