RubixML / ML

A high-level machine learning and deep learning library for the PHP language.
https://rubixml.com

Is it possible to generate a deterministic model from MLP Regressor? #54

Closed: xujian8313 closed this issue 4 years ago

xujian8313 commented 4 years ago

Hi Andrew,

Building a machine learning library for PHP is really excellent work. A big thank you for that.

Recently I have been trying to use MLP Regressor in Rubix ML on a specific dataset. However, I get a different trained model every time I run it, even though I have fixed the weight initializer and the bias initializer.

So my two questions:

  1. I wonder whether it is the holdout ratio (0.01 in the code below) that causes the model to train and validate on a different portion of the data each run. Is there a way to disable the holdout and use 100% of the data for training?

  2. I understand that constant initialization is not good practice, but what I was trying to achieve is something similar to Python's random.seed() function, so that we can generate deterministic random numbers. Would it be possible to add this feature?

$estimator = new MLPRegressor([
    new Dense(50, new Constant(1.0), new Constant(0.0)),
    new Activation(new ReLU()),
], 256, new Adam(0.001), 1e-4, 500, 1e-4, 10, 0.01, new LeastSquares(), new RMSE());
// (batch size 256, Adam optimizer, max 500 epochs, 0.01 holdout ratio)

Thanks. J.

andrewdalpino commented 4 years ago

Hi Jian @xujian8313,

Thanks again for your great question!

Since Rubix ML uses the PHP pseudo-random number generator under the hood, all you need to do to make any stochastic (random) algorithm deterministic is seed the random number generator with a known constant (such as 0) before training.

srand(0); // seed PHP's PRNG with a known constant before training

$estimator->train($dataset);

We use this same technique in our unit tests - see this line.
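As a quick sanity check, you could train two identically seeded copies and compare their predictions. Here is a minimal sketch, assuming $training and $testing are prepared Labeled datasets (the factory closure and variable names are illustrative, not code from this thread; double-check constructor defaults against your version of the library):

use Rubix\ML\Regressors\MLPRegressor;
use Rubix\ML\NeuralNet\Layers\Dense;
use Rubix\ML\NeuralNet\Layers\Activation;
use Rubix\ML\NeuralNet\ActivationFunctions\ReLU;
use Rubix\ML\NeuralNet\Optimizers\Adam;

// Build a fresh estimator for each run so no state is shared between them.
$make = function () {
    return new MLPRegressor([
        new Dense(50),
        new Activation(new ReLU()),
    ], 256, new Adam(0.001));
};

srand(0);
$a = $make();
$a->train($training);

srand(0);
$b = $make();
$b->train($training);

var_dump($a->predict($testing) == $b->predict($testing)); // bool(true)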

With that said, initializing the weights of the neural network to a constant value will not result in a deterministic model, since the underlying algorithm (Mini-batch Stochastic Gradient Descent with Backpropagation) is still random. In fact, although we make it possible in the library, initializing the weights to a constant will result in a network that is unable to learn - units that start with identical weights compute identical outputs and receive identical gradients, so they can never differentiate from one another. For an in-depth explanation as to why that is, I found this article to be helpful.
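To see the symmetry problem concretely, here is a toy illustration in plain PHP (not Rubix ML code): two ReLU units that start with the same constant weights compute the same output for any input, so backpropagation sends them the same gradient and they stay identical forever.

$x = [0.5, -1.2, 2.0]; // one input sample

// A single ReLU unit: weighted sum followed by max(0, z).
$unit = function (array $w) use ($x): float {
    $z = 0.0;
    foreach ($x as $i => $xi) {
        $z += $w[$i] * $xi;
    }
    return max(0.0, $z);
};

$w1 = array_fill(0, 3, 1.0); // unit 1: constant weights
$w2 = array_fill(0, 3, 1.0); // unit 2: the same constant weights

echo $unit($w1), ' ', $unit($w2), PHP_EOL; // 1.3 1.3 (indistinguishable units)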

The default weight initializer for Dense layers is the He initializer, which draws values from a uniform distribution scaled to the number of input connections (fan in) of the layer. Indeed, the best distribution depends on the network architecture (mainly the choice of activation function), which is why we provide a number of initialization strategies:

Activation Function                    | Recommended Initializer
ReLU, ELU, SELU, and other rectifiers  | He
Hyperbolic Tangent and Softsign        | Xavier 2
Sigmoid and Softmax                    | Xavier 1
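For example, pairing each Dense layer's weight initializer with the activation that follows it might look like the sketch below (it mirrors the Dense signature used earlier in this thread; verify class names and signatures against your version of the library):

use Rubix\ML\NeuralNet\Layers\Dense;
use Rubix\ML\NeuralNet\Layers\Activation;
use Rubix\ML\NeuralNet\ActivationFunctions\ReLU;
use Rubix\ML\NeuralNet\ActivationFunctions\Sigmoid;
use Rubix\ML\NeuralNet\Initializers\He;
use Rubix\ML\NeuralNet\Initializers\Xavier1;
use Rubix\ML\NeuralNet\Initializers\Constant;

$hidden = [
    new Dense(50, new He(), new Constant(0.0)),      // He for the ReLU layer
    new Activation(new ReLU()),
    new Dense(50, new Xavier1(), new Constant(0.0)), // Xavier 1 for the Sigmoid layer
    new Activation(new Sigmoid()),
];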

Lastly, no, there is currently not a way to disable progress monitoring using the holdout (validation) set in MLP Regressor, but you should not need to do that to obtain a deterministic model (see above).

Thanks again, Jian. I'll be happy to answer any follow-up questions!