Dear all,

I want to use Flux.jl to build a simple Multi-Layer Perceptron (MLP), as I did in Keras. The input data is a matrix of nGene (number of genes) by nInd (number of individuals), and the output is a vector of length nInd representing a trait (e.g., height). The network has two hidden layers with 64 and 32 neurons, respectively.
In summary, the number of neurons changes as: nGene --> 64 --> 32 --> 1

In Keras, the MLP is:
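Roughly, it is the following (a minimal sketch rather than my exact script; the epoch count and batch size are placeholders):

```python
# Sketch of the Keras MLP: nGene -> 64 -> 32 -> 1, Adam + MSE.
# Epoch count and batch size below are placeholders, not the real settings.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(nGene,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# In Keras each row is a sample, so X_train is nInd x nGene here.
model.fit(X_train, y_train, epochs=100, batch_size=32)
```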
With this model, the loss (MSE) in each epoch is less than one, and the prediction accuracy on the testing data is about 0.6, which is good.

In Flux.jl, I built the same MLP by:
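A sketch of the Flux side (again not my exact script; it assumes Flux >= 0.14, Float32 data, and a placeholder epoch count):

```julia
# Sketch of the Flux MLP: nGene -> 64 -> 32 -> 1, Adam + MSE.
# Assumes Flux >= 0.14; the epoch count is a placeholder.
using Flux

model = Chain(
    Dense(nGene => 64, relu),
    Dense(64 => 32, relu),
    Dense(32 => 1),
)

# In Flux each column is a sample, so X_train_t is nGene x nInd,
# and Y_train is transposed to a 1 x nInd row to match the model output.
opt_state = Flux.setup(Adam(), model)
for epoch in 1:100
    grads = Flux.gradient(m -> Flux.mse(m(X_train_t), Y_train'), model)
    Flux.update!(opt_state, model, grads[1])
end
```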
Here X_train_t is an nGene by nInd matrix and Y_train is a vector of length nInd. The loss is very high, and the prediction accuracy on the testing data is almost zero.
BTW, in Flux.jl, if I change the optimizer to plain gradient descent, it doesn't even converge.
Some extra things, which were not helpful for my issue:

- The default step size and other parameters of the Adam optimizer in Flux are the same as in Keras.
- Even if the mean squared error were calculated in a different way, I don't think it would cause such bad prediction accuracy in Flux.jl.
- In Flux.jl, the input data is a matrix of #genes by #samples. I followed the MNIST tutorial, where the input data is a matrix of #pixels by #samples. If I transpose the data the other way, the Flux code doesn't even run (see the shape sketch after this list).
- The elements of the input matrix are either 0 or 1, so I didn't normalize them in Flux, and I didn't find that Keras does any normalization automatically.
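On the orientation point, a quick shape check (with made-up sizes) shows why the transposed layout fails:

```julia
# Dense layers in Flux act on columns, so each individual must be a column:
# the input has to be #genes x #samples.
using Flux
nGene, nInd = 100, 8                  # made-up toy sizes
X = Float32.(rand(Bool, nGene, nInd)) # 0/1 genotypes, one individual per column
layer = Dense(nGene => 64, relu)
size(layer(X))                        # (64, nInd) -- works
# layer(X')                           # DimensionMismatch: samples as rows fail
```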
I really don't know why the training process in Flux.jl goes wrong. Could you please tell me what's wrong with my Flux code?

Thank you very much,
Tianjing