DeepMLNet / DeepNet

Deep.Net machine learning framework for F#

Cannot apply element-wise operator Add #39

Open kuroyakov opened 7 years ago

kuroyakov commented 7 years ago

I followed the Deep.Net sample code referenced at http://www.deepml.net/model.html.

However, when I executed my code I got a System.Exception with the following message:

cannot apply element-wise operation Add to unequal shapes ["nHidden"; "nBatch"] and ["nHidden"; "nHidden"]

The exception occurred at let hiddenAct = hiddenWeights .* input.T + hiddenBias. I expected hiddenBias to be broadcast to shape [nHidden; nBatch], but it became [nHidden; nHidden].

My complete code is as follows:

open Tensor
open Datasets
open SymTensor

[<EntryPoint>]
let main argv = 
    printfn "%A" argv

    let a = HostTensor.init [7L; 5L] (fun [|i; j|] -> 5.0 * float i + float j)  // sample tensor from the tutorial (not used below)

    /// MNIST dataset
    let mnist = Mnist.load(__SOURCE_DIRECTORY__ + "../../MNIST") 0.0 |> TrnValTst.toHost

    printfn "MNIST training set: images have shape %A and labels have shape %A" mnist.Trn.All.Input.Shape mnist.Trn.All.Target.Shape   
    printfn "MNIST test set    : images have shape %A and labels have shape %A" mnist.Tst.All.Input.Shape mnist.Tst.All.Target.Shape

    /// Define the neural network model
    let mb = ModelBuilder<single> "NeuralNetModel"

    // Define symbolic sizes
    let nBatch  = mb.Size "nBatch"
    let nInput  = mb.Size "nInput"
    let nClass  = mb.Size "nClass"
    let nHidden = mb.Size "nHidden"

    // Model parameters
    let hiddenWeights = mb.Param ("hiddenWeights", [nHidden; nInput])
    let hiddenBias    = mb.Param ("hiddenBias"   , [nHidden])
    let outputWeights = mb.Param ("outputWeights", [nClass; nHidden])

    // Model variables
    let input  = mb.Var<single> "Input"  [nBatch; nInput]
    let target = mb.Var<single> "Target" [nBatch; nClass]

    // Set symbolic sizes and instantiate the model
    mb.SetSize nInput mnist.Trn.All.Input.Shape.[1]
    mb.SetSize nClass mnist.Trn.All.Target.Shape.[1]
    mb.SetSize nHidden 100L

    let mi = mb.Instantiate DevHost

    // Model: input -> hidden layer
    let hiddenAct = hiddenWeights .* input.T + hiddenBias // <--------- Exception occurs here!!!
    let hiddenVal = tanh hiddenAct

    // Model: hidden -> output layer
    let outputAct = outputWeights .* hiddenVal
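    // Softmax over classes: exponentiate the output activations and normalize along axis 0 (the class axis)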
    let classProb = exp outputAct / Expr.sumKeepingAxis 0 (exp outputAct)

    // Loss function: per-sample cross-entropy, then mean over the batch
    let smplLoss = - Expr.sumAxis 0 (target.T * log classProb)
    let loss     = Expr.mean smplLoss

    // Compile
    let lossFn   = mi.Func loss |> arg2 input target

    // Initialize parameters with a fixed seed
    mi.InitPars 123

    // Test loss before training
    let tstLossUntrained = lossFn mnist.Tst.All.Input mnist.Tst.All.Target |> Tensor.value

    printfn "Test loss (untrained): %.4f" tstLossUntrained

    System.Console.ReadKey() |> ignore
    0 // exit code

My environment is as follows:

I'm sorry if I'm misunderstanding your sophisticated library. Could you please let me know how to fix this problem?

MBasalla commented 7 years ago

Your problem seems to be caused by the automatic broadcasting performed by the addition. The vector of shape [nHidden] is automatically broadcast along the first dimension of the [nHidden; nBatch] matrix, resulting in a [nHidden; nHidden] matrix.
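
To make the shape arithmetic explicit, here is a sketch of the shapes involved (following the explanation above):

    hiddenWeights .* input.T           : [nHidden; nBatch]
    hiddenBias                         : [nHidden]
    hiddenBias broadcast along dim 0   : [nHidden; nHidden]   <- cannot be added element-wise to [nHidden; nBatch]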

You should be able to solve the problem by manually specifying the dimension along which you want to broadcast the vector. Just replace

let hiddenAct = hiddenWeights .* input.T + hiddenBias

with

let hiddenAct = hiddenWeights .* input.T + (Expr.reshape [nHidden; SizeSpec.broadcastable] hiddenBias)

This way a broadcasting dimension is added as the second dimension and the bias vector should now be broadcast to a matrix of the required shape.
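
For reference, the resulting shapes after the change (a sketch, per the explanation above):

    Expr.reshape [nHidden; SizeSpec.broadcastable] hiddenBias : [nHidden; 1]  (second dimension broadcastable)
    added to hiddenWeights .* input.T                         : [nHidden; nBatch]  (as required)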

Let me know if this worked or if you have further questions.

kuroyakov commented 7 years ago

Thanks so much for your instructions. My code now works fine!

Unfortunately, when I proceeded to the training section, the test loss stayed at the same value on every iteration. I revised the last few lines of my code as follows:

    // Training
    let opt = Optimizers.GradientDescent (loss, mi.ParameterVector, DevHost)
    let optFn = mi.Func opt.Minimize |> opt.Use |> arg2 input target

    // Set learning rate
    let optCfg = {Optimizers.GradientDescent.Cfg.Step=1e-1f}
    for itr = 0 to 1000 do
        let t = optFn mnist.Trn.All.Input mnist.Trn.All.Target optCfg  
        if itr % 50 = 0 then
            let l = lossFn mnist.Tst.All.Input mnist.Tst.All.Target |> Tensor.value
            printfn "Test loss after %5d iterations: %.4f" itr l

I got the result:

Test loss after     0 iterations: 2.3019
Test loss after    50 iterations: 2.3019
...
Test loss after   950 iterations: 2.3019
Test loss after  1000 iterations: 2.3019
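
For comparison, a minimal check I could add (reusing lossFn from above) would be to also print the training loss, to see whether the parameters are being updated at all:

        if itr % 50 = 0 then
            // Evaluate the loss on the training set with the same compiled function
            let trnLoss = lossFn mnist.Trn.All.Input mnist.Trn.All.Target |> Tensor.value
            printfn "Training loss after %5d iterations: %.4f" itr trnLoss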

Please let me know what's wrong with my code.


BTW, do I need to create another issue for this further question?

surban commented 7 years ago

Hey kuroyakov,

There has been a lot of work going on in the project, and unfortunately the docs and code examples are a bit behind right now. I will look over them in the next few days and get back to you as soon as possible.

Sebastian

kuroyakov commented 7 years ago

Hi surban.

Thank you for your response. I understand, and I appreciate the work you're doing. I'm looking forward to your update.