SciSharp / SiaNet

An easy to use C# deep learning library with CUDA/OpenCL support
https://scisharp.github.io/SiaNet
MIT License

CrossEntropy issue on LSTMs #25

Closed nikosdim1 closed 5 years ago

nikosdim1 commented 6 years ago

I think LSTM does not work for classification problems in the current version, only for regression problems. I tried to convert the sample code to do multi-class classification, but it raised an error mentioning "Operation 'TransposeTimes'" at:

    private static Function CrossEntropy(Variable labels, Variable predictions)
    {
        return CNTKLib.CrossEntropyWithSoftmax(predictions, labels);
    }

deepakkumar1984 commented 6 years ago

I will check. It is probably an issue with CNTK, so I need to ask the team.

nikosdim1 commented 6 years ago

Hi Deepak, any news for this issue?

deepakkumar1984 commented 6 years ago

Not yet, mate. Even the CNTK team is quiet; there is not much response from the core team.

On Thu, 18 Jan 2018 at 6:51 pm, Nick Voutouras notifications@github.com wrote:

Hi Deepak, any news for this issue?


-- Regards, Deepak

deepakkumar1984 commented 6 years ago

Can you post the sample code for me to check?

nikosdim1 commented 6 years ago

    Dim Rnd As New Random
    For cc As Integer = 0 To 99
        Dim sngLst(99) As Single
        For indx As Integer = 0 To 99
            sngLst(indx) = CSng(Rnd.NextDouble()) ' NextDouble returns Double; convert to Single explicitly
        Next
        trnX_fin.Add(sngLst)
    Next

    For cc As Integer = 0 To 99
        Dim sngLst(2) As Single
        ' fake one hot just for check
        sngLst(0) = 0 : sngLst(1) = 1 : sngLst(2) = 0
        trnY_fin.Add(sngLst)
    Next

    'add XYframe
    Dim XYfrm As New XYFrame
    XYfrm.XFrame = trnX_fin
    XYfrm.YFrame = trnY_fin
    ' Split
    TrainTestfrm = XYfrm.SplitTrainTest(0.3)

    ' init some values
    shape_of_input = XYfrm.XFrame.Shape(1)
    Dim embval As Integer = 100
    Dim seed As Integer = 2

    model = New Sequential()

    model.Add(New Reshape(targetshape:=Shape.Create(1, embval), shape:=Shape.Create(shape_of_input)))

    model.Add(New LSTM([dim]:=64, returnSequence:=False, weightInitializer:=New Model.Initializers.GlorotUniform(0.05, seed), biasInitializer:=New Model.Initializers.GlorotUniform(0.05, seed), recurrentInitializer:=New Model.Initializers.GlorotUniform(0.05, seed)))
    model.Add(New Dense([dim]:=3, act:="sigmoid", useBias:=True, weightInitializer:=New Model.Initializers.GlorotUniform(0.05, seed)))

    AddHandler model.OnEpochEnd, AddressOf Model_OnEpochEnd
    AddHandler model.OnTrainingEnd, AddressOf Model_OnTrainingEnd

    model.Compile(OptOptimizers.Adam, OptLosses.SparseCrossEntropy, OptMetrics.Accuracy)
    model.Train(TrainTestfrm.Train, 200, 8, TrainTestfrm.Test)

nikosdim1 commented 6 years ago

The error occurs in Train, at the SparseCrossEntropy method.

If I use "softmax" on the last Dense layer, with MeanSquaredError as the loss function and OptMetrics.Mse as the metric, it seems to work, but the results do not seem to be OK.

Something seems to go wrong with the difference in axes between features and labels in SparseCrossEntropy. The same error occurs with CrossEntropy too.
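For reference, here is a minimal, library-independent sketch in Python of what a softmax cross-entropy loss computes for a single sample, next to the mean squared error used as a workaround. It is not CNTK's or SiaNet's implementation; the function names are hypothetical and chosen only for illustration. The explicit length check mirrors the requirement that predictions and labels agree in shape, which is the kind of mismatch suspected above.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_with_softmax(logits, one_hot):
    """Hypothetical helper: softmax over the logits, then cross-entropy against a one-hot label."""
    if len(logits) != len(one_hot):
        raise ValueError("predictions and labels must have the same length")
    probs = softmax(logits)
    # Only the entry of the true class contributes for a one-hot label.
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)

def mean_squared_error(probs, one_hot):
    """Hypothetical helper: MSE between predicted probabilities and a one-hot label."""
    if len(probs) != len(one_hot):
        raise ValueError("predictions and labels must have the same length")
    return sum((p - t) ** 2 for p, t in zip(probs, one_hot)) / len(probs)

label = [0, 1, 0]            # same fake one-hot label as in the sample
logits = [0.0, 0.0, 0.0]     # an untrained model with no class preference
ce = cross_entropy_with_softmax(logits, label)
mse = mean_squared_error(softmax(logits), label)
```

With uniform logits over three classes the cross-entropy equals log(3), the loss of a uniform guess. MSE on softmax outputs can still be minimized, but it is not the loss usually paired with softmax for classification, which may be one reason the workaround gives poor results.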

deepakkumar1984 commented 6 years ago

OK, I will check with the sample you gave and update you today.

On Tue, Jan 23, 2018 at 7:47 AM, Nick Voutouras notifications@github.com wrote:

The error occurs in Train, at the SparseCrossEntropy method.

If I use "softmax" on the last Dense layer, with MeanSquaredError as the loss function and OptMetrics.Mse as the metric, it seems to work, but the results do not seem to be OK.

Something seems to go wrong with the difference in axes between features and labels in SparseCrossEntropy. The same error occurs with CrossEntropy too.


-- Regards, Deepak

nikosdim1 commented 6 years ago

Just remove the two AddHandler lines; I copied them by mistake :)

nikosdim1 commented 6 years ago

Also, if I use ReturnSequence = True it seems to work, but again that is not the expected use of an LSTM in a sequence example (I need the thought vector = last_sequence).

deepakkumar1984 commented 6 years ago

Hi Nick,

I checked your code, and the return sequence will bring the last output (thought vector) in the output sequence. Seems like this should work. Do you need anything else?
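To make the distinction under discussion concrete, here is a toy Python sketch of the common convention (used, for example, by Keras' `return_sequences` flag): a recurrent layer either returns the output at every time step or only the final one, the "thought vector". The recurrence below is a deliberately trivial stand-in, not an LSTM cell, and `run_recurrent` is a hypothetical name; whether SiaNet's `returnSequence` follows the same convention is an assumption not confirmed in this thread.

```python
def run_recurrent(inputs, return_sequence):
    """Toy recurrent layer: returns all step outputs, or only the last one."""
    if not inputs:
        raise ValueError("inputs must contain at least one time step")
    state = 0.0
    outputs = []
    for x in inputs:
        # Toy recurrence standing in for an LSTM cell update.
        state = 0.5 * state + x
        outputs.append(state)
    # Full sequence for sequence-to-sequence use, last step for classification.
    return outputs if return_sequence else outputs[-1]

all_steps = run_recurrent([1.0, 2.0, 3.0], return_sequence=True)
last_step = run_recurrent([1.0, 2.0, 3.0], return_sequence=False)
```

Under this convention, a classifier head such as the final Dense layer would consume `last_step`, which has one value per unit rather than one per time step.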


On Tue, Jan 23, 2018 at 8:29 AM, Nick Voutouras notifications@github.com wrote:

Also, if I use ReturnSequence = True it seems to work, but again that is not the expected use of an LSTM in a sequence example (I need the thought vector = last_sequence).


-- Regards, Deepak

deepakkumar1984 commented 5 years ago

Closing this, as there has been no activity in the last 180 days.