microsoft / CNTK

Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
https://docs.microsoft.com/cognitive-toolkit/

Matrix multiply sparse OneHot vector operation is not supported yet #3488

Open shilonosov opened 6 years ago

shilonosov commented 6 years ago

Hello,

I am trying to implement GloVe on CNTK in C#, and I get an error when a matrix is multiplied by a one-hot vector. Could you please take a look and advise?

Full source code can be found here: https://github.com/shilonosov/cntk.onehot.spike/blob/master/cntk.onehot.spike/Program.cs

I am building the model like this:

        var weight = CNTKLib.ElementMin(one, CNTKLib.Pow(CNTKLib.ElementDivide(coOccurrences, xmax), alpha), "min");

        var oneHotRow = CNTKLib.OneHotOp(rows, vocabularySize, true, new Axis(0));
        var oneHotColumn = CNTKLib.OneHotOp(columns, vocabularySize, true, new Axis(0));

        var mainVector = CNTKLib.Times(mainVectors, oneHotColumn);
        var contextVector = CNTKLib.Times(contextVectors, oneHotRow);
        var mainBias = CNTKLib.Times(mainBiases, oneHotColumn);
        var contextBias = CNTKLib.Times(contextBiases, oneHotRow);

        var model = CNTKLib.ReduceSum(CNTKLib.TransposeTimes(mainVector, contextVector), new Axis(0));
        model = CNTKLib.Plus(model, mainBias);
        model = CNTKLib.Plus(model, contextBias);
        model = CNTKLib.Minus(model, CNTKLib.Log(coOccurrences));
        model = CNTKLib.Square(model);
        model = CNTKLib.ElementTimes(model, weight);
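
For reference, the graph above appears to follow the standard GloVe weighted least-squares objective. This is a sketch under assumptions: the values of `xmax` and `alpha` are not shown in the post, and the symbols below are introduced only for illustration ($X_{rc}$ for `coOccurrences`, $w_c$ and $b_c$ for the column of `mainVectors` and the entry of `mainBiases` selected by `oneHotColumn`, and $\tilde{w}_r$ and $\tilde{b}_r$ for the context counterparts selected by `oneHotRow`). Under those assumptions, the loss for one (row, column) pair would be:

```latex
J_{rc} = f(X_{rc}) \left( w_c^{\top} \tilde{w}_r + b_c + \tilde{b}_r - \log X_{rc} \right)^{2},
\qquad
f(x) = \min\!\left( 1, \left( \frac{x}{x_{\max}} \right)^{\alpha} \right)
```

Here $f$ corresponds to `weight`, the dot product to `ReduceSum(TransposeTimes(mainVector, contextVector))`, and the squared term to the `Plus`, `Minus` and `Square` calls.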

The error occurs when I try to train the model:

        var thisBatchShape = NDShape.CreateNDShape(new[] {1});

        var cooccurenceValue = Value.CreateBatch(thisBatchShape, new float[] {0.561f}, device);
        var columnsValue = Value.CreateBatch(thisBatchShape, new float[] {1f}, device);
        var rowsValue = Value.CreateBatch(thisBatchShape, new float[] {2f}, device);

        var trainDictionary = new Dictionary<Variable, Value>
        {
            {coOccurrences, cooccurenceValue},
            {columns, columnsValue},
            {rows, rowsValue}
        };

        var parameterVector = new ParameterVector(model.Parameters().ToList());

        var learner = CNTKLib.AdamLearner(
            parameterVector,
            new TrainingParameterScheduleDouble(0.1, (uint) (vocabularySize * vocabularySize)),
            new TrainingParameterScheduleDouble(0.9, (uint) (vocabularySize * vocabularySize)),
            false);

        var learners = new LearnerVector() {learner};
        var trainer = CNTKLib.CreateTrainer(model, model, model, learners);

        trainer.TrainMinibatch(trainDictionary, false, device);

The error is:

    About to throw exception 'AsMatrix: Sparse tensors are not supported unless they are 1D or 2D matrices.'

    at CNTK.Trainer._TrainMinibatch(UnorderedMapVariableValuePtr arguments, Boolean isSweepEndInArguments, DeviceDescriptor computeDevice)
    at CNTK.Trainer.TrainMinibatch(IDictionary`2 arguments, Boolean isSweepEndInarguments, DeviceDescriptor computeDevice)
    at cntk.onehot.spike.Program.Main(String[] args) in c:\users\hellothere\source\repos\cntk.onehot.spike\cntk.onehot.spike\Program.cs:line 82

I suppose this is related to the OneHotOp call, because the error goes away when I add a Reshape to oneHotRow and oneHotColumn. I don't think that is the right way to do it, though, because the model runs really slowly in that case.

For example, a naive CPU implementation takes ~16 seconds per iteration on 28000 words, while the model above takes a few minutes for a single epoch on 6000 words.

Any ideas?

shilonosov commented 6 years ago

Right, so I found an issue with my model and fixed it. It now seems that the real problem is that some operations on sparse vectors are not supported yet.

Here is the message I am getting:

System.ApplicationException: 'Inside File: c:\agent\_work\26\s\source\math\matrix.cpp  Line: 5253  
Function: Microsoft::MSR::CNTK::Matrix<float>::MultiplyAndWeightedAdd  -> Feature Not Implemented.

[CALL STACK]
    > Microsoft::MSR::CNTK::Matrix<double>::BatchNormalizationForward<double>  
    - Microsoft::MSR::CNTK::Matrix<float>::  MultiplyAndWeightedAdd
    - Microsoft::MSR::CNTK::TensorView<float>::  DoMatrixProductOf
    - Microsoft::MSR::CNTK::TensorView<float>::  AssignMatrixProductOf
    - std::enable_shared_from_this<Microsoft::MSR::CNTK::MatrixBase>::  shared_from_this (x3)
    - CNTK::Internal::  UseSparseGradientAggregationInDataParallelSGD
    - CNTK::  CreateTrainer
    - CNTK::Trainer::  TotalNumberOfUnitsSeen
    - CNTK::Trainer::  TrainMinibatch (x2)
    - CSharp_CNTK_Trainer__TrainMinibatch__SWIG_2
    - 00007FFDFAFAEA15 (SymFromAddr() error: The specified module could not be found.)

Is there any information on when operations on sparse vectors will be better supported?

shilonosov commented 5 years ago

In other words: how can I select a column from a matrix and train on that particular column only? I thought multiplying by a one-hot vector would be enough, but it seems to be slow.

Actually, running it on the CPU takes less time than on the GPU.
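
The equivalence the question relies on can be checked in plain C# without CNTK. The sketch below is only an illustration of the arithmetic: the class and method names (`OneHotColumnDemo`, `TimesOneHot`, `SelectColumn`) are hypothetical, and it makes no claim about how CNTK implements `Times` or `OneHotOp` internally. It shows that a dense product with a one-hot vector returns the same values as reading the column directly, while touching every element of the matrix instead of a single column.

```csharp
using System;

public static class OneHotColumnDemo
{
    // Dense product W * e, where e is one-hot at hotIndex.
    // Visits all rows * cols entries even though only one column contributes.
    public static float[] TimesOneHot(float[,] w, int hotIndex)
    {
        int rows = w.GetLength(0);
        int cols = w.GetLength(1);

        var oneHot = new float[cols];
        oneHot[hotIndex] = 1f;

        var result = new float[rows];
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                result[r] += w[r, c] * oneHot[c];
            }
        }
        return result;
    }

    // Direct lookup of the same column: visits only `rows` entries.
    public static float[] SelectColumn(float[,] w, int index)
    {
        int rows = w.GetLength(0);
        var result = new float[rows];
        for (int r = 0; r < rows; r++)
        {
            result[r] = w[r, index];
        }
        return result;
    }

    public static void Main()
    {
        // Toy 2 x 3 matrix: each column plays the role of one word vector.
        var w = new float[,] { { 1f, 2f, 3f }, { 4f, 5f, 6f } };

        float[] viaProduct = TimesOneHot(w, 1);
        float[] viaLookup = SelectColumn(w, 1);

        Console.WriteLine(string.Join(", ", viaProduct)); // prints "2, 5"
        Console.WriteLine(string.Join(", ", viaLookup));  // prints "2, 5"
    }
}
```

If the one-hot input ends up being treated as dense, the cost grows with the full vocabulary size for every sample, which would be one possible explanation for the slowdown described above; the thread does not confirm this.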