Open shilonosov opened 6 years ago
Right, so I've found an issue with my model and fixed it, and it seems the underlying problem is that some operations on sparse vectors are not supported yet.
Here is the message I am getting:
```
System.ApplicationException: 'Inside File: c:\agent\_work\26\s\source\math\matrix.cpp Line: 5253
Function: Microsoft::MSR::CNTK::Matrix<float>::MultiplyAndWeightedAdd -> Feature Not Implemented.

[CALL STACK]
> Microsoft::MSR::CNTK::Matrix<double>::BatchNormalizationForward<double>
- Microsoft::MSR::CNTK::Matrix<float>::MultiplyAndWeightedAdd
- Microsoft::MSR::CNTK::TensorView<float>::DoMatrixProductOf
- Microsoft::MSR::CNTK::TensorView<float>::AssignMatrixProductOf
- std::enable_shared_from_this<Microsoft::MSR::CNTK::MatrixBase>::shared_from_this (x3)
- CNTK::Internal::UseSparseGradientAggregationInDataParallelSGD
- CNTK::CreateTrainer
- CNTK::Trainer::TotalNumberOfUnitsSeen
- CNTK::Trainer::TrainMinibatch (x2)
- CSharp_CNTK_Trainer__TrainMinibatch__SWIG_2
- 00007FFDFAFAEA15 (SymFromAddr() error: The specified module could not be found.)
```
Is there any information on when sparse vector operations will be supported more fully?
In other words: how can I select a single column from a matrix and perform learning on that particular column only? I thought multiplying by a one-hot vector would be enough, but it turns out to be slow.
In fact, running it on the CPU takes less time than on the GPU.
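To clarify why the one-hot multiply is slow: mathematically, multiplying a matrix by a one-hot vector just selects one column, but if the one-hot vector is materialized densely, the multiply still does work proportional to the whole vocabulary. A minimal numpy sketch (not CNTK code; the names and sizes below are made up for illustration):

```python
import numpy as np

V, d = 6000, 50            # hypothetical vocabulary size and embedding dimension
W = np.random.rand(d, V)   # embedding matrix, one column per word

i = 42                     # index of the word we want to look up
one_hot = np.zeros(V)
one_hot[i] = 1.0

# Dense multiply touches every column of W: O(V * d) work.
col_via_matmul = W @ one_hot

# Direct column indexing reads just one column: O(d) work.
col_via_index = W[:, i]

# Both give the same result; only the cost differs.
assert np.allclose(col_via_matmul, col_via_index)
```

This is why a true sparse one-hot (or a direct gather/embedding lookup) can be orders of magnitude faster than the dense-multiply fallback, especially when the vocabulary is large.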
Hello,
I am trying to implement GloVe on CNTK in C#, and I am hitting an error when a matrix is multiplied by a one-hot vector. Could you please take a look and advise?
Full source code can be found here: https://github.com/shilonosov/cntk.onehot.spike/blob/master/cntk.onehot.spike/Program.cs
I am building the model like this:
The error occurs when I try to train the model:
The error is:
I suppose this is somehow related to the OneHotOp call, because when I add a Reshape to oneHotRow and oneHotColumn the error goes away. I don't think that is the right way to do it, though, because the model runs really slowly in that case.
For example, a naive CPU implementation takes ~16 seconds per iteration on 28,000 words, while the model above takes a few minutes per epoch on only 6,000 words.
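For context, the reason a naive implementation can be fast is that each GloVe update for a co-occurrence pair (i, j) only touches two embedding rows and two biases, never the full matrix. A minimal numpy sketch of one such SGD step (this is an illustration under my own assumptions, not the code from the repository; all names and hyperparameters are made up):

```python
import numpy as np

# GloVe loss for one co-occurrence pair (i, j):
#   f(X_ij) * (w_i . w~_j + b_i + b~_j - log X_ij)^2
V, d = 100, 10
rng = np.random.default_rng(0)
W  = rng.normal(scale=0.1, size=(V, d))   # word vectors
Wc = rng.normal(scale=0.1, size=(V, d))   # context vectors
b, bc = np.zeros(V), np.zeros(V)          # word and context biases

def glove_step(i, j, x_ij, lr=0.05, x_max=100.0, alpha=0.75):
    """One SGD step on a single pair; only row i of W and row j of Wc change."""
    f = (x_ij / x_max) ** alpha if x_ij < x_max else 1.0
    err = W[i] @ Wc[j] + b[i] + bc[j] - np.log(x_ij)
    g = f * err
    gi, gj = g * Wc[j], g * W[i]          # save gradients before updating
    W[i]  -= lr * gi
    Wc[j] -= lr * gj
    b[i]  -= lr * g
    bc[j] -= lr * g

def pair_loss(i, j, x_ij):
    return (W[i] @ Wc[j] + b[i] + bc[j] - np.log(x_ij)) ** 2

# Repeated steps on one pair should drive its squared error down.
before = pair_loss(0, 1, 5.0)
for _ in range(200):
    glove_step(0, 1, 5.0)
after = pair_loss(0, 1, 5.0)
```

Because each update is O(d), the whole epoch scales with the number of nonzero co-occurrences rather than with vocabulary size, which matches the CPU timing above; a dense one-hot matmul per pair would instead cost O(V·d).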
Any ideas?