microsoft / CNTK

Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
https://docs.microsoft.com/cognitive-toolkit/

Can SparseInput be given to a convolution layer? #2389

Open rsharnag opened 7 years ago

rsharnag commented 7 years ago

I am facing the following issue while running a convolutional network with a sparse input layer.

    File: Matrix.cpp  Line: 4659  Function: Microsoft::MSR::CNTK::Matrix::UnrollConvolutionInput -> Feature Not Implemented.
    Traceback (most recent call last):
      File "D:/git_repos/RnR-Intl-AU/Projects/CDSSM/CNTK/cdssm_cntk.py", line 318, in <module>
        dsm_net(args)
      File "D:/git_repos/RnR-Intl-AU/Projects/CDSSM/CNTK/cdssm_cntk.py", line 223, in dsm_net
        callbacks=callbacks)
      File "C:\Anaconda3\envs\cntk-py36\lib\site-packages\cntk\ops\functions.py", line 1377, in train
        ts.train()
      File "C:\Anaconda3\envs\cntk-py36\lib\site-packages\cntk\internal\swig_helper.py", line 69, in wrapper
        result = f(*args, **kwds)
      File "C:\Anaconda3\envs\cntk-py36\lib\site-packages\cntk\train\training_session.py", line 276, in train
        super(TrainingSession, self).train(device)
      File "C:\Anaconda3\envs\cntk-py36\lib\site-packages\cntk\cntk_py.py", line 3312, in train
        return _cntk_py.TrainingSession_train(self, computeDevice)
    RuntimeError: Inside File: Matrix.cpp  Line: 4659  Function: Microsoft::MSR::CNTK::Matrix::UnrollConvolutionInput -> Feature Not Implemented.

[CALL STACK]

Microsoft::MSR::CNTK::Matrix:: UnrollConvolutionInput

  • Microsoft::MSR::CNTK::ConvolutionEngine:: Forward
  • Microsoft::MSR::CNTK::DataTransferer:: operator= (x2)
  • CNTK::Internal:: UseSparseGradientAggregationInDataParallelSGD
  • Microsoft::MSR::CNTK::DataTransferer:: operator=
  • CNTK::Internal:: UseSparseGradientAggregationInDataParallelSGD
  • CNTK::Function:: Forward
  • CNTK:: CreateTrainer
  • CNTK::Trainer:: TotalNumberOfSamplesSeen
  • CNTK::Trainer:: TrainMinibatch
  • CNTK::TrainingSession:: Train
  • PyInit__cntk_py
  • PyCFunction_FastCallDict
  • PyCFunction_FastCallKeywords
  • PyEval_GetFuncDesc
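For context, the failing function `UnrollConvolutionInput` appears to perform the "im2col" step of convolution: each kernel-sized window of the input is unrolled into a row of a matrix, so the convolution reduces to a matrix multiply. A minimal dense-only NumPy sketch (the function name and shapes here are illustrative, not CNTK's actual implementation) shows the operation that has no sparse counterpart:

```python
import numpy as np

def unroll_1d(x, kernel, stride):
    """Dense im2col for a 1-D convolution: gather each kernel-sized
    window of x into a row, so conv(x, w) becomes unroll_1d(x) @ w."""
    n_windows = (len(x) - kernel) // stride + 1
    return np.stack([x[i * stride : i * stride + kernel]
                     for i in range(n_windows)])

x = np.arange(8, dtype=np.float32)
cols = unroll_1d(x, kernel=3, stride=2)
# cols has one row per window: [0,1,2], [2,3,4], [4,5,6]
```

Unrolling a sparse matrix this way would require sparse gather/stack support in the matrix backend, which is the feature gap the error message points at.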

Network definition:

    inputQ = input_variable(shape=(1, input_dim * vocab_size), name="query", is_sparse=True)
    # convolution layer with relu
    conv_layer = Convolution1D(
        num_filters=out_feature_map_count,
        filter_shape=kernelh * kernelw,
        strides=kernelw,
        name="Conv_layer",
        init=glorot_uniform(),
        activation=relu, bias=True)(inputQ)
    # max pooling layer
    layer = MaxPooling(
        filter_shape=(1, stride_size), name="Max_pool_layer")(conv_layer)
    layer = reshape(layer, (out_feature_map_count))
    layer = fully_connected_layer(layer.shape[-1], hidden_layer_dims, relu)(layer)
    layer = l2_normalize(layer, -1)
sayanpa commented 7 years ago

This is a known feature gap, largely due to the current lack of sparse support in cuDNN. The workaround is to convert your input to dense.
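A minimal sketch of that workaround, assuming the input is a sparse one-hot encoding as in the snippet above (the `scipy.sparse` usage and the small `vocab_size`/`input_dim` values are illustrative, not from the issue):

```python
import numpy as np
from scipy.sparse import csr_matrix

vocab_size = 5   # assumption: tiny vocabulary for the demo
input_dim = 3    # assumption: three token positions per query

# Sparse one-hot encoding of a query with token ids [1, 4, 2]:
# one row per position, a single 1.0 in the column of that token id.
token_ids = [1, 4, 2]
rows = np.arange(input_dim)
data = np.ones(input_dim, dtype=np.float32)
sparse_query = csr_matrix((data, (rows, token_ids)),
                          shape=(input_dim, vocab_size))

# Workaround: densify before the data reaches the convolution, then
# flatten to the (1, input_dim * vocab_size) shape the network expects,
# and declare the input_variable with is_sparse=False.
dense_query = sparse_query.toarray().reshape(1, input_dim * vocab_size)
```

The dense array can then be fed to a network whose input is declared without `is_sparse=True`; the trade-off is memory proportional to `input_dim * vocab_size` per sample.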