rhoens opened this issue 7 years ago
I am also getting this error with CNTK Library Managed API from C#. I am creating batches and running evaluation on them (using GPU device), and depending on batch size I get this error randomly at some point during evaluation.
Which CNTK version are you using? Would it be possible to share a repro for further investigation?
Model was trained using CNTK-2-0-beta8-0-Windows-64bit-GPU-1bit-SGD and evaluation is done using the latest NuGet package for the CNTK Library Managed API. Sure, I've attached the related piece of code.
Here is the full exception stack:
Microsoft::MSR::CNTK::GPUMatrix<ElemType>::Resize: Cannot resize the matrix because it is a view.
at CNTK.Function.Evaluate(IDictionary`2 arguments, Dictionary`2 outputs, DeviceDescriptor computeDevice)
ReproCode.txt
@markorakita @rhoens We have not found any issue in your code. We suspect it could be a bug in CNTK. Would it be possible to share a repro for further investigation? Thanks.
@zhouwangzw I've sent you an email containing repro code + trained model + dataset. I've narrowed down what causes the exception in my case, I am calling evaluate with 16 items of size 32x32x3, but sometimes when I am at the end of dataset I call it with for example 3 items in a batch, and that causes exception to appear. Seems like bug in CNTK.
The two options are:
1) Something is maintaining a reference it shouldn't/wasn't expected to.
2) Someone is using Resize instead of RequireSize.
This might be a weird interaction with the python API + Math back end.
On Tue, Feb 7, 2017 at 9:43 AM, markorakita notifications@github.com wrote:
-- T. Ryan
To those who are still seeing this: are you always sending the same minibatch size to evaluate? We found that ours works again if all the minibatch sizes are the same (1 is what we set it to in our case).
I really have to ask, what's the point of testing batched evaluation with batch size of 1? :)
As I said in my previous post: "I am calling evaluate with 16 items of size 32x32x3, but sometimes when I am at the end of dataset I call it with for example 3 items in a batch, and that causes exception to appear". In other words, evaluation throws an exception when you call it with a batch size of 16 and right after that with a batch size of 3.
I am getting the same error. I cannot always use the same batch size, and using a size of only one does not really make sense. Do we all agree this is a bug? Is it going to be fixed?
I have the same problem. It occurs when you are processing batches of a certain size and then change the size to accommodate the remaining images at the end.
In my little test dataset I have 13 images (a prime number). Referring to the CNTK C# example CNTKLibraryCSEvalGPUExamples and its EvaluationBatchOfImages processing: if I load up all 13 images at my equivalent of the seqData.AddRange(resizedCHW) line, the modelFunc.Evaluate(inputDataMap, outputDataMap, device) line works fine. However, if I load up 5 images and evaluate, another 5 and evaluate (fine so far), and then the last 3 images, it generates "RuntimeError: Resize: Cannot resize the matrix because it is a view." at the evaluate. Similarly for 6x2 images and then the final image.
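The failure pattern above (fixed-size batches with a smaller remainder at the end) falls out of any chunking loop whose batch size does not divide the dataset length. A plain-Python sketch of the batching arithmetic, independent of CNTK:

```python
def batches(items, batch_size):
    """Yield consecutive chunks of `items`; the last chunk may be smaller."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

images = list(range(13))  # 13 images, as in the example above
print([len(b) for b in batches(images, 5)])  # [5, 5, 3]: the 3-item tail batch triggers the bug
print([len(b) for b in batches(images, 6)])  # [6, 6, 1]: the "6x2 images and then the final image" case
```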
Thanks for reporting the issue. We are currently investigating it.
My workaround is to pad out the batch with dummy images. E.g. if previous runs loaded 100 images at a time and you only have 25 left in the final run, I pad the batch out to 100 images by re-using the last image and then ignore the outputs of the 75 dummies. It's a waste of computing resources, but it works fine.
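That padding workaround is easy to express generically. A minimal plain-Python sketch (no CNTK involved; `pad_batch` and the names below are illustrative), assuming the outputs come back in the same order as the inputs:

```python
def pad_batch(batch, target_size):
    """Pad a short batch to target_size by repeating its last item.

    Returns the padded batch plus the count of real (non-dummy) items,
    so the caller can slice the outputs back down afterwards."""
    real = len(batch)
    return batch + [batch[-1]] * (target_size - real), real

final_batch = ["img_101", "img_102", "img_103"]      # only 3 images left
padded, real = pad_batch(final_batch, 5)             # always evaluate 5 at a time
outputs = ["score_for_" + name for name in padded]   # stand-in for the Evaluate() call
outputs = outputs[:real]                             # discard the dummy results
```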
Ditto. My test data are thousands of speech sequences, and I use the HTK feature reader in the Python API. Because I need the model output for each sentence, it would be convenient to send in a precise number of frames per minibatch to take care of variable-length sentences. I got the same exception when I changed mbsize in reader.next_minibatch().
The bug is fixed in 2.0RC2.
I've encountered the same error on {2.0rc2, gpu, lstm, adam}. It is also intermittent in my case.
mb = reader.next_minibatch(minibatch_size * avg_seq_len, input_map=input_map)
while len(mb) > 0:
    trainer.train_minibatch(mb)
    mb = reader.next_minibatch(minibatch_size * avg_seq_len, input_map=input_map)
Hello everyone, it seems that when we use a variable-length minibatch size, it should be cast to an integer, for example reader.next_minibatch(int(minibatch_size * avg_seq_len), input_map=input_map). You can try it. On 06/10/2017 22:40, Aayush Garg wrote: I have encountered this error as well on the 2.0 release.
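The cast matters because the product above is a float whenever avg_seq_len is, while the reader expects an integer sample count. A quick plain-Python check (the values are made up for illustration):

```python
minibatch_size = 64
avg_seq_len = 17.5           # an average sequence length is rarely a whole number
requested = minibatch_size * avg_seq_len
print(type(requested))       # <class 'float'>: not a valid sample count as-is
requested = int(minibatch_size * avg_seq_len)
print(requested)             # 1120
```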
It seems that only the bug causing the "resize" error during Evaluation has been fixed since 2.0RC2, and there is another bug causing the same intermittent error when using next_minibatch. We are investigating it.
The bug causing the "resize" error when using get_next_minibatch() has now been fixed in master and will be included in the next binary release. It affects only Python.
I am closing the issue; feel free to reopen it if needed.
Unfortunately I am having this problem again. I use the most recent version of the GPU package from nuget.
I configure the minibatchSource to only give one epoch of the data:
config.SetMaxSweeps(1);
Then, when I try to get minibatch number X with
minibatchSource.GetNextMinibatch(128, device);
I get the following exception:
Microsoft::MSR::CNTK::GPUSparseMatrix<ElemType>::Resize: Cannot resize the matrix because it is a view.
[CALL STACK]
> Microsoft::MSR::CNTK::GPUSparseMatrix<float>:: Resize
- Microsoft::MSR::CNTK::GPUSparseMatrix<float>:: RequireSizeAndAllocate
- Microsoft::MSR::CNTK::GPUSparseMatrix<float>:: SetMatrixFromCSCFormat
- Microsoft::MSR::CNTK::Matrix<float>:: SetMatrixFromCSCFormat
- Microsoft::MSR::CNTK::DataTransferer:: operator=
- Microsoft::MSR::CNTK::Matrix<float>:: __autoclassinit2
- Microsoft::MSR::CNTK::DataTransferer:: operator= (x4)
- Microsoft::MSR::CNTK::IDataReader:: operator= (x2)
- Concurrency::details::_ContextCallback:: _CallInContext
- RtlSetThreadWorkOnBehalfTicket (x2)
- BaseThreadInitThunk
I suppose minibatch X is the last minibatch I would get from the minibatch source. Sometimes the exception is not thrown (I could retrieve all minibatches, including the last one with fewer samples), but I could not figure out why.
No exception is ever thrown when the minibatch size is 1. It does not matter whether I choose to use a CPU device instead.
Hello, I'm also seeing this problem intermittently. I'm using the latest version and training through Python. Just for comparison, I get the following error message on a get_next_minibatch call:
RuntimeError: Microsoft::MSR::CNTK::GPUMatrix<ElemType>::Resize: Cannot resize the matrix because it is a view.
[CALL STACK]
> Microsoft::MSR::CNTK::GPUMatrix<float>:: Resize
- Microsoft::MSR::CNTK::GPUMatrix<float>:: SetValue
- Microsoft::MSR::CNTK::Matrix<float>:: SetValue
- Microsoft::MSR::CNTK::TracingGPUMemoryAllocator:: operator=
- Microsoft::MSR::CNTK::Matrix<float>:: __autoclassinit2
- Microsoft::MSR::CNTK::TracingGPUMemoryAllocator:: operator= (x4)
- Microsoft::MSR::CNTK::IDataReader:: operator= (x2)
- Concurrency::details::_ContextCallback:: _CallInContext
- RtlReleaseSRWLockExclusive (x2)
- BaseThreadInitThunk
- RtlUserThreadStart
Hello! Same problem in C# when I'm reading sequences from a CBF file with the GetNextMinibatch method. CNTK ver. 2.4.
System.ApplicationException: Microsoft::MSR::CNTK::GPUMatrix<ElemType>::Resize: Cannot resize the matrix because it is a view.
[CALL STACK]
> Microsoft::MSR::CNTK::CudaTimer:: Stop
- Microsoft::MSR::CNTK::GPUMatrix<float>:: Resize
- Microsoft::MSR::CNTK::GPUMatrix<float>:: SetValue
- Microsoft::MSR::CNTK::Matrix<float>:: SetValue
- Microsoft::MSR::CNTK::DataTransferer:: operator=
- std::enable_shared_from_this<Microsoft::MSR::CNTK::MatrixBase>::enable_shared_from_this<Microsoft::MSR::CNTK::MatrixBase> (x2)
- Microsoft::MSR::CNTK::DataTransferer:: operator=
- std::enable_shared_from_this<Microsoft::MSR::CNTK::MatrixBase>::enable_shared_from_this<Microsoft::MSR::CNTK::MatrixBase>
- Microsoft::MSR::CNTK::IDataReader:: operator= (x2)
- Concurrency::details::_ContextCallback:: _CallInContext
- RtlSetThreadWorkOnBehalfTicket (x2)
- BaseThreadInitThunk
- RtlUserThreadStart
I have investigated this problem. The exception arises on GetNextMinibatch after reading data from the input/output dictionaries (with the deprecated methods too). I applied the Erase function to the input/output data after reading, and this fixed the problem. But I still think that it is a bug. In my opinion, the problem is somewhere in data allocation.
public void Test(string testDataPath, string modelPath, UInt32 minibatchSize)
{
    var reader = CreateMiniBatchSource(testDataPath, isTraining: false);
    Function model = Function.Load(modelPath, _device);
    Variable input = model.Arguments[0];
    Variable output = model.Outputs[1];
    StreamInformation inputInfo = reader.StreamInfo("features");
    StreamInformation outputInfo = reader.StreamInfo("labels");

    for (int i = 0; i < 500; ++i)
    {
        var data = reader.GetNextMinibatch(minibatchSize, _device);
        if (data == null || data.empty())
            break;

        var inputData = new Dictionary<Variable, Value>
        {
            { input, data[inputInfo].data },
        };
        var outputData = new Dictionary<Variable, Value>
        {
            { output, null }
        };
        model.Evaluate(inputData, outputData, _device);

        var predicted = outputData[output].GetDenseData<float>(output);
        var expected = data[outputInfo].data.GetDenseData<float>(output);

        // Without this, 'System.ApplicationException: Microsoft::MSR::CNTK::GPUMatrix<ElemType>::Resize' will arise.
        outputData[output].Erase();
        data[outputInfo].data.Erase();

        var joinedResults = predicted
            .Zip(
                expected,
                (f, s) => String.Join(";", "(" + String.Join(" ", f) + ")", "(" + String.Join(" ", s) + ")")
            );
        Console.WriteLine($"Iter {i} results:");
        Console.WriteLine(String.Join(Environment.NewLine, joinedResults));
    }
}
We are also intermittently seeing this on CNTK GPU 2.3 when calling Function.Evaluate via the C# API:
System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.ApplicationException: Microsoft::MSR::CNTK::GPUMatrix<ElemType>::Resize: Cannot resize the matrix because it is a view.
[CALL STACK]
> Microsoft::MSR::CNTK::GPUMatrix<float>:: Resize
- Microsoft::MSR::CNTK::Matrix<float>:: Resize
- Microsoft::MSR::CNTK::TracingGPUMemoryAllocator:: operator= (x4)
- CNTK::Internal:: UseSparseGradientAggregationInDataParallelSGD
- Microsoft::MSR::CNTK::TracingGPUMemoryAllocator:: operator=
- CNTK::Internal:: UseSparseGradientAggregationInDataParallelSGD
- CNTK::Function:: Forward
- CNTK::Function:: Evaluate
- CSharp_CNTK_Function__Evaluate__SWIG_0
- 00007FF99DF67E77 (SymFromAddr() error: The specified module could not be found.)
at CNTK.Function._Evaluate(UnorderedMapVariableValuePtr arguments, UnorderedMapVariableValuePtr outputs, DeviceDescriptor computeDevice)
at CNTK.Function.Evaluate(IDictionary`2 inputs, IDictionary`2 outputs, Boolean createPersistentOutputValues, DeviceDescriptor computeDevice)
Will try the workaround suggested by @elevir.
I am getting this error consistently in CNTK 2.5.1 using the managed C# API.
System.ApplicationException
HResult=0x80131600
Message=Resize: Cannot resize the matrix because it is a view.
[CALL STACK]
> Microsoft::MSR::CNTK::CPUMatrix<double>:: _rcrfTransGrdCompute
- Microsoft::MSR::CNTK::CPUMatrix<float>:: Resize
- Microsoft::MSR::CNTK::CPUMatrix<float>:: SetValue
- Microsoft::MSR::CNTK::Matrix<float>:: SetValue
- std::enable_shared_from_this<Microsoft::MSR::CNTK::MatrixBase>:: operator=
- std::enable_shared_from_this<Microsoft::MSR::CNTK::MatrixBase>::enable_shared_from_this<Microsoft::MSR::CNTK::MatrixBase> (x2)
- std::enable_shared_from_this<Microsoft::MSR::CNTK::MatrixBase>:: operator=
- std::enable_shared_from_this<Microsoft::MSR::CNTK::MatrixBase>::enable_shared_from_this<Microsoft::MSR::CNTK::MatrixBase>
- Microsoft::MSR::CNTK::IDataReader:: operator= (x2)
- Concurrency::details::_ContextCallback:: _CallInContext
- RtlAcquireSRWLockExclusive
- RtlReleaseSRWLockExclusive
- BaseThreadInitThunk
- RtlUserThreadStart
Source=Cntk.Core.Managed-2.5.1
StackTrace:
at CNTK.MinibatchSource.GetNextMinibatch(UInt32 minibatchSizeInSamples, DeviceDescriptor device)
at ...
It seems to occur at the end of a data sweep during training, but without changing the batch size (in this example it is 32). It does not happen at the end of the first data sweep, though, but somehow at the end of every 23rd data sweep. This is pretty consistent. I can't share the data, unfortunately.
So this does not seem to have been fully resolved. Note this is running on CPU only, not GPU.
Calling .data.Erase() on the inputs during iterations over training data, like @elevir commented, seems to resolve the issue for me too. And I agree, this is still a bug.
Unfortunately, if using:

public class Evaluator : IDisposable
{
    public double TestMinibatch(UnorderedMapVariableMinibatchData arguments,
        UnorderedMapVariableValuePtr outputsToFetch, DeviceDescriptor computeDevice);
}

for, say, validation testing (i.e. after a training epoch), this same exception occurs, where the outputs to fetch are the actual outputs of the network and the loss. TestMinibatch only reports the "evaluation" value, which is not enough. And now calling .Erase() does not help, which means this seems to be impossible to do now. :|
@zhouwangzw please reopen this issue.
TL;DR: Erase()/Dispose() any Value instances returned, incl. Value instances returned from the .data property on e.g. MinibatchData.
I have isolated the following call that seems to trigger this exception:

var expectedOutputResults = targetsData.data.GetDenseData<float>(expectedOutput);

where targetsData is a minibatch loaded from a CTF file with a 3-element vector called targets and a mask in this file too. If running without this line, it works; with the line in, it fails. Not on the first run, but on the second run (i.e. the second full sweep). E.g. the CTF file has lines like:

|targets 0 -1 0 |mask 1 0 1

The exception occurs even with:

targetsData.data.Erase();

at the end of every loop.
After discovering this, it also appears outputsToFetch doesn't matter; what matters is trying to get data (via GetDenseData) after TestMinibatch is run. This fails every time.
I then inserted a DeepClone call before the GetDenseData call:

var targetsDataClone = targetsData.data.DeepClone(false);
var expectedOutputResults = targetsDataClone.GetDenseData<float>(expectedOutput);

and the exception does not occur. This got me thinking that the problem perhaps is related to .data being a SWIG-generated property that probably returns a new Value instance as a view over existing data inside. And then writing:

var targetsDataValue = targetsData.data;
var expectedOutputResults = targetsDataValue.GetDenseData<float>(expectedOutput);
targetsDataValue.Erase();
targetsDataValue.Dispose();

does not cause an exception either. I assume this is due to the Value instance being erased/disposed.
This then would make me assume that as long as any "resource" has a read-only view Value instance associated with it, it cannot resize. Why a "resize" to a size that is the same as the old size can cause an error due to an existing view, I am not sure. Nevertheless, it seems one must always ensure that Values returned are erased/disposed inside a loop.
This problem/issue exists both for CPU and GPU.
Note that this is not necessarily deterministic; the exception does not always occur. That is probably more a result of my lacking understanding of the different sources of Value, and of the fact that there is a difference between these.
cc: @mdabros
This error also occurred under C# (v2.7.0). Here is my code:

Parallel.For(0, 10000, (i) =>
{
    float[] raw = pRasterLayerCursorTool.PickRagneNormalValue(10, 10, 9, 9);
    int cover = dqn.Predict(state);
});

public float[] Predict(float[] input)
{
    using (Value inputsValue = Value.CreateBatch(inputVariable.Shape, input, device))
    {
        var inputDict = new Dictionary<Variable, Value>() { { inputVariable, inputsValue } };
        var outputDict = new Dictionary<Variable, Value>() { { classifierOutput.Output, null } };
        classifierOutput.Evaluate(inputDict, outputDict, device);
        IList<IList<float>> predictions = outputDict[classifierOutput.Output].GetDenseData<float>(classifierOutput.Output);
        float[] result = predictions[0].ToArray();
        return result;
    }
}
I have solved it by locking on an object inside the Parallel loop:

Parallel.For(0, 10000, (i) =>
{
    // the type of model is 'CNTK.Function'
    lock (model)
    {
        float[] raw = pRasterLayerCursorTool.PickRagneNormalValue(10, 10, 9, 9);
        int cover = model.Predict(state);
    }
});
@axmand You are effectively executing synchronously, while possibly hijacking many threads (and thus hurting performance). Just turning it into a simple synchronous loop would be much better.
I cannot recall whether a CNTK model is safe for parallel use. If it is, you can try to keep your Parallel loop and, inside it, erase the inputDict/outputDict as suggested by @elevir. If it is not, then stick with simple synchronous execution (but it can still be a good idea to clean up the input/output dictionaries on each eval call).
@axmand @jakrivan the following page:
https://docs.microsoft.com/en-us/cognitive-toolkit/cntk-library-evaluation-on-windows
Clearly states:
CNTK supports evaluating multiple requests in parallel. Because running evaluation on the same model instance is not thread-safe, it is required first to create multiple model instances by calling Clone() with ParameterCloningMethod.Share, and then each thread uses a separate model instance for evaluation. The EvaluateMultipleImagesInParallelAsync() demonstrates how to evaluate concurrent requests using CNTK C#/.NET Managed API.
Running in parallel on CPU probably won't help much anyway, since the underlying code is heavily threaded. Hence, as @jakrivan says, you are better off not doing Parallel.For.
The problem we were seeing was not due to parallel for.
I have investigated this problem. This exception arising on GetNextMinibatch after reading data from input/output dictionaries (after deprecated methods too). I applied Erase function to input/output data after reading and this fixed the problem. But I still think that it is bug. In my opinion, problem is somewhere in data allocation.
This is what fixed the issue for me (using the C# API). I have checked the C++ code, and it looks like this exception is thrown when exclusive access to the shared_ptr pointing to the matrix in question could not be ensured. This may explain why Erase() fixes the issue. What remains to be explained is why GetDenseData keeps holding a reference after the data has been fetched.
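The contract being described (storage cannot be resized while a read-only view over it is still alive) can be mimicked in a few lines of plain Python. This is only an illustration of the invariant, not CNTK's actual implementation:

```python
class Storage:
    """Toy stand-in for a matrix buffer that refuses to resize under live views."""
    def __init__(self, size):
        self.size = size
        self.live_views = 0

    def view(self):                # like the Value handed out by MinibatchData.data
        self.live_views += 1
        return self

    def release(self):             # analogous to Erase()/Dispose() on that Value
        self.live_views -= 1

    def resize(self, new_size):    # what reusing the buffer for the next minibatch needs
        if self.live_views > 0:
            raise RuntimeError("Resize: Cannot resize the matrix because it is a view.")
        self.size = new_size

buf = Storage(128)
v = buf.view()
failed_while_viewed = False
try:
    buf.resize(3)                  # a smaller final minibatch wants to reuse the buffer
except RuntimeError:
    failed_while_viewed = True     # fails while the view is alive
v.release()                        # erase/dispose the view inside the loop ...
buf.resize(3)                      # ... and the resize goes through
```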
I get this error as well in the evaluation C# API. To make it work I need to use a "minibatchSizeInSamples" equal to 1 (to evaluate each sequence) or equal to the maximum number of samples (which is not always feasible given the amount of evaluation data that must be kept in memory).
[CALL STACK]
> Microsoft::MSR::CNTK::ConvolutionEngine:: SetmMaxTempMemSizeInSamples
- Microsoft::MSR::CNTK::CPUSparseMatrix:: Resize
- Microsoft::MSR::CNTK::CPUSparseMatrix:: SetMatrixFromCSCFormat
- Microsoft::MSR::CNTK::Matrix:: SetMatrixFromCSCFormat
- CNTK::TrainingParameterSchedule:: GetMinibatchSize (x4)
- Microsoft::MSR::CNTK::IDataReader:: operator= (x2)
- Concurrency::details:: _Schedule_chore
- RtlInitializeCriticalSection
- LdrAccessResource
- BaseThreadInitThunk
- RtlUserThreadStart
@Pescu From our observations we concluded the problems were all related to the use of the built-in data readers in CNTK. After switching away from these to custom-built ones, we have not seen these issues anymore. Not a big help... unfortunately.
When running a test run over a model, I've gotten this error twice:
Traceback (most recent call last):
  File "test.py", line 65, in <module>
    mb = reader.next_minibatch(minibatch_size, input_map=input_map)
  File "/root/anaconda3/envs/cntk-py34/lib/python3.4/site-packages/cntk/utils/swig_helper.py", line 58, in wrapper
    result = f(*args, **kwds)
  File "/root/anaconda3/envs/cntk-py34/lib/python3.4/site-packages/cntk/io/__init__.py", line 161, in next_minibatch
    minibatch_size_in_samples, device)
  File "/root/anaconda3/envs/cntk-py34/lib/python3.4/site-packages/cntk/cntk_py.py", line 1916, in get_next_minibatch
    return _cntk_py.MinibatchSource_get_next_minibatch(self, *args)
RuntimeError: Resize: Cannot resize the matrix because it is a view.
This happened 2 invocations in a row, but running it a 3rd time seems to have "fixed" the issue. Is this known behavior?