jdermody / brightwire

Bright Wire is an open source machine learning library for .NET with GPU support (via CUDA)
https://github.com/jdermody/brightwire/wiki
MIT License

IndexOutOfRangeException #44

Open dominicp1973 opened 3 years ago

dominicp1973 commented 3 years ago

In the sample function TrainConvolutionalNeuralNetwork(), a System.IndexOutOfRangeException ('Index was outside the bounds of the array.') is thrown by this call: engine.Train(numIterations, testData, model => { bestGraph = model.Graph; });

This is using the provided sample code without changes.

This reproduces when compiling for both .NET 5.0 and .NET Core 3.1.

Additional issues:

  1. There is no NuGet package called BrightWire.Numerics, so Install-Package BrightWire.Numerics fails.
  2. It is therefore not possible to call UseNumericsLinearAlgebra().

dominicp1973 commented 3 years ago

I should add that TrainingFeedForwardNeuralNetwork() is working perfectly well so only TrainConvolutionalNeuralNetwork() is broken. I am using version 3.0.1.

dominicp1973 commented 3 years ago

This must be an issue with the NuGet package assemblies. The error does NOT reproduce if I compile the latest BrightWire source code. To reproduce it, download the NuGet package and run the sample code TrainConvolutionalNeuralNetwork() against the binaries from NuGet. Sadly, this also means that giving you an exact call stack is difficult, as the NuGet package does not include PDBs.

jdermody commented 3 years ago

I'll take a look at the IndexOutOfRangeException but the correct nuget package should be:

Install-Package BrightData.Numerics

This has been corrected in the BrightWire GitHub readme.

dominicp1973 commented 3 years ago

Thank you Jack!

Actually, I can reproduce the issue after all.

The call stack is:

MathNet.Numerics.MK.dll!BrightData.Memory.TensorSegment.this[uint].set(uint index, float value) Line 38 C#
MathNet.Numerics.MK.dll!BrightData.LinearAlgebra.Tensor3D.this[uint, uint, uint].set(uint depth, uint rowY, uint columnX, float value) Line 62 C#
MathNet.Numerics.MK.dll!BrightData.LinearAlgebra.FloatTensor.Float3DTensor.AddPadding(uint padding) Line 37 C#
MathNet.Numerics.MK.dll!BrightData.LinearAlgebra.FloatTensor.Float4DTensor.AddPadding.AnonymousMethod__0(BrightData.IIndexable3DFloatTensor t) Line 47 C#
[External Code]
MathNet.Numerics.MK.dll!BrightData.LinearAlgebra.FloatTensor.Float4DTensor.AddPadding(uint padding) Line 46 C#
BrightWire.dll!BrightWire.ExecutionGraph.Node.Layer.Convolutional.ForwardSingleStep(BrightWire.IGraphData signal, uint channel, BrightWire.IGraphSequenceContext context, BrightWire.ExecutionGraph.Node.NodeBase source) Line 130 C#
BrightWire.dll!BrightWire.ExecutionGraph.Node.NodeBase.Forward(System.Threading.CancellationToken ct, BrightWire.IGraphData signal, BrightWire.IGraphSequenceContext context, uint channel, BrightWire.ExecutionGraph.Node.NodeBase prev) Line 93 C#
BrightWire.dll!BrightWire.ExecutionGraph.Node.NodeBase.Forward(System.Threading.CancellationToken ct, BrightWire.IGraphData signal, BrightWire.IGraphSequenceContext context, uint channel, BrightWire.ExecutionGraph.Node.NodeBase prev) Line 106 C#
BrightWire.dll!BrightWire.ExecutionGraph.Engine.TrainingEngine.Train(BrightWire.IGraphExecutionContext executionContext, BrightWire.ILearningContext learningContext, BrightWire.IMiniBatchSequence sequence) Line 166 C#
BrightWire.dll!BrightWire.ExecutionGraph.Engine.TrainingEngine.Train(BrightWire.IGraphExecutionContext executionContext, BrightWire.ILearningContext learningContext, BrightWire.IMiniBatch batch) Line 132 C#
BrightWire.dll!BrightWire.ExecutionGraph.Engine.TrainingEngine.Execute(BrightWire.IDataSource dataSource, uint batchSize, System.Action batchCompleteCallback) Line 58 C#

The exception originates from class TensorSegment on this setter: public T this[uint index] { get => _data[index]; set => _data[index] = value; }

The index is out of range for _data. The size of the array is 1024, but the index is much higher.
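
For anyone debugging something similar, here is a minimal sketch of an indexer that reports both the offending index and the buffer size when an access is out of range. CheckedSegment and its Check helper are hypothetical names made up for illustration; this is not BrightData's actual TensorSegment, only a stand-in with the same indexer shape.

```csharp
using System;

// Hypothetical stand-in for a tensor segment (NOT BrightData's real class).
public sealed class CheckedSegment<T>
{
    private readonly T[] _data;

    public CheckedSegment(uint size) => _data = new T[size];

    public uint Size => (uint)_data.Length;

    public T this[uint index]
    {
        get => _data[Check(index)];
        set => _data[Check(index)] = value;
    }

    // Validates the index up front so the exception names the index and the size.
    private uint Check(uint index)
    {
        if (index >= _data.Length)
            throw new ArgumentOutOfRangeException(
                nameof(index), index,
                $"Index {index} is outside a segment of size {_data.Length}.");
        return index;
    }
}
```

With a check like this, a write past the end of a 1024-element buffer fails with a message that includes both numbers, which makes a mismatch between the expected and the allocated size easier to spot.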

The root cause is that context.UseNumericsLinearAlgebra() was never called.

This was the only call I could not make due to the missing NuGet package (your comment above).

So a check that UseNumericsLinearAlgebra() was called (or the CUDA version) may be the right simple fix.
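
A rough sketch of the kind of check I mean is below. SketchContext, UseProvider and EnsureProviderConfigured are hypothetical names invented for this example, and I am assuming the context could simply record whether a provider was selected; the real BrightData context may be structured quite differently.

```csharp
using System;

// Hypothetical context (NOT the real BrightData type): it only remembers
// whether a linear algebra provider has been selected.
public sealed class SketchContext
{
    public bool HasLinearAlgebraProvider { get; private set; }

    // Stands in for calls such as UseNumericsLinearAlgebra() or the CUDA equivalent.
    public void UseProvider() => HasLinearAlgebraProvider = true;

    // Intended to be called at the start of training, so that a missing
    // provider fails fast with a clear message.
    public void EnsureProviderConfigured()
    {
        if (!HasLinearAlgebraProvider)
            throw new InvalidOperationException(
                "No linear algebra provider is configured. " +
                "Call UseNumericsLinearAlgebra() (or the CUDA equivalent) before training.");
    }
}
```

Failing fast like this would turn the IndexOutOfRangeException deep inside AddPadding into an error that tells the user which call is missing.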

PS: nice library, and I like the way it's set up!