zhongkaifu / RNNSharp

RNNSharp is a toolkit for deep recurrent neural networks, widely used for tasks such as sequence labeling and sequence-to-sequence learning. It is written in C# and runs on .NET Framework 4.6 or later. RNNSharp supports many network types, such as forward and bi-directional networks and sequence-to-sequence networks, and many layer types, such as LSTM, Softmax, and sampled Softmax.
BSD 3-Clause "New" or "Revised" License

IndexOutOfRangeException #16

ppsunrise closed this issue 7 years ago

ppsunrise commented 8 years ago

Hi, Zhong! Thank you for your library; your work looks really cool. I've successfully run the demo, but when I switch to my own corpus for NER, the following error occurs:


Unhandled Exception: System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at RNNSharp.RNN.ForwardBackward(Int32 numStates, Double[][] m_RawOutput)
   at RNNSharp.RNN.learnSentenceForRNNCRF(Sequence pSequence)
   at RNNSharp.RNN.TrainNet()
   at RNNSharp.RNNEncoder.Train()
   at RNNSharpConsole.Program.Main(String[] args)


Is there any limit on sentence length? (P.S. The sentences in my task are very long, because each one contains all the words in a document.) Have you handled the vanishing gradient problem with some technique such as mini-batching, as in this code: http://deeplearning.net/tutorial/rnnslu.html#dataset

zhongkaifu commented 8 years ago

No, we don't have a limit on sentence length. Could you please share the call stack for this exception? How long are your sentences? For BPTT, we have mini-batch, and the default value is 10. You can change it by setting the bptt_block variable. For LSTM, we don't have it yet.
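As a rough illustration of what a BPTT block size of 10 could mean for a long sequence, here is a minimal Python sketch. It is not RNNSharp code: the function name `bptt_blocks` is hypothetical, and it assumes the simplest scheme, in which the sequence is cut into consecutive, non-overlapping blocks and gradients are only propagated back within each block. RNNSharp's actual use of bptt_block may differ.

```python
def bptt_blocks(sequence_length, block_size=10):
    """Return (start, end) index pairs for consecutive BPTT blocks.

    Assumption: blocks do not overlap, and the last block may be shorter.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [
        (start, min(start + block_size, sequence_length))
        for start in range(0, sequence_length, block_size)
    ]


# Under this assumption, a 25-token sequence is trained in three blocks.
blocks = bptt_blocks(25, block_size=10)
```

With documents of 500 to 800 tokens treated as single sentences, this scheme would yield 50 to 80 blocks per sequence, so the block size bounds how far back a gradient can travel, not how long a sequence may be.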

ppsunrise commented 8 years ago

Thank you for your quick response. All the information is as follows:


D:\RNNSharp>REM Build template feature set from training corpus

D:\RNNSharp>TFeatureBin.exe -mode build -template template.txt -inputfile KE-train.txt -ftrfile tfeatures -minfreq 1
Loading feature template from template.txt...
Generate feature set...
Filter out features whose frequency is less than 1
100000...

D:\RNNSharp>REM Encoding LSTM-RNN-CRF model

D:\RNNSharp>RNNSharpConsole.exe -mode train -trainfile KE-train.txt -validfile KE-valid.txt -modeltype 1 -modelfile KE-model.bin -ftrfile features.txt -tagfile KE-tags.txt -layersize 100 -alpha 0.1 -crf 1 -maxiter 50 -savestep 200K 1>KE.trainout

Unhandled Exception: System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at RNNSharp.RNN.ForwardBackward(Int32 numStates, Double[][] m_RawOutput)
   at RNNSharp.RNN.learnSentenceForRNNCRF(Sequence pSequence)
   at RNNSharp.RNN.TrainNet()
   at RNNSharp.RNNEncoder.Train()
   at RNNSharpConsole.Program.Main(String[] args)

D:\RNNSharp>pause


That's all of my output. In my task, I treat each document as a single sentence, so each one has 500~800 words.
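Before debugging the toolkit itself, it can help to sanity-check the corpus. The Python sketch below is one way to do that, under assumptions that are not confirmed in this thread: one token per line, whitespace-separated columns with the tag in the last column, a blank line between sequences, and a tag file with one tag per line. The function names are hypothetical, and a tag missing from the tag file is only one possible cause of an out-of-range index, not a confirmed diagnosis.

```python
def read_sequences(lines):
    """Group non-blank lines into sequences; a blank line ends a sequence."""
    sequences, current = [], []
    for line in lines:
        fields = line.split()
        if fields:
            current.append(fields)
        elif current:
            sequences.append(current)
            current = []
    if current:
        sequences.append(current)
    return sequences


def check_corpus(corpus_lines, tag_lines):
    """Report sequence lengths, column counts, and tags missing from the tag file."""
    known_tags = {t.strip() for t in tag_lines if t.strip()}
    sequences = read_sequences(corpus_lines)
    tokens = [tok for seq in sequences for tok in seq]
    return {
        "num_sequences": len(sequences),
        "max_length": max((len(seq) for seq in sequences), default=0),
        # Tags used in the corpus (last column) but absent from the tag file.
        "unknown_tags": sorted({tok[-1] for tok in tokens} - known_tags),
        # More than one value here means some lines have missing columns.
        "column_counts": sorted({len(tok) for tok in tokens}),
    }


# Possible usage with the files named in this thread:
#   with open("KE-train.txt", encoding="utf-8") as c, \
#        open("KE-tags.txt", encoding="utf-8") as t:
#       print(check_corpus(c, t))
```

A non-empty `unknown_tags` list or more than one entry in `column_counts` would be worth fixing before training again.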

zhongkaifu commented 8 years ago

@ppsunrise Could you please open RNNSharp in Visual Studio and run your command line in debug mode? If an exception happens, Visual Studio will show you which line threw it, and you can send that information to me.

ppsunrise commented 8 years ago

I debugged it and got the information shown in these four screenshots (1, 2, 3, 4).

zhongkaifu commented 8 years ago

In your third screenshot, could you please load RNNSharp.pdb, the symbol file used for debugging? Then we will know exactly which line throws the exception.

ppsunrise commented 8 years ago

Hi, Zhong. Thank you for your kind response! I copied my files to RNNSharp\RNNSharpConsole\bin\Debug, then ran RNNSharpConsole/Program.cs with the parameters [-mode train -trainfile KE-train.txt -validfile KE-valid.txt -modeltype 1 -modelfile KE-model.bin -ftrfile features.txt -tagfile KE-tags.txt -layersize 100 -alpha 0.1 -crf 1 -maxiter 50 -savestep 200K > KE.trainout], and I got the following result (image).


Sorry to confuse you, but I don't know how to run the RNNSharp-master project from GitHub.

zhongkaifu commented 8 years ago

The root cause of the exception above is that your CPU's SIMD registers are narrower than 256 bits. This exception is not related to your first exception (IndexOutOfRangeException).

As I mentioned, your 4th screenshot points to the correct spot of the real IndexOutOfRangeException. To get the corresponding code, you need to load the RNNSharp.pdb file.

So first, you need to run the release build of RNNSharp with your data. Once the exception happens, catch it in Visual Studio and load the RNNSharp.pdb file; Visual Studio will then show which code throws the exception.
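For readers following the stack trace, the Python sketch below shows a textbook log-space forward-backward pass for a linear-chain CRF, which is the kind of computation a method named ForwardBackward(numStates, m_RawOutput) would typically perform. It is not RNNSharp's C# implementation: the `transitions` parameter and all names are assumptions made for illustration. The point is that every time step indexes the raw output from 0 to num_states - 1, so an output row narrower than num_states goes out of range.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list of finite values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_backward(num_states, raw_output, transitions):
    """Return per-token tag marginals for a linear-chain CRF.

    raw_output[t][j]  : log-score of tag j at position t (hypothetical layout)
    transitions[i][j] : log-score of moving from tag i to tag j (hypothetical)
    """
    n = len(raw_output)
    states = range(num_states)
    alpha = [[0.0] * num_states for _ in range(n)]
    beta = [[0.0] * num_states for _ in range(n)]

    # Forward pass: raw_output[t][j] is read for every j < num_states.
    for j in states:
        alpha[0][j] = raw_output[0][j]
    for t in range(1, n):
        for j in states:
            alpha[t][j] = raw_output[t][j] + logsumexp(
                [alpha[t - 1][i] + transitions[i][j] for i in states]
            )

    # Backward pass: beta at the last position stays 0 (log of 1).
    for t in range(n - 2, -1, -1):
        for i in states:
            beta[t][i] = logsumexp(
                [transitions[i][j] + raw_output[t + 1][j] + beta[t + 1][j]
                 for j in states]
            )

    log_z = logsumexp(alpha[n - 1])
    return [
        [math.exp(alpha[t][j] + beta[t][j] - log_z) for j in states]
        for t in range(n)
    ]
```

In this sketch, calling the function with a num_states larger than the width of the raw_output rows raises an IndexError on the first read, which mirrors the kind of mismatch worth checking once the .pdb file reveals the failing line.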