zhongkaifu / Seq2SeqSharp

Seq2SeqSharp is a tensor-based, fast and flexible deep neural network framework written in .NET (C#). Its features include automatic differentiation, several network types (Transformer, LSTM, BiLSTM and so on), multi-GPU support, cross-platform support (Windows, Linux, x86, x64, ARM), and multimodal models for text and images.

strange Error #86

Closed piedralaves closed 3 months ago

piedralaves commented 3 months ago

Hi Zhongkaifu:

I am getting the error below. Could you help me, please?

info,21/03/2024 19:00:40 Processing batch '88/110' '8,0000e+001%'.
info,21/03/2024 19:00:40 Processing batch '90/110' '8,1818e+001%'.
info,21/03/2024 19:00:40 Update = 20000, Epoch = 181, LR = 3,7947e-004, AvgCost = 8,4468e-001, Sent = 10395, SentPerMin = 29234,13, WordPerSec = 10377,48
info,21/03/2024 19:00:40 Start to build index for data set.
info,21/03/2024 19:00:40 Loading and shuffling corpus from '1' files.
info,21/03/2024 19:00:40 Shuffled '3' sentence pairs.
info,21/03/2024 19:00:40 AggregateSrcLength = 'NoPadding'
info,21/03/2024 19:00:40 Src token length distribution
info,21/03/2024 19:00:40 0 ~ 100: 3 (acc: 100,00%)
info,21/03/2024 19:00:40 Tgt token length distribution
info,21/03/2024 19:00:40 0 ~ 100: 3 (acc: 100,00%)
info,21/03/2024 19:00:40 Finished to build index for data set.
info,21/03/2024 19:00:40 Start to sort and shuffle data set by length.
info,21/03/2024 19:00:40 Finished to sort and shuffle data set by length. Total batch size = '3'
info,21/03/2024 19:00:40 Processing batch '1/3' '3,3333e+001%'.
info,21/03/2024 19:00:40 Processing batch '2/3' '6,6667e+001%'.
info,21/03/2024 19:00:40 Processing batch '3/3' '1,0000e+002%'.
info,21/03/2024 19:00:40 Exception: 'Index was out of range. Must be non-negative and less than the size of the collection. (Parameter 'index')'
info,21/03/2024 19:00:40 Call stack: '
   at System.Collections.Generic.List1.get_Item(Int32 index)
   at Seq2SeqSharp.Corpus.Seq2SeqCorpusBatch.CreateBatch(List1 sntPairs) in C:\Users\jorge\source\repos\Seq2SeqSharp-RELEASE_2_5_0\Seq2SeqSharp\Corpus\Seq2SeqCorpusBatch.cs:line 173
   at Seq2SeqSharp.Tools.ParallelCorpus1.GetEnumerator()+MoveNext() in C:\Users\jorge\source\repos\Seq2SeqSharp-RELEASE_2_5_0\Seq2SeqSharp\Corpus\ParallelCorpus.cs:line 438
   at Seq2SeqSharp.Tools.BaseSeq2SeqFramework1.RunValid(ICorpus1 validCorpus, Func5 RunNetwork, Dictionary2 taskId2metrics, DecodingOptions decodingOptions, Boolean outputToFile, String prefixName) in C:\Users\jorge\source\repos\Seq2SeqSharp-RELEASE_2_5_0\Seq2SeqSharp\Tools\BaseSeq2SeqFramework.cs:line 858
   at Seq2SeqSharp.Tools.BaseSeq2SeqFramework1.CreateCheckPoint(ICorpus1[] validCorpusList, Dictionary2 taskId2metrics, DecodingOptions decodingOptions, Func5 forwardOnSingleDevice, Double avgCostPerWordInTotal) in C:\Users\jorge\source\repos\Seq2SeqSharp-RELEASE_2_5_0\Seq2SeqSharp\Tools\BaseSeq2SeqFramework.cs:line 585
   at Seq2SeqSharp.Tools.BaseSeq2SeqFramework1.TrainOneEpoch(Int32 ep, ICorpus1 trainCorpus, ICorpus1[] validCorpusList, ILearningRate learningRate, IOptimizer solver, Dictionary2 taskId2metrics, DecodingOptions decodingOptions, Func5 forwardOnSingleDevice) in C:\Users\jorge\source\repos\Seq2SeqSharp-RELEASE_2_5_0\Seq2SeqSharp\Tools\BaseSeq2SeqFramework.cs:line 415
   at Seq2SeqSharp.Tools.BaseSeq2SeqFramework1.Train(Int32 maxTrainingEpoch, ICorpus1 trainCorpus, ICorpus1[] validCorpusList, ILearningRate learningRate, Dictionary2 taskId2metrics, IOptimizer optimizer, DecodingOptions decodingOptions) in C:\Users\jorge\source\repos\Seq2SeqSharp-RELEASE_2_5_0\Seq2SeqSharp\Tools\BaseSeq2SeqFramework.cs:line 263
   at Seq2SeqSharp.Tools.BaseSeq2SeqFramework1.Train(Int32 maxTrainingEpoch, ICorpus1 trainCorpus, ICorpus1[] validCorpusList, ILearningRate learningRate, List1 metrics, IOptimizer optimizer, DecodingOptions decodingOptions) in C:\Users\jorge\source\repos\Seq2SeqSharp-RELEASE_2_5_0\Seq2SeqSharp\Tools\BaseSeq2SeqFramework.cs:line 277
   at Seq2SeqConsole.Program.Main(String[] args) in C:\Users\jorge\source\repos\Seq2SeqSharp-RELEASE_2_5_0\Tools\Seq2SeqConsole\Program.cs:line 146'

zhongkaifu commented 3 months ago

Hi @piedralaves

I guess it may be a data issue. Could you please set a breakpoint and check srcTokensGroups and tgtTokensGroups in Seq2SeqCorpusBatch.CreateBatch(...) when you run it?

In addition, Release 2_5_0 is an older version (from 2 years ago). You could update to the latest version, and the problem may go away.

Thanks,
Zhongkai Fu

piedralaves commented 3 months ago

Thanks. No, although it says "2_5_0", it is a newer version. The folder just has that name, but the code in it is recent.

OK. I wrote a try-catch at that point:

public override void CreateBatch(List<SntPair> sntPairs)
{
    try
    {
        base.CreateBatch(sntPairs);

        TryAddPrefix(SrcTknsGroups[0], BuildInTokens.BOS);
        TryAddSuffix(SrcTknsGroups[0], BuildInTokens.EOS);
        TryAddPrefix(TgtTknsGroups[0], BuildInTokens.BOS);
        TryAddSuffix(TgtTknsGroups[0], BuildInTokens.EOS);
    }
    catch
    {
        // Print every sentence pair of the failing batch, then rethrow.
        foreach (var p in sntPairs)
        {
            Console.WriteLine(p.PrintSrcTokens());
            Console.WriteLine(p.PrintTgtTokens());
        }
        throw;
    }
}

Is that what you meant?

zhongkaifu commented 3 months ago

It still looks like an older version. The new version of this function looks like this:

public class Seq2SeqCorpusBatch : CorpusBatch
{

    public override void CreateBatch(List<IPair> sntPairs)
    {
        base.CreateBatch(sntPairs);

        TryAddPrefix(SrcBatchTokens, BuildInTokens.BOS);
        TryAddSuffix(SrcBatchTokens, BuildInTokens.EOS);
        TryAddPrefix(TgtBatchTokens, BuildInTokens.BOS);
        TryAddSuffix(TgtBatchTokens, BuildInTokens.EOS);
    }
}

piedralaves commented 3 months ago

Well, the fact is that we have made some modifications to this code, so it is not easy to just download the newest version.

piedralaves commented 3 months ago

Ok. I think it is fixed. Thanks a lot. G

zhongkaifu commented 3 months ago

Great! I'm glad to hear that. As for your modifications, you could use "git merge" to merge them with the latest updates in the main branch. (There may be conflicts if both you and the main branch modified the same parts, in which case you will need to resolve them manually.)
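
As a rough sketch of that merge workflow, the script below builds a throwaway sandbox and walks through it: local modifications are committed on their own branch, the upstream changes are fetched, and the two are combined with "git merge". Everything named here is made up for the demo and is not taken from Seq2SeqSharp: the `upstream` and `local` directories, the `my-changes` branch, and the file `Batch.cs`. In a real checkout the fetch would go against the actual Seq2SeqSharp repository instead of the local stand-in, and it assumes the modified code lives in a Git clone rather than an unpacked archive.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox (hypothetical names): "upstream" stands in for the
# project repository, "local" for a clone that carries custom changes.
work=$(mktemp -d)
cd "$work"

git init -q upstream
cd upstream
git config user.email "demo@example.com"
git config user.name "Demo"
printf 'line 1\nline 2\nline 3\nline 4\nline 5\nline 6\nline 7\n' > Batch.cs
git add Batch.cs
git commit -q -m "Initial version"
cd ..

git clone -q upstream local
cd local
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit the local modification (first line) on a dedicated branch.
git checkout -q -b my-changes
sed 's/line 1/line 1 (local change)/' Batch.cs > Batch.tmp
mv Batch.tmp Batch.cs
git commit -q -am "Local modification"

# Meanwhile, upstream changes a different part of the file (last line).
cd ../upstream
sed 's/line 7/line 7 (upstream change)/' Batch.cs > Batch.tmp
mv Batch.tmp Batch.cs
git commit -q -am "Upstream update"
default_branch=$(git symbolic-ref --short HEAD)
cd ../local

# Fetch the latest upstream commits, then merge them into the local branch.
git fetch -q origin
git merge -q --no-edit "origin/$default_branch"

# The merged file should now carry both edits.
cat Batch.cs
```

Because the two edits touch different parts of the file, the merge should complete without conflicts. If both sides had changed the same lines, "git merge" would stop and mark the conflicting regions in the file, and those would have to be resolved by hand and committed before the merge is finished.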