Closed piedralaves closed 3 months ago
Hi @piedralaves
I guess it may be a data issue. Could you please set a breakpoint and check srcTokensGroups and tgtTokensGroups in Seq2SeqCorpusBatch.CreateBatch(...) when you run it?
In addition, Release 2_5_0 is an older version (from 2 years ago). You could update to the latest version, and the problem may go away.
Thanks Zhongkai Fu
Thanks. No, although the folder name says "2_5_0", it is a recent version. Only the folder is named that way.
OK. I wrote a try-catch at that point:
public override void CreateBatch(List<SntPair> sntPairs)
{
    try
    {
        base.CreateBatch(sntPairs);
        TryAddPrefix(SrcTknsGroups[0], BuildInTokens.BOS);
        TryAddSuffix(SrcTknsGroups[0], BuildInTokens.EOS);
        TryAddPrefix(TgtTknsGroups[0], BuildInTokens.BOS);
        TryAddSuffix(TgtTknsGroups[0], BuildInTokens.EOS);
    }
    catch
    {
        foreach (var p in sntPairs)
        {
            Console.WriteLine(p.PrintSrcTokens());
            Console.WriteLine(p.PrintTgtTokens());
        }
        throw;
    }
}
Is that what you meant?
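The try-catch above prints the sentence pairs, but it does not say which list was empty when the indexer failed. A minimal sketch of a guard that reports this is shown here. It assumes the failing access is one of the `[0]` lookups, which the thread does not confirm. `BatchGuard` and `FirstGroupOrThrow` are hypothetical names for illustration and are not part of Seq2SeqSharp.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Seq2SeqSharp.
public static class BatchGuard
{
    // Returns groups[0], or throws a descriptive exception when the list
    // is null or empty (the situation in which groups[0] would raise
    // "Index was out of range").
    public static T FirstGroupOrThrow<T>(IReadOnlyList<T> groups, string name)
    {
        if (groups == null || groups.Count == 0)
        {
            throw new InvalidOperationException(
                $"{name} is empty: no token groups are available for this batch.");
        }
        return groups[0];
    }
}
```

Under that assumption, a call such as `BatchGuard.FirstGroupOrThrow(SrcTknsGroups, nameof(SrcTknsGroups))` could replace `SrcTknsGroups[0]` in the snippet above, so the exception message names the list that was empty.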
That still looks like an older version. The new version of this function looks like this:
public class Seq2SeqCorpusBatch : CorpusBatch
{
    public override void CreateBatch(List<IPair> sntPairs)
    {
        base.CreateBatch(sntPairs);
        TryAddPrefix(SrcBatchTokens, BuildInTokens.BOS);
        TryAddSuffix(SrcBatchTokens, BuildInTokens.EOS);
        TryAddPrefix(TgtBatchTokens, BuildInTokens.BOS);
        TryAddSuffix(TgtBatchTokens, BuildInTokens.EOS);
    }
}
Well, the fact is that we have made some modifications to this code, so it is not easy to just download the newest version.
Ok. I think it is fixed. Thanks a lot. G
Great! I'm glad to hear that. To keep your modifications, you could use "git merge" to merge them with the latest updates in the main branch. If both you and the main branch modified the same parts, there may be conflicts that you will need to resolve manually.
Hi Zhongkaifu:
I am getting this error. Could you help me a little, please?
info,21/03/2024 19:00:40 Processing batch '88/110' '8,0000e+001%'.
info,21/03/2024 19:00:40 Processing batch '90/110' '8,1818e+001%'.
info,21/03/2024 19:00:40 Update = 20000, Epoch = 181, LR = 3,7947e-004, AvgCost = 8,4468e-001, Sent = 10395, SentPerMin = 29234,13, WordPerSec = 10377,48
info,21/03/2024 19:00:40 Start to build index for data set.
info,21/03/2024 19:00:40 Loading and shuffling corpus from '1' files.
info,21/03/2024 19:00:40 Shuffled '3' sentence pairs.
info,21/03/2024 19:00:40 AggregateSrcLength = 'NoPadding'
info,21/03/2024 19:00:40 Src token length distribution
info,21/03/2024 19:00:40 0 ~ 100: 3 (acc: 100,00%)
info,21/03/2024 19:00:40 Tgt token length distribution
info,21/03/2024 19:00:40 0 ~ 100: 3 (acc: 100,00%)
info,21/03/2024 19:00:40 Finished to build index for data set.
info,21/03/2024 19:00:40 Start to sort and shuffle data set by length.
info,21/03/2024 19:00:40 Finished to sort and shuffle data set by length. Total batch size = '3'
info,21/03/2024 19:00:40 Processing batch '1/3' '3,3333e+001%'.
info,21/03/2024 19:00:40 Processing batch '2/3' '6,6667e+001%'.
info,21/03/2024 19:00:40 Processing batch '3/3' '1,0000e+002%'.
info,21/03/2024 19:00:40 Exception: 'Index was out of range. Must be non-negative and less than the size of the collection. (Parameter 'index')'
info,21/03/2024 19:00:40 Call stack: '
   at System.Collections.Generic.List`1.get_Item(Int32 index)
   at Seq2SeqSharp.Corpus.Seq2SeqCorpusBatch.CreateBatch(List`1 sntPairs) in C:\Users\jorge\source\repos\Seq2SeqSharp-RELEASE_2_5_0\Seq2SeqSharp\Corpus\Seq2SeqCorpusBatch.cs:line 173
   at Seq2SeqSharp.Tools.ParallelCorpus`1.GetEnumerator()+MoveNext() in C:\Users\jorge\source\repos\Seq2SeqSharp-RELEASE_2_5_0\Seq2SeqSharp\Corpus\ParallelCorpus.cs:line 438
   at Seq2SeqSharp.Tools.BaseSeq2SeqFramework`1.RunValid(ICorpus`1 validCorpus, Func`5 RunNetwork, Dictionary`2 taskId2metrics, DecodingOptions decodingOptions, Boolean outputToFile, String prefixName) in C:\Users\jorge\source\repos\Seq2SeqSharp-RELEASE_2_5_0\Seq2SeqSharp\Tools\BaseSeq2SeqFramework.cs:line 858
   at Seq2SeqSharp.Tools.BaseSeq2SeqFramework`1.CreateCheckPoint(ICorpus`1[] validCorpusList, Dictionary`2 taskId2metrics, DecodingOptions decodingOptions, Func`5 forwardOnSingleDevice, Double avgCostPerWordInTotal) in C:\Users\jorge\source\repos\Seq2SeqSharp-RELEASE_2_5_0\Seq2SeqSharp\Tools\BaseSeq2SeqFramework.cs:line 585
   at Seq2SeqSharp.Tools.BaseSeq2SeqFramework`1.TrainOneEpoch(Int32 ep, ICorpus`1 trainCorpus, ICorpus`1[] validCorpusList, ILearningRate learningRate, IOptimizer solver, Dictionary`2 taskId2metrics, DecodingOptions decodingOptions, Func`5 forwardOnSingleDevice) in C:\Users\jorge\source\repos\Seq2SeqSharp-RELEASE_2_5_0\Seq2SeqSharp\Tools\BaseSeq2SeqFramework.cs:line 415
   at Seq2SeqSharp.Tools.BaseSeq2SeqFramework`1.Train(Int32 maxTrainingEpoch, ICorpus`1 trainCorpus, ICorpus`1[] validCorpusList, ILearningRate learningRate, Dictionary`2 taskId2metrics, IOptimizer optimizer, DecodingOptions decodingOptions) in C:\Users\jorge\source\repos\Seq2SeqSharp-RELEASE_2_5_0\Seq2SeqSharp\Tools\BaseSeq2SeqFramework.cs:line 263
   at Seq2SeqSharp.Tools.BaseSeq2SeqFramework`1.Train(Int32 maxTrainingEpoch, ICorpus`1 trainCorpus, ICorpus`1[] validCorpusList, ILearningRate learningRate, List`1 metrics, IOptimizer optimizer, DecodingOptions decodingOptions) in C:\Users\jorge\source\repos\Seq2SeqSharp-RELEASE_2_5_0\Seq2SeqSharp\Tools\BaseSeq2SeqFramework.cs:line 277
   at Seq2SeqConsole.Program.Main(String[] args) in C:\Users\jorge\source\repos\Seq2SeqSharp-RELEASE_2_5_0\Tools\Seq2SeqConsole\Program.cs:line 146'