zhongkaifu / Seq2SeqSharp

Seq2SeqSharp is a fast and flexible tensor-based deep neural network framework written in .NET (C#). Its main features include automatic differentiation, several network types (Transformer, LSTM, BiLSTM and so on), multi-GPU support, cross-platform support (Windows, Linux, x86, x64, ARM), and multimodal models for text and images.

Didn't save the model? #55

Closed: lijianxin520 closed this issue 1 year ago

lijianxin520 commented 1 year ago

Here is my configuration file:

{
  "Task":"Train",
  "EmbeddingDim":512,
  "HiddenSize":512,
  "StartLearningRate":0.001,
  "WeightsUpdateCount":0,
  "EncoderLayerDepth":6,
  "DecoderLayerDepth":6,
  "ModelFilePath":"D:\GPU\cutwords\model\seq_cut_word.model",
  "SrcVocab":null,
  "TgtVocab":null,
  "SrcVocabSize":300000,
  "TgtVocabSize":300000,
  "SharedEmbeddings":false,
  "SrcEmbeddingModelFilePath":null,
  "TgtEmbeddingModelFilePath":null,
  "TrainCorpusPath":".\data\train\cut\train.conll.txt",
  "ValidCorpusPaths":null,
  "InputTestFile":null,
  "OutputTestFile":null,
  "ShuffleType":"NoPadding",
  "ShuffleBlockSize":-1,
  "GradClip":5.0,
  "BatchSize":256,
  "ValBatchSize":128,
  "DropoutRatio":0,
  "ProcessorType":"GPU",
  "EncoderType":"Transformer",
  "MultiHeadNum":8,
  "DeviceIds":"0",
  "BeamSearchSize":1,
  "MaxEpochNum":100,
  "MaxTrainSentLength":10000,
  "MaxTestSentLength":10000,
  "WarmUpSteps":8000,
  "VisualizeNNFilePath":null,
  "Beta1":0.9,
  "Beta2":0.98,
  "ValidIntervalHours":1.0,
  "EnableCoverageModel":false,
  "CompilerOptions":"",
  "Optimizer":"Adam"
}

The model was not saved after training.

zhongkaifu commented 1 year ago

Can you please also share your logs here?

zhongkaifu commented 1 year ago

I just made a quick code change so that the model is always saved after training. (Previously, Seq2SeqSharp only saved the model after it ran validation and got a better result, that is, a better validation result or a lower cost.) You can pull the latest code from the repo and retry.
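To illustrate the difference between the two saving policies, here is a minimal Python sketch. It is not Seq2SeqSharp's actual code: `checkpoint_events`, its parameters, and the score values are hypothetical, and the sketch models only the "validation improved" trigger. It shows why a run in which validation never fires would write no model under the old policy, and why a final save fixes that.

```python
from typing import Dict, List, Optional, Tuple


def checkpoint_events(
    total_updates: int,
    validation_scores: Optional[Dict[int, float]] = None,
    always_save_at_end: bool = True,
) -> List[Tuple[int, str]]:
    """Return the (update, reason) pairs at which a model would be saved.

    validation_scores maps an update number to a hypothetical validation
    score (higher is better). None means validation never ran.
    """
    saves: List[Tuple[int, str]] = []
    best = float("-inf")
    scores = validation_scores or {}
    for step in range(1, total_updates + 1):
        # Old policy: save only when a validation run beats the best so far.
        if step in scores and scores[step] > best:
            best = scores[step]
            saves.append((step, "validation improved"))
    # New policy: additionally save once when training finishes.
    if always_save_at_end and total_updates > 0:
        saves.append((total_updates, "end of training"))
    return saves


# Validation never ran: the old policy saves nothing.
print(checkpoint_events(1000, always_save_at_end=False))  # []
# With the final save enabled, one checkpoint is written at the end.
print(checkpoint_events(1000))  # [(1000, 'end of training')]
```

In a short run that ends before the first validation pass, the old policy leaves no model file on disk, which matches the symptom reported in this issue.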

lijianxin520 commented 1 year ago

This is a log of my 10 attempts at training; I trained 100 times without saving the model. SeqLabelConsole_2022_07_28_12h_14m_57s.log

zhongkaifu commented 1 year ago

Your training finished after only 1,000 updates, which is very few; even the warmup (8,000 steps) didn't finish. You can try reducing your batch size to a smaller number and retraining.
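As a rough sketch of the arithmetic behind this advice: the batch size, epoch count, warmup steps, and start learning rate below come from the configuration above, but the corpus size of 2,560 sequences is hypothetical, chosen only so that the numbers reproduce 1,000 updates. The linear warmup formula is also an assumption for illustration and may differ from the schedule Seq2SeqSharp actually uses, as may the assumption that `BatchSize` counts sequences.

```python
import math


def total_updates(num_sequences: int, batch_size: int, epochs: int) -> int:
    """Weight updates in a run, assuming one update per batch."""
    return math.ceil(num_sequences / batch_size) * epochs


def warmup_lr(step: int, base_lr: float, warmup_steps: int) -> float:
    """Linear warmup (an assumed schedule): ramp from 0 up to base_lr."""
    return base_lr * min(1.0, step / warmup_steps)


CORPUS_SIZE = 2560    # hypothetical number of training sequences
EPOCHS = 100          # MaxEpochNum from the config
BASE_LR = 0.001       # StartLearningRate from the config
WARMUP_STEPS = 8000   # WarmUpSteps from the config

for batch_size in (256, 32):
    updates = total_updates(CORPUS_SIZE, batch_size, EPOCHS)
    lr = warmup_lr(updates, BASE_LR, WARMUP_STEPS)
    print(f"batch={batch_size}: {updates} updates, final lr ~ {lr:.6f}")
```

Under these assumptions, a batch size of 256 gives 1,000 updates, so the learning rate only reaches one eighth of its target before training stops. A batch size of 32 gives 8,000 updates, which is just enough to complete the warmup.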

Anyway, as I mentioned above, I already made a code change that always saves the model after training. You can apply it and retry.

lijianxin520 commented 1 year ago

Thanks for your help. I'll give it a try!

piedralaves commented 1 year ago

Could you please point out the change, specifying the line and the modification? Thanks a lot.

zhongkaifu commented 1 year ago

@piedralaves Please check this commit: https://github.com/zhongkaifu/Seq2SeqSharp/commit/1e31aef8534581d7e4b65853947e8209e2042e41