zhongkaifu / Seq2SeqSharp

Seq2SeqSharp is a tensor-based, fast and flexible deep neural network framework written in .NET (C#). It has many highlighted features, such as automatic differentiation, different network types (Transformer, LSTM, BiLSTM and so on), multi-GPU support, cross-platform support (Windows, Linux, x86, x64, ARM), multimodal models for text and images, and more.

Target vocabulary size fixed to 45000 #63

Closed: clm33 closed this issue 1 year ago

clm33 commented 1 year ago

Hi, Zhongkaifu.

I am trying to train a model with GPTConsole, and no matter how many words there are in my corpus, the embedding matrix always has a fixed dimension of 45000. I have tried to control this by varying some parameters, such as "TgtVocabSize", but it changes nothing. It seems as if 45000 is an upper limit. Is that the case?
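For context, this is roughly how I am counting how many distinct words my corpus actually contains (a minimal sketch of my own, not Seq2SeqSharp code; the file name is a placeholder):

```csharp
// Minimal diagnostic sketch (my own check, not Seq2SeqSharp code): count the
// distinct whitespace-separated tokens on the target side of the corpus, to
// compare against the vocabulary size reported in the training log.
// "train.tgt.txt" is a placeholder path for the target-side training file.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

class CorpusVocabCount
{
    static void Main()
    {
        var distinctTokens = new HashSet<string>(
            File.ReadLines("train.tgt.txt")
                .SelectMany(line => line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries)));

        Console.WriteLine($"Distinct target-side tokens: {distinctTokens.Count}");
    }
}
```

The count I get from this does not match the 45000 reported in the log, which is what made me suspect an upper limit.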

zhongkaifu commented 1 year ago

Hi @clm33 ,

No, it doesn't have such a limitation. Can you please share your config file and log file so I can take a look?

clm33 commented 1 year ago

config.json.txt Seq2SeqConsole_Train_2023_02_24_22h_23m_40s.log

It is particularly the last line of the log that bothers me. It may be a mild issue, but I do not understand why the embedding matrix is always created with 45000 words, regardless of whether the corpus has more or fewer words.

Thanks for taking a look.

zhongkaifu commented 1 year ago

I just checked your log and found that it tried to load an existing model from 'C:/Users/User/Desktop/Carlos/Universidad/master/Segundo_curso/Practicas/TFM_final/carlos/Autorregresivo/embedding.model'. So your new training has to use the same vocabulary as that embedding.model; otherwise, your new training will have a vocabulary mismatch with your existing model.

For the pretraining and fine-tuning pattern, the fine-tuning stage should use the same vocabulary as the pretrained model.
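To illustrate the idea with a rough sketch (this is not the Seq2SeqSharp API; the file names and the <UNK> token are just placeholders): the vocabulary is fixed by the pretrained model, so any token in the new corpus that the pretrained vocabulary does not contain is treated as unknown rather than growing the 45000-entry embedding table.

```csharp
// Rough sketch only, not the Seq2SeqSharp API. It shows why the fine-tuning
// vocabulary comes from the pretrained model: tokens missing from that
// vocabulary map to <UNK> rather than enlarging the embedding matrix.
// "pretrained.vocab.txt" (one token per line) and "finetune.tgt.txt" are placeholders.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

class VocabConsistencyCheck
{
    static void Main()
    {
        var pretrainedVocab = new HashSet<string>(File.ReadLines("pretrained.vocab.txt"));

        var corpusTokens = new HashSet<string>(
            File.ReadLines("finetune.tgt.txt")
                .SelectMany(line => line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries)));

        // These tokens are unknown to the pretrained model and would train as <UNK>.
        var unknownTokens = corpusTokens.Except(pretrainedVocab).ToList();

        Console.WriteLine($"Pretrained vocabulary size: {pretrainedVocab.Count}");
        Console.WriteLine($"Fine-tuning tokens not in the pretrained vocabulary: {unknownTokens.Count}");
    }
}
```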

clm33 commented 1 year ago

You are right. That was probably the issue. A silly mistake.

Thanks a lot for your help. It is very kind of you.

zhongkaifu commented 1 year ago

You are welcome.