Open RK-BAKU opened 2 years ago
This is important! Any help on this one?
@RK-BAKU @mgh1
Hi, for me, I just load the model I saved and then keep training it:
model.load_model("t5", 'file/to/your/trained/model', use_gpu=True)
The rest is the same as for the initial training:
MAX_EPOCHS = 3
torch.cuda.memory_summary(device=None, abbreviated=False)
torch.utils.checkpoint
model.train(train_df=df[0:int(0.7 * TRAINNING_SIZE)], eval_df=df[int(0.7 * TRAINNING_SIZE):TRAINNING_SIZE], source_max_token_len=MAX_LEN, target_max_token_len=SUMMARY_LEN, batch_size=5, max_epochs=MAX_EPOCHS, outputdir='/content/gdrive/MyDrive/HW5_HL_gen/t5model', use_gpu=True)
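For anyone puzzled by the slicing in the model.train call: it is a 70/30 split of the first TRAINNING_SIZE rows, with the first 70% used for training and the remainder for evaluation. Here is a minimal sketch of the same arithmetic, using a plain list as a stand-in for the DataFrame. The helper name `split_bounds` is hypothetical and not part of any library.

```python
def split_bounds(total_rows, train_fraction=0.7):
    """Return (train_slice, eval_slice) covering the first total_rows rows."""
    # int() truncates, so the training part never exceeds the requested fraction.
    cut = int(train_fraction * total_rows)
    return slice(0, cut), slice(cut, total_rows)


rows = list(range(10))  # stand-in for df; the real code slices a DataFrame
train_slice, eval_slice = split_bounds(len(rows))
train_rows = rows[train_slice]  # first 70% of the rows
eval_rows = rows[eval_slice]    # remaining rows, no overlap with train_rows
```

Computing the cut index once also avoids repeating the same expression in both slices.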
How do you save the model?
There doesn't seem to be any method for saving it.
Hi guys! Is it possible to continue training from a specific checkpoint?
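One way to pick a checkpoint to resume from, sketched under an assumption the thread does not confirm: that the training run wrote one sub-directory per saved checkpoint under outputdir. The helpers `list_checkpoints` and `latest_checkpoint` are hypothetical names that use only the Python standard library.

```python
from pathlib import Path


def list_checkpoints(outputdir):
    """Return the sub-directories of outputdir, oldest first."""
    root = Path(outputdir)
    dirs = [p for p in root.iterdir() if p.is_dir()]
    # Sort by modification time, then by name to break ties deterministically.
    return sorted(dirs, key=lambda p: (p.stat().st_mtime, p.name))


def latest_checkpoint(outputdir):
    """Return the most recently modified sub-directory, or None if there is none."""
    checkpoints = list_checkpoints(outputdir)
    return checkpoints[-1] if checkpoints else None
```

The returned path could then be passed as the second argument to load_model, as in the reply above, before calling train again. To resume from a specific checkpoint rather than the latest, choose an entry from `list_checkpoints` by index or by name.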