Closed ziweizh24 closed 4 months ago
Do you get any warnings when you reload the model? (Set up logging if you haven't: `logging.basicConfig(level=logging.INFO)`)

Does it work as expected if you reload the model with Simple Transformers and use `model.predict()`?
> Do you get any warnings when you reload the model? (Set up logging if you haven't: `logging.basicConfig(level=logging.INFO)`) Does it work as expected if you reload the model with Simple Transformers and use `model.predict()`?
I did get the warning saying that not all weights are initialized when loading the model using `MarianMTModel.from_pretrained('outputs/best_model')`.

Could you say a bit more about how to reload the model (`PATH='outputs/best_model/'`) with Simple Transformers (I assume it will use `Seq2SeqModel`)? Is `Seq2SeqModel.from_pretrained(<PATH>)` supported?
To load with ST, you'd do:

```python
model = Seq2SeqModel(
    encoder_decoder_type="marian",
    encoder_decoder_name="<PATH>",
    args=model_args,
    use_cuda=True,
)
```
In theory, `Seq2SeqModel.from_pretrained(<PATH>)` is also supported, since ST uses a Hugging Face model under the hood. I don't remember the details exactly, but Marian encoder-decoder models may be a special case where this doesn't work (due to how the encoder and the decoder are set up).
I initialized and trained the following model:

After training, `model.predict(['this is a test'])` gives me the desired output. However, when I load the model back to make predictions, the output is off. Anything I missed?