sail-sg / lorahub

[COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
MIT License

train_model model load repaired #21

Open JornyWan opened 1 year ago

JornyWan commented 1 year ago

In the code `train_model.py`:

```python
model = AutoModelForSeq2SeqLM.from_pretrained(
    model_args.model_name_or_path,
    from_tf=bool(".ckpt" in model_args.model_name_or_path),
    config=config,
    cache_dir=model_args.cache_dir,
    revision=model_args.model_revision,
    use_auth_token=True if model_args.use_auth_token else None,
)
```

If anyone cannot use this to initialize a flan-t5 model from `AutoModelForSeq2SeqLM`, you need to add the following params: `unk_token=""`, `bos_token=""`, `eos_token=""`. Thanks!
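A minimal sketch of the suggested workaround as a helper that builds the extra kwargs. Note the actual token strings in the comment above were stripped by the issue renderer, so the values below are illustrative placeholders modeled on T5's conventional sentencepiece specials; also, these kwargs normally belong to `AutoTokenizer.from_pretrained` rather than the model load:

```python
def flan_t5_special_token_kwargs():
    """Extra kwargs to pass when loading a flan-t5 tokenizer whose special
    tokens are not picked up automatically.

    The values are assumptions (the originals in this thread were lost to
    the renderer); adjust them to match your checkpoint's vocabulary.
    """
    return {
        "unk_token": "<unk>",
        "bos_token": "<s>",
        "eos_token": "</s>",
    }

# Hypothetical usage (requires transformers installed):
# tokenizer = AutoTokenizer.from_pretrained(
#     model_args.model_name_or_path,
#     **flan_t5_special_token_kwargs(),
# )
```

Keeping the kwargs in one helper makes it easy to gate the workaround on the installed `transformers` version, since the loading behavior appears to differ across releases.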

SivilTaram commented 1 year ago

@JornyWan Thanks for your feedback and pull request! May I know your transformers version? I have never encountered the problem, not sure if it is the case for the latest transformers library.

JornyWan commented 1 year ago

@SivilTaram thanks for the quick response, my transformers version is 4.31.0

JornyWan commented 1 year ago

actually it would be like:

[screenshot attached]
JornyWan commented 1 year ago

@SivilTaram you could just put it in a separate branch for use with different transformers versions, if it turns out to be a version problem