OSU-STARLAB / Simul-LLM

[ACL 2024] An easily extensible framework for simultaneous, text-to-text neural machine translation (SimulMT) for LLMs.
MIT License

Issues in SimulMask example #13

Closed gpengzhi closed 3 weeks ago

gpengzhi commented 4 weeks ago

Thanks for the great work!

When I followed the instructions in the SimulMask example, I ran into the following error:

`finetune.py: error: unrecognized arguments: --weight-decay 0.1`

Is `weight-decay` missing in `trainer_wrapper.py`?

gpengzhi commented 4 weeks ago

BTW, are the training configurations shown in the SimulMask example exactly the same as those used in your paper (LoRA and quantization are not discussed in your paper)?

agostinv commented 3 weeks ago

> Thanks for the great work!
>
> When I followed the instructions in the SimulMask example, I ran into the following error:
>
> `finetune.py: error: unrecognized arguments: --weight-decay 0.1`
>
> Is `weight-decay` missing in `trainer_wrapper.py`?

Hey @gpengzhi!

Yes, it is probably just missing. I'll fix it really quick and add a few more options for fine-tuning.
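
For readers hitting the same error, the fix amounts to registering the flag with the argument parser before it is forwarded to the trainer. This is a minimal sketch, not the actual Simul-LLM code; the helper name `add_optimizer_args` and the default value are assumptions for illustration:

```python
import argparse

def add_optimizer_args(parser):
    # Hypothetical helper: registers the --weight-decay flag that
    # finetune.py rejected in the report above. argparse exposes it
    # as args.weight_decay (dashes become underscores).
    parser.add_argument(
        "--weight-decay",
        type=float,
        default=0.0,
        help="L2 weight decay forwarded to the trainer's TrainingArguments",
    )
    return parser

parser = add_optimizer_args(argparse.ArgumentParser())
args = parser.parse_args(["--weight-decay", "0.1"])
print(args.weight_decay)  # 0.1
```

Once parsed, the value still has to be passed into the `TrainingArguments` the wrapper constructs, which is presumably where the original omission was.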

> BTW, are the training configurations shown in the SimulMask example exactly the same as those used in your paper (LoRA and quantization are not discussed in your paper)?

Saw your other issue, @Meaffel should be able to assist with reproduction.

agostinv commented 3 weeks ago

I'll take a closer look at which remaining arguments are actually worth exposing interfaces for (some are clearly not useful or, worse, trigger buggy behavior), but that's outside the scope of this issue, and issue #14 covers the other comment here.

Closing the issue.

gpengzhi commented 3 weeks ago

Thanks!

BTW, parameters such as `num_train_epochs` and `save_strategy` should also be included in `self.training_arguments` of `LLMSimulSFTTrainerWrapper`.
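
To illustrate the suggestion, here is a minimal sketch of threading those two options from parsed CLI arguments into the keyword set used to build the trainer's configuration. This is not the actual `LLMSimulSFTTrainerWrapper` code; a plain dict stands in for `transformers.TrainingArguments` so the example has no dependencies, and the defaults shown are assumptions:

```python
# Hypothetical sketch of the wrapper's argument assembly: collect the
# options raised in this thread (weight decay, epoch count, checkpoint
# strategy) into one place before constructing TrainingArguments.
def build_training_arguments(cli_args):
    return {
        "output_dir": cli_args.get("output_dir", "./checkpoints"),
        "weight_decay": cli_args.get("weight_decay", 0.0),
        # The two options requested in this comment:
        "num_train_epochs": cli_args.get("num_train_epochs", 3),
        "save_strategy": cli_args.get("save_strategy", "epoch"),
    }

ta = build_training_arguments({"num_train_epochs": 5, "save_strategy": "steps"})
print(ta["num_train_epochs"], ta["save_strategy"])  # 5 steps
```

In the real wrapper, the resulting keywords would be unpacked into `transformers.TrainingArguments(**kwargs)` inside `self.training_arguments`.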