mikeda100 closed this issue 1 year ago
Thank you for your interest in this project. Switching from v1 to v2 does not require any data reprocessing. To maintain usability, the data format has not changed, so you only need to run the same command you quoted.
Hi there,
Thanks a lot for the excellent work on the V2.0 release.
Could you please tell me whether we need to re-process all data from scratch, since the data format has changed?
Which scripts should we run, and in what order?
That is, which data preparation steps (scripts) should we run before executing the following command?
accelerate launch --config_file configs/default_config.yaml train_lm.py --config configs/pretrain_config.yaml
Thanks again!
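Since the reply says no reprocessing is needed, the only inputs visible in the quoted command are the two config files. As a rough sketch (not part of the project), one could check that those files are present before relaunching. The helper name `check_configs` is hypothetical and introduced here only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the project): verify that every file
# passed as an argument exists. Prints "missing: <path>" and returns 1
# on the first absent file; prints "ok" and returns 0 otherwise.
check_configs() {
  for f in "$@"; do
    if [ ! -f "$f" ]; then
      echo "missing: $f"
      return 1
    fi
  done
  echo "ok"
}
```

It could then be chained in front of the command from the question, for example `check_configs configs/default_config.yaml configs/pretrain_config.yaml && accelerate launch --config_file configs/default_config.yaml train_lm.py --config configs/pretrain_config.yaml`, so that training starts only when both files are found.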