princeton-nlp / CoFiPruning

[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
MIT License

Prerequisites and the training process #46

Open oijt894 opened 1 year ago

oijt894 commented 1 year ago

I appreciate the great work, @xiamengzhou, but I'm afraid I can't fully understand the training process.

q1) Could you specify the versions of the required packages, e.g. datasets, transformers, etc.?

q2) Can I get the fine-tuned original BERT by running run_FT.sh with only 'proj_dir' specified?

xiamengzhou commented 1 year ago

1) Hi, it should be compatible with transformers==4.17.0 and datasets==1.14.0, but it might also work with later versions.

2) Yes, and you might want to customize your output_dir.
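For reference, a minimal environment sketch based on the versions confirmed above (the pins are the maintainer's; newer releases are untested):

```bash
# Pin the versions the maintainer confirmed as compatible;
# later versions may also work but are not guaranteed.
pip install transformers==4.17.0 datasets==1.14.0
```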

oijt894 commented 1 year ago

Thanks for the answer! BTW, from the issue history, it seems that someone has trained with this repo using DDP. I tried as well, and trained the original BERT (teacher model) with DDP successfully. However, during CoFi pruning I got the error "RuntimeError: Expected to mark a variable ready only once." Do you have any idea what causes this, or any tips for using multiple GPUs?
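For context (not a maintainer-confirmed fix): in PyTorch, this error commonly appears under DDP when find_unused_parameters is enabled and a parameter participates in the backward pass more than once, which shared pruning masks can trigger. A common workaround with the HF Trainer is to disable it explicitly. The sketch below is hedged: the script name run_glue_prune.py, the GPU count, and the remaining arguments are placeholders to adapt to your actual CoFi command.

```bash
# Hedged sketch of a common workaround, assuming the training script
# accepts standard HF TrainingArguments flags. Disabling
# find_unused_parameters avoids DDP marking a shared parameter ready twice.
python -m torch.distributed.launch --nproc_per_node=4 run_glue_prune.py \
    --ddp_find_unused_parameters False  # plus the usual CoFi arguments
```

If the model genuinely has parameters that receive no gradient in some steps, disabling this flag will raise a different DDP error instead, so it only helps when every parameter is used each step.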