wasiahmad / PLBART

Official code of our work, Unified Pre-training for Program Understanding and Generation [NAACL 2021].
https://arxiv.org/abs/2103.06333
MIT License

Processing only one language #40

Closed: CosmoLuminous closed this issue 2 years ago

CosmoLuminous commented 2 years ago

Hi @wasiahmad,

I have the following queries:

  1. Could you please guide me through the steps to pre-train the model for downstream tasks that require only one programming language (e.g., clone detection)? Does this library support this scenario, or do I need to make certain changes to accommodate it?

  2. Also, the number of GPUs appears to be hardcoded to 8 rather than taken as an input argument from the user. Will this work if we have only 1 or 2 GPUs on the system?

  3. What is the early stopping criterion for pre-training, and which hyperparameter controls it?

Thank you, Aman

wasiahmad commented 2 years ago

Hello,

  1. Yes, you need to make small changes. For example, you can use only your target language(s) during pretraining, as mentioned here. If you do not need a certain language, simply skip preparing its pretraining data.
  2. This is not true. You can always vary the number of GPUs (see here). Just make sure to adjust the other hyper-parameters to achieve the desired effective batch size.
  3. We used a fixed number of pre-training steps (100k). You can also use validation loss as a stopping criterion. A rough sketch covering all three points follows below.
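For concreteness, here is a minimal sketch of how the three points above might translate into a launch command, assuming the pretraining recipe wraps fairseq's `multilingual_denoising` task the way the repo's pretrain script does. The data path, the single-language list (`java`), and all hyper-parameter values below are illustrative placeholders, not the repository's actual settings.

```bash
# Hypothetical sketch: single-language pretraining on a 2-GPU machine.
# Prepare and binarize data only for the language you need (e.g. Java),
# then point fairseq at that directory.
DATA_DIR=/path/to/binarized/java-only-data

# Effective batch size = num_GPUs x max_tokens_per_GPU x update_freq.
# Going from 8 GPUs down to 2, raise --update-freq by 4x to keep the
# effective batch size (and hence the training dynamics) comparable.
# Architecture and optimizer settings here are illustrative only.
CUDA_VISIBLE_DEVICES=0,1 fairseq-train "$DATA_DIR" \
    --task multilingual_denoising \
    --langs java \
    --arch mbart_base \
    --optimizer adam --lr 3e-4 \
    --lr-scheduler polynomial_decay --total-num-update 100000 \
    --warmup-updates 5000 \
    --max-tokens 2048 \
    --update-freq 16 \
    --max-update 100000 \
    --patience 10
```

Here `--max-update 100000` mirrors the fixed 100k-step budget from point 3, while `--patience` (counted in validation runs without improvement) is one way to stop early on validation loss instead; drop it if you want to train for the full fixed budget.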

Thanks!

CosmoLuminous commented 2 years ago

Thank you for your response.

regards, Aman