chijames / KERPLE

Apache License 2.0

Which model weights do you use: gpt-neox slim weights or full weights? #4

Closed: vangogh0318 closed this issue 10 months ago

vangogh0318 commented 10 months ago

Thanks for sharing your great research.

To reproduce the experimental results of the KERPLE paper, I want to re-run the KERPLE code. I have a question: which model weights do you use, the gpt-neox slim weights or the full weights? Thank you.

chijames commented 10 months ago

Hi,

Thank you for your interest in our work!

We trained the model from scratch. You can refer to this section for details.

Thanks.

vangogh0318 commented 10 months ago

Thank you for your reply. Now I know how to get the Table 3 results of the paper.

You mentioned that you trained the model from scratch. I have another question: do you mean that I just download the dataset and run train.sh to train the model from scratch, without downloading a gpt-neox model? (README_gpt_neox.md says a neox model needs to be downloaded.)
Thanks very much.

chijames commented 10 months ago

Hi,

Sorry for the confusion. To reproduce the results in our paper, you can download the model checkpoints we released here and run test.sh.

vangogh0318 commented 10 months ago

Sorry, I did not express myself clearly.

  1. My first question is answered. I now know how to reproduce the Table 3 results of the paper: download the checkpoint and run test.sh.
  2. My second question is about running train.sh to train a model. Does this step require downloading a gpt-neox model? If so, which model weights should I download: the slim weights or the full weights?

Thank you very much.

chijames commented 10 months ago

Hi,

So if you want to train a new model from scratch, you do not need to download any model checkpoints from gpt-neox. You will instead initialize the model randomly and use your own GPUs to train the model. We were just re-using the gpt-neox training code for all the heavy lifting since it provides an efficient implementation.
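To make the distinction concrete, here is a minimal toy sketch in plain Python. It is not taken from the KERPLE or gpt-neox code: the function `build_parameters`, the parameter names, and the sizes are all hypothetical, chosen only to illustrate the two paths discussed in this thread. Training from scratch starts from random values and needs no checkpoint, while evaluation reuses released weights.

```python
import random
from typing import Dict, List, Optional


def build_parameters(
    shapes: Dict[str, int],
    checkpoint: Optional[Dict[str, List[float]]] = None,
    seed: int = 0,
    std: float = 0.02,
) -> Dict[str, List[float]]:
    """Return toy model parameters (hypothetical helper, for illustration only).

    If a checkpoint is given, copy its values (the evaluation path).
    Otherwise draw every value from a Gaussian (the from-scratch path).
    """
    if checkpoint is not None:
        # Evaluation path: reuse previously trained weights.
        return {name: list(values) for name, values in checkpoint.items()}
    # Training path: no download needed, weights start out random.
    rng = random.Random(seed)
    return {
        name: [rng.gauss(0.0, std) for _ in range(size)]
        for name, size in shapes.items()
    }


# Hypothetical toy sizes; a real model has far more parameters.
shapes = {"embedding": 8, "attention": 4}

# Training from scratch: random initialization, no checkpoint involved.
scratch = build_parameters(shapes)

# Evaluation: a stand-in dictionary plays the role of a released checkpoint.
released = {"embedding": [0.1] * 8, "attention": [0.2] * 4}
loaded = build_parameters(shapes, checkpoint=released)
```

In a real run, the training framework handles initialization and checkpoint loading; the sketch only shows that the two paths do not depend on each other.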

Thanks.

vangogh0318 commented 10 months ago

Thanks very much. KERPLE is great research.