openai / finetune-transformer-lm

Code and model for the paper "Improving Language Understanding by Generative Pre-Training"
https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
MIT License

Cannot reproduce this experiment #15

Closed · Polynomia closed this issue 6 years ago

Polynomia commented 6 years ago

Hi,

I tried to run this code to reproduce the reported accuracy of 85%, but after three runs I only get 53% accuracy each time. Could you share any tips on how to run the code or set the parameters? I didn't change any code and used the command from the README: `python train.py --dataset rocstories --desc rocstories --submit --analysis --data_dir [path to data here]`.

Thanks

Franck-Dernoncourt commented 6 years ago

I get an accuracy of ~85% on the ROC 2016 test set without changing any parameters aside from --n_gpu, which I set to 1 because I only wanted to use one GPU. (With --n_gpu 2 the default training batch size is 8x2=16, which caused out-of-memory errors on a single GTX 1080 with 8 GB of GPU memory. The default is n_gpu=4.) The full invocation I used is sketched below.
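
For reference, the single-GPU run would look something like this, assuming the same arguments as the README and only adding the `--n_gpu` flag mentioned above:

```
python train.py --dataset rocstories --desc rocstories --submit --analysis --data_dir [path to data here] --n_gpu 1
```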

Polynomia commented 6 years ago

Thank you so much for your reply. I got 87% after setting n_gpu to 1, so I think there may be a problem in the multi-GPU implementation. I ran the default n_gpu = 4 on a server with 8 Tesla K80s four times and only got 53% each time.

wykdg commented 6 years ago

I got 53% with Python 2.7 and TensorFlow 1.2, but 87% with Python 3.6 and TensorFlow 1.6. Maybe you should check your Python and TensorFlow versions.
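
If you want to confirm which interpreter and TensorFlow build are actually in use, a quick check like this (only standard `sys` and `tensorflow` attributes, nothing project-specific) can rule out an environment mix-up:

```python
import sys
import tensorflow as tf

# Only Python 3 is supported; TF 1.6 worked for the commenter above.
print("Python:", sys.version)
print("TensorFlow:", tf.__version__)
```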

Newmu commented 6 years ago

This looks to have been due to an implementation difference between Python 2 and Python 3. Only Python 3 is supported, but I've removed the code that was likely causing this issue, as discussed in #21, so Python 2 should work fine now.
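
For what it's worth, one well-known source of silent Python 2 vs 3 divergence is the `/` operator on integers. The sketch below is only an illustration of the kind of difference involved, with made-up variable names; it is not necessarily the exact code removed in #21:

```python
# Python 2: 3 / 2 == 1   (integer division when both operands are ints)
# Python 3: 3 / 2 == 1.5 (true division; use // for explicit floor division)
n_examples = 1497
n_batch = 16
n_updates = n_examples / n_batch  # 93.5625 on Python 3, truncated to 93 on Python 2
print(n_updates)
```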