thu-coai / CommonsenseStoryGen

Implementation for paper "A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation"

How to run on GPU? #1

Closed GenTxt closed 4 years ago

GenTxt commented 4 years ago

Thanks for this very interesting repo.

Training currently works following your instructions, using TensorFlow 1.15 on Ubuntu 18.04, but nvidia-smi reports that the python3 process is using only 81 MB of GPU memory. The GPU is the default, 0.

free -m shows 8 GB in use out of a possible 16 GB.

Using 0-th gpu ...
begin loading dataset......
loading ./data/roc ......
etc.

Initialize all the fine-tuning parameter.
Reading model parameters from ./model/gpt2 and initialize the parameters for fine-tuning.
Gen epoch 1 learning rate 0.0001 epoch-time 27370.0307:
PPL on training set: 8.787691
PPL on validation set: 8.077976
PPL on testing set: 8.077976
saving parameters in ./model/gpt2
etc.

I'm familiar with other tensorflow fine-tuning repos and they all access gpu 0 without issue. Is there a flag I'm missing?

I have added --gpu 0 to the commands, but memory use is the same.

Any suggestions are appreciated.

Cheers.

GenTxt commented 4 years ago

Problem solved. It was a broken tf 1.15 installation.

tf.test.is_gpu_available()

Fixed all the errors and it is now running 100% on the GPU.

Thanks
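
For anyone hitting the same symptom (training runs, but nvidia-smi shows almost no GPU memory in use), here is a minimal sketch of a slightly fuller version of that check. The function name gpu_report is made up for illustration. It lists devices through device_lib rather than calling tf.test.is_gpu_available(), and it assumes nothing about this repo's own code.

```python
def gpu_report():
    """Summarise whether TensorFlow imports cleanly and which GPUs it can see.

    Hypothetical helper, not part of CommonsenseStoryGen.
    """
    report = {
        "tf_version": None,        # stays None if TensorFlow cannot be imported
        "built_with_cuda": False,  # False for CPU-only builds
        "gpu_devices": [],         # e.g. ["/device:GPU:0"] when a GPU is usable
        "error": None,             # repr of the exception, if anything failed
    }
    try:
        import tensorflow as tf
        from tensorflow.python.client import device_lib

        report["tf_version"] = tf.__version__
        report["built_with_cuda"] = bool(tf.test.is_built_with_cuda())
        # Keep only real GPU devices; CPU entries are filtered out.
        report["gpu_devices"] = [
            d.name for d in device_lib.list_local_devices()
            if d.device_type == "GPU"
        ]
    except Exception as exc:  # a broken install can fail in several ways
        report["error"] = repr(exc)
    return report


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print("{}: {}".format(key, value))
```

An empty gpu_devices list with built_with_cuda set to True usually points at the driver or CUDA libraries rather than at the training script.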

1245244103 commented 3 years ago

Hi, can you tell me which versions of tf and CUDA you used? I installed tf 1.12.0 and CUDA 10.1, but the code throws some errors when it runs.

GenTxt commented 3 years ago

I believe it was tf 1.15 GPU, from 2020.

Haven't used that repo since

Good luck
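
Since the question was about matching tf and CUDA versions, here is a rough sketch of a lookup for the pairing. The table is written from memory of the "tested build configurations" list in the TensorFlow install docs for the official pip GPU wheels, so treat the entries as an assumption and check them against that page. The names TESTED_BUILDS and check_pair are made up for illustration.

```python
# TensorFlow release -> (CUDA, cuDNN) the official GPU wheels were built against.
# Assumed from the TensorFlow "tested build configurations" table; verify there.
TESTED_BUILDS = {
    "1.12": ("9.0", "7"),
    "1.13": ("10.0", "7.4"),
    "1.14": ("10.0", "7.4"),
    "1.15": ("10.0", "7.4"),
}


def major_minor(version):
    """Reduce '1.12.0' to '1.12' and '10' to '10.0'."""
    parts = version.split(".")
    return ".".join((parts + ["0"])[:2])


def check_pair(tf_version, cuda_version):
    """Say whether an installed CUDA version matches the one the wheel expects."""
    expected = TESTED_BUILDS.get(major_minor(tf_version))
    if expected is None:
        return "unknown: no entry for tf {}".format(tf_version)
    cuda, cudnn = expected
    if major_minor(cuda_version) == cuda:
        return "ok: tf {} expects CUDA {} (cuDNN {})".format(tf_version, cuda, cudnn)
    return "mismatch: tf {} expects CUDA {} (cuDNN {}), found CUDA {}".format(
        tf_version, cuda, cudnn, cuda_version)


if __name__ == "__main__":
    print(check_pair("1.12.0", "10.1"))  # the combination asked about above
    print(check_pair("1.15.0", "10.0"))
```

Under that table, tf 1.12.0 with CUDA 10.1 would be a mismatch, which would be one possible source of the errors mentioned.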

1245244103 commented 3 years ago

Thanks a lot! I have run the code successfully. However, do you remember how many epochs you trained for? I trained for 1 epoch on the kg and 3 epochs on the multi_roc. After that, it seemed to overfit.

GenTxt commented 3 years ago

Not more than 1 or 2, for the same reason.
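
One common way to decide when to stop instead of fixing the epoch count is to watch the validation PPL that the training log already prints and stop once it no longer improves. The sketch below is generic early stopping, not something this repo implements as far as this thread shows. The function name best_epoch and the PPL values in the example are made up for illustration.

```python
def best_epoch(val_ppl_per_epoch, patience=1):
    """Return (best_epoch, stop_epoch), both 1-based.

    best_epoch is the epoch with the lowest validation perplexity seen so far;
    stop_epoch is where training would halt after `patience` epochs in a row
    without improvement (or the last epoch if that never happens).
    """
    best_ppl = float("inf")
    best = 0
    epochs_without_improvement = 0
    for epoch, ppl in enumerate(val_ppl_per_epoch, start=1):
        if ppl < best_ppl:
            best_ppl, best = ppl, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best, epoch
    return best, len(val_ppl_per_epoch)


if __name__ == "__main__":
    # Invented validation PPL values, one per epoch, for illustration only.
    print(best_epoch([8.08, 7.95, 8.10, 8.40]))
```

With the invented values above, the checkpoint from epoch 2 would be kept and training would stop at epoch 3, which is in line with the 1 or 2 epochs mentioned.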

1245244103 commented 3 years ago

Thank you for your answer!