nouhadziri / THRED

The implementation of the paper "Augmenting Neural Response Generation with Context-Aware Topical Attention"
https://arxiv.org/abs/1811.01063
MIT License

Train-time #14

Closed xjywhu closed 5 years ago

xjywhu commented 5 years ago

We are using a dataset with 5 turns per line and set the number of GPUs to 2, but the command-line output reports 0 GPUs, and training is very slow. We estimate it will take one month to finish training the model. Is this training speed normal, and how can we accelerate it?

(screenshot attached)

xjywhu commented 5 years ago

Can the model support Chinese?

ehsk commented 5 years ago

Thanks for your interest in our work.

Based on the screenshot you sent, it seems you have installed the CPU-only version of TensorFlow, because no GPUs are detected in the list of devices. You can verify this by running pip freeze (or conda list if you're using Anaconda) and checking which TensorFlow package is installed. You can also list the devices from a Python shell:

# Enumerate the devices TensorFlow can see; GPUs appear with device_type "GPU"
from tensorflow.python.client import device_lib
device_lib.list_local_devices()

The above function returns all the visible devices. Please make sure that the GPUs are listed in the output.
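
As a quick sanity check, a minimal sketch like the following (assuming the TensorFlow 1.x API, which was current for this repository) prints the detected GPUs and points to the likely fix when only the CPU is visible:

from tensorflow.python.client import device_lib

# List all visible devices and collect the names of those that are GPUs.
devices = device_lib.list_local_devices()
gpus = [d.name for d in devices if d.device_type == "GPU"]

if gpus:
    print("Detected GPUs:", gpus)
else:
    # An empty list usually means the CPU-only TensorFlow build is installed.
    print("No GPUs detected; TensorFlow is likely the CPU-only build.")
    print("Try reinstalling with: pip install tensorflow-gpu")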

The training on the 3-turn data takes about 2 weeks to complete using 2 GPUs.

All models presented here are language-independent. As long as the data follows the expected format, any dataset can be fed to this framework.
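
For illustration, here is a minimal sketch of reading such a multi-turn file, assuming each line is one conversation with turns separated by TAB characters (the filename train.txt and the TAB delimiter are assumptions here; please check the repository's sample data for the exact format):

# Hypothetical example: split each line into context turns and a response,
# assuming one conversation per line with TAB-separated turns.
with open("train.txt", encoding="utf-8") as f:
    for line in f:
        turns = line.rstrip("\n").split("\t")
        context, response = turns[:-1], turns[-1]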

xjywhu commented 5 years ago

Thank you for answering my question. I also want to ask when the pre-trained models will be released on GitHub.

ehsk commented 5 years ago

We will release the pre-trained models very soon. Stay tuned!