awasthiabhijeet / PIE

Fast + Non-Autoregressive Grammatical Error Correction using BERT. Code and Pre-trained models for paper "Parallel Iterative Edit Models for Local Sequence Transduction": www.aclweb.org/anthology/D19-1435.pdf (EMNLP-IJCNLP 2019)

Can it use multiple GPUs for training? #8

Open wangwang110 opened 4 years ago

Serenade-J commented 4 years ago

Yes.
My approach uses tf.contrib.distribute from tensorflow-gpu 1.13. I ran into some problems with this method and spent several days before finally training PIE on multiple GPUs successfully, so feel free to use other methods if you find them more convenient. Below is part of my code in word_edit_model.py; copying it as-is may introduce bugs, because it is not the complete set of changes I made to PIE.

import tensorflow as tf

from tensorflow.python.estimator.run_config import RunConfig
from tensorflow.python.estimator.estimator import Estimator
from tensorflow.contrib.distribute import AllReduceCrossDeviceOps
# ...
# Mirror the model across FLAGS.n_gpus GPUs and synchronize gradients
# with NCCL all-reduce.
dist_strategy = tf.contrib.distribute.MirroredStrategy(
    num_gpus=FLAGS.n_gpus,
    cross_device_ops=AllReduceCrossDeviceOps('nccl', num_packs=FLAGS.n_gpus),
    # cross_device_ops=AllReduceCrossDeviceOps('hierarchical_copy'),
)
session_config = tf.ConfigProto(
    inter_op_parallelism_threads=0,
    intra_op_parallelism_threads=0,
    allow_soft_placement=True,
    gpu_options=tf.GPUOptions(allow_growth=True))

# Hand the distribution strategy to the Estimator through RunConfig so
# both training and evaluation run on all GPUs.
run_config = RunConfig(
    train_distribute=dist_strategy,
    eval_distribute=dist_strategy,
    model_dir=FLAGS.output_dir,
    session_config=session_config,
    save_checkpoints_steps=FLAGS.save_checkpoints_steps,
    keep_checkpoint_max=15,
)
Serenade-J commented 4 years ago

All the documents I referred to (for training PIE on multiple GPUs) can be found online.

binhetech commented 4 years ago

> All the documents I referred to (for training PIE on multiple GPUs) can be found online.

Hi, could you share all the code you changed in word_edit_model.py? Thanks.