alvations opened this issue 5 years ago
We didn't try to train the model with multiple GPUs. You may need to rewrite the code for the `LinearDropConnect` function.

Another question: there seems to be no speedup using the GPU compared with the CPU. Have you met the same problem? Both take 280-290 s per epoch.
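For context on why a rewrite would be needed: `nn.DataParallel` re-replicates the module on every forward call, so any layer state set outside `forward()` (such as a DropConnect mask sampled once per batch) is lost on the replicas. Below is a minimal, hypothetical sketch of a replica-safe variant that samples the mask inside `forward()`; it is not the repo's `LinearDropConnect`, just an illustration of the idea.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DataParallelDropConnectLinear(nn.Linear):
    """Hypothetical DropConnect linear layer that survives nn.DataParallel.

    The mask is sampled inside forward(), so each replica draws its own
    mask and no state needs to persist between calls.
    """
    def __init__(self, in_features, out_features, bias=True, dropconnect=0.5):
        super().__init__(in_features, out_features, bias=bias)
        self.dropconnect = dropconnect

    def forward(self, input):
        if self.training and self.dropconnect > 0.0:
            # Zero out a random subset of weights for this forward pass only.
            keep_prob = 1.0 - self.dropconnect
            mask = torch.bernoulli(torch.full_like(self.weight, keep_prob))
            weight = self.weight * mask
        else:
            # At eval time, rescale to the expected training-time weight.
            weight = self.weight * (1.0 - self.dropconnect)
        return F.linear(input, weight, self.bias)
```

Note that the actual ON-LSTM code samples one mask per batch and reuses it across time steps; a replica-safe version of that behavior would have to pass the sampled mask into `forward()` as an argument rather than caching it on the layer.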
@yikangshen @shawntan Is there an easy way to train the model with main.py on multiple GPUs to replicate the experiments?

When using `model = nn.DataParallel(model)` before `train()`, the initialization goes into the LSTM stack and then into the ON-LSTM cell to return the weights, but it throws an error. We also tried doing the `model = nn.DataParallel(model)` after the `hidden = model.init_hidden(args.batch_size)`, and it seems the `LinearDropConnect` layer can't access the `.weight` tensors.
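One likely contributor to the error: `nn.DataParallel` only dispatches `forward()`, so custom methods such as `init_hidden()` must be called on the underlying module via `.module`. A hedged sketch of how the setup in main.py might be adapted (`model` and `args` stand in for the repo's objects):

```python
import torch
import torch.nn as nn

# model = RNNModel(...)  # the ON-LSTM language model built in main.py
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # replicate across all visible GPUs
model = model.cuda()

# DataParallel only forwards __call__/forward(); custom methods such as
# init_hidden() still live on the wrapped module.
core = model.module if isinstance(model, nn.DataParallel) else model
hidden = core.init_hidden(args.batch_size)
```

Even with this fix, the stateful mask sampling in `LinearDropConnect` would still need the replica-safe treatment sketched above.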
Hi, just wanted to know: have you figured this out? Best