Closed: xuehy closed this issue 7 years ago
Hi @xuehy Extremely sorry for the late response! I somehow missed this issue. To answer your question, it would be non-trivial for me to support dynamic batching for tree-LSTMs. This type of model, i.e. a recursive neural network, has a different structure for each input sample, and it is extremely difficult (almost impossible) to find samples with similar enough structures that we can batch them together. However, all is not lost, as there are a few options that can be explored:
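To make the structure-mismatch point concrete, here is a minimal sketch (not from this repo) of the naive batching idea: bucket samples whose trees are isomorphic so each bucket could be processed as one batch. The nested-tuple tree encoding and the helper names are my own assumptions for illustration; for realistic parse trees the buckets end up tiny, which is exactly why this approach rarely helps.

```python
from collections import defaultdict

def tree_shape(tree):
    """Return a hashable signature of a tree's structure, ignoring leaf values.
    Trees are assumed (hypothetically) to be nested tuples, e.g. ((a, b), c)."""
    if not isinstance(tree, tuple):
        return "leaf"
    return tuple(tree_shape(child) for child in tree)

def group_by_shape(samples):
    """Bucket samples with isomorphic trees; each bucket could be one batch.
    For natural-language parse trees, most buckets contain a single sample."""
    buckets = defaultdict(list)
    for tree in samples:
        buckets[tree_shape(tree)].append(tree)
    return buckets

samples = [((1, 2), 3), ((4, 5), 6), (7, (8, 9))]
buckets = group_by_shape(samples)
# the first two samples share a shape and can be batched; the third cannot
```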
SPINN: This model enables a hybrid tree-sequence architecture, blending the otherwise separate paradigms of recursive and recurrent neural networks. See the links to the original blog post and to the PyTorch sample code.
TensorFlow Fold: The first couple of paragraphs in the README are pretty self-explanatory.
Hogwild Training: I do have some plans to make it possible to train this code on multiple GPUs/CPUs using some sort of Hogwild scheme. There is a PyTorch example available, which you can try to adapt.
Thanks for your detailed explanations! I am following SPINN's idea to make the network partially batched.
This implementation can only process one sample at a time, so performance is limited because GPU utilization is low. Is there any possibility of making treelstm support dynamic batching so that the GPU can be fully utilized?