matt-gardner opened this issue 7 years ago
With the batch parallelism PR merged, I'm renaming this issue to focus on the one remaining thing: I believe models can already use model parallelism if you want it, by using device scopes. Making sure this works and providing some documentation for it would be nice, but it's not super high priority.
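For reference, a minimal sketch of what "model parallelism via device scopes" could look like in TensorFlow/Keras. The two-layer model and the device names are just illustrative (they assume a machine with two GPUs), not anything from our codebase:

```python
import tensorflow as tf

# Hypothetical model split across two GPUs using device scopes.
# Variables and ops created inside a tf.device scope are placed on
# that device, so each half of the model lives on a different GPU.
inputs = tf.keras.Input(shape=(128,))

with tf.device('/gpu:0'):
    hidden = tf.keras.layers.Dense(512, activation='relu')(inputs)

with tf.device('/gpu:1'):
    outputs = tf.keras.layers.Dense(10, activation='softmax')(hidden)

model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy')
```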
I think the more important aspect of parallelism still left is getting it to work with the various types of data generators/padding stuff we have, rather than model parallelism, but yeah, in general it would be nice to double-check that this works as smoothly as it could.
Agreed, hence the P2.
Now that we've dropped Theano support, it should be easy to make our models use multiple GPUs, not just with batch parallelism, and to put some parts of the model on the CPU (e.g., the embedding layer, as recommended by Matt Peters; there's a sketch of this below the list). I think this is pretty straightforward, but I haven't done it before. We should:
- `TextTrainer`.
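For the CPU-embedding idea above, here's a minimal sketch of what the device placement might look like in TensorFlow/Keras. The vocabulary size, layer sizes, and variable names are placeholders for illustration, not values from this repo:

```python
import tensorflow as tf

# Illustrative sizes, not from the codebase.
vocab_size, embedding_dim = 100_000, 300

token_ids = tf.keras.Input(shape=(None,), dtype='int32')

with tf.device('/cpu:0'):
    # The (large) embedding variables stay in host memory; only the
    # looked-up vectors for each batch get copied to the GPU.
    embedded = tf.keras.layers.Embedding(vocab_size, embedding_dim)(token_ids)

with tf.device('/gpu:0'):
    # The rest of the model runs on the GPU as usual.
    encoded = tf.keras.layers.LSTM(256)(embedded)
    logits = tf.keras.layers.Dense(2)(encoded)

model = tf.keras.Model(inputs=token_ids, outputs=logits)
```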