allenai / deep_qa

A deep NLP library, based on Keras / tf, focused on question answering (but useful for other NLP too)
Apache License 2.0

Adaptive batch sizes #322

Closed matt-gardner closed 7 years ago

matt-gardner commented 7 years ago

Because we're doing dynamic padding, it makes sense to vary the batch size to use GPU memory optimally: when the padding lengths are small, we can increase the batch size, and when they're large, we can decrease it.

The simplest thing to do is to expose a method to subclasses that lets them split the data into batches after sorting. The concrete model class can then specify some heuristic for how many instances will fit in a batch based on how large they are (see the sketch below).
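Here's a minimal sketch of what that hook could look like. None of these names (`create_batches`, `_max_batch_size`, `TOKEN_BUDGET`) are part of the deep_qa API; the token-budget heuristic is just one plausible way a subclass might trade batch size against padding length:

```python
from typing import List

Instance = List[str]  # toy stand-in for the library's Instance objects


class DataGenerator:
    """Sorts instances by padding cost, then asks a model-specific
    heuristic how many of them fit in each batch."""

    def create_batches(self, instances: List[Instance]) -> List[List[Instance]]:
        # Sort so that similarly-sized instances end up in the same batch.
        instances = sorted(instances, key=self._padding_cost)
        batches: List[List[Instance]] = []
        current: List[Instance] = []
        for instance in instances:
            # Flush the batch if adding this instance would exceed the
            # size the heuristic allows for instances this large.
            if current and len(current) + 1 > self._max_batch_size(current + [instance]):
                batches.append(current)
                current = []
            current.append(instance)
        if current:
            batches.append(current)
        return batches

    def _padding_cost(self, instance: Instance) -> int:
        return len(instance)

    def _max_batch_size(self, batch: List[Instance]) -> int:
        raise NotImplementedError


class TokenBudgetGenerator(DataGenerator):
    # Hypothetical heuristic: keep (batch size x longest instance) roughly
    # constant, so the padded tensor size, and hence GPU memory use,
    # stays roughly constant across batches.
    TOKEN_BUDGET = 800  # tuned by hand for a given model and GPU

    def _max_batch_size(self, batch: List[Instance]) -> int:
        longest = max(self._padding_cost(i) for i in batch)
        return max(1, self.TOKEN_BUDGET // longest)


# Short instances land in large batches, long ones in small batches.
data = [["tok"] * n for n in [5, 6, 7, 150, 160, 400, 450]]
for batch in TokenBudgetGenerator().create_batches(data):
    print(len(batch), max(len(i) for i in batch))
# -> 5 160 / 1 400 / 1 450
```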

A much more exciting thing to do, but also probably close to impossible, is to have the library just figure out how many instances can go in each batch, by examining the computation graph or something similar. I'm not at all sure how to do this.

matt-gardner commented 7 years ago

#332