allenai / deep_qa

A deep NLP library, based on Keras / TensorFlow, focused on question answering (but useful for other NLP too)
Apache License 2.0

Allow for non-Keras optimizers #315

Closed: matt-gardner closed this issue 7 years ago

matt-gardner commented 7 years ago

Using plain TensorFlow optimizers instead of Keras optimizers is beneficial in some instances (particularly when you have a large embedding matrix, as Matt Peters has discovered). We should split the actual training loop out into something configurable, to allow for different ways of optimizing the same computation graph. You can still use _build_model just like we normally do; you just pull out the inputs and outputs and pass them directly to a TensorFlow optimizer instead of calling model.fit(). A sketch of what that might look like is below.
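As a minimal sketch (not existing deep_qa code), assuming the 1.x-era TensorFlow and Keras APIs this repo targets, driving a Keras-built model with a raw tf.train optimizer could look roughly like this; the toy model, data, and names are made up for illustration:

```python
import numpy as np
import tensorflow as tf
from keras import backend as K
from keras.layers import Dense, Input
from keras.models import Model

# Build the computation graph with Keras as usual (a stand-in for _build_model).
model_input = Input(shape=(10,))
model_output = Dense(1, activation='sigmoid')(model_input)
model = Model(inputs=model_input, outputs=model_output)

# Pull the symbolic tensors out of the Keras model and wire them to a plain
# TensorFlow loss and optimizer, bypassing model.compile() / model.fit().
labels = tf.placeholder(tf.float32, shape=(None, 1))
loss = tf.losses.log_loss(labels, model.output)
train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)

# Toy arrays standing in for whatever the data processing code produces.
x_train = np.random.rand(32, 10).astype('float32')
y_train = np.random.randint(0, 2, size=(32, 1)).astype('float32')

# Run the training loop ourselves in Keras's session.  (For models with
# dropout or batch norm, you would also feed K.learning_phase().)
session = K.get_session()
session.run(tf.global_variables_initializer())
for epoch in range(5):
    _, loss_value = session.run(
        [train_op, loss],
        feed_dict={model.input: x_train, labels: y_train})
    print('epoch %d: loss %.4f' % (epoch, loss_value))
```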

matt-gardner commented 7 years ago

Note that if we do this, you could also conceivably use something like PyTorch to build your model, and just use a PyTorch optimizer. The only real benefit there would be reusing the data processing code, though: you couldn't use any of the layers we have, or the higher-level TextTrainer API. A rough sketch of that scenario is below.
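For concreteness, a hypothetical sketch of the PyTorch variant (toy model and data again made up; in practice the arrays would come from the existing data processing code, the only piece that carries over):

```python
import numpy as np
import torch

# Toy arrays standing in for the output of the data processing code.
x_train = torch.from_numpy(np.random.rand(32, 10).astype('float32'))
y_train = torch.from_numpy(np.random.randint(0, 2, (32, 1)).astype('float32'))

# The model has to be rebuilt from scratch in PyTorch; none of the Keras
# layers or the TextTrainer API apply here.
model = torch.nn.Sequential(torch.nn.Linear(10, 1), torch.nn.Sigmoid())
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
loss_fn = torch.nn.BCELoss()

for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(x_train), y_train)
    loss.backward()
    optimizer.step()
    print('epoch %d: loss %.4f' % (epoch, loss.item()))
```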