google-research / bert

TensorFlow code and pre-trained models for BERT
https://arxiv.org/abs/1810.04805
Apache License 2.0

Scaling extract_features #397

Open simra opened 5 years ago

simra commented 5 years ago

I'm looking for a few pointers on how to efficiently scale up extract_features. Unlike with training, there isn't much information out there on distributed prediction. I'd like to try one or more of the following:

  1. Specify the GPU device to use, so I can scale out via multiple processes on a multi-GPU machine. It's not clear to me how to specify the GPU device via the estimator config.
  2. Better, automatically scale out estimator.predict() to utilize all available GPU devices.
  3. Re-use partial inputs. For example, if I'm featurizing query-passage pairs, pre-load the query portion of the input tensor while iterating through passages.
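For points 1 and 2, a common workaround is to skip multi-GPU support inside the Estimator entirely and instead run one `extract_features.py` process per GPU, pinning each process to a single device with `CUDA_VISIBLE_DEVICES`. A minimal sketch (all file paths and flag values below are placeholders, not taken from this issue):

```python
# Sketch: shard the input file and launch one extract_features.py process per
# GPU, each pinned to a single device via CUDA_VISIBLE_DEVICES.
import os
import subprocess


def shard(lines, num_shards):
    """Split input lines round-robin into num_shards roughly equal pieces."""
    return [lines[i::num_shards] for i in range(num_shards)]


def launch(num_gpus, input_file, output_prefix):
    with open(input_file) as f:
        lines = f.read().splitlines()
    procs = []
    for gpu, piece in enumerate(shard(lines, num_gpus)):
        shard_in = "%s.shard%d.txt" % (input_file, gpu)
        with open(shard_in, "w") as f:
            f.write("\n".join(piece))
        # Pin this worker to one GPU; TensorFlow then sees only that device.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        procs.append(subprocess.Popen(
            ["python", "extract_features.py",
             "--input_file=" + shard_in,
             "--output_file=%s.%d.jsonl" % (output_prefix, gpu),
             "--vocab_file=vocab.txt",              # placeholder paths
             "--bert_config_file=bert_config.json",
             "--init_checkpoint=bert_model.ckpt",
             "--layers=-1",
             "--batch_size=32"],
            env=env))
    for p in procs:
        p.wait()
```

If you'd rather stay inside a single process, you can also restrict the devices a TF 1.x Estimator sees by passing a `session_config` to `tf.estimator.RunConfig` with `gpu_options.visible_device_list` set, but the multi-process approach above is usually simpler for batch prediction.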

Any suggestions or pointers would be most appreciated. Feel free to redirect me to Stack Overflow if that's a better venue for these questions.
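For point 3, one honest caveat: because self-attention mixes the two segments, the query's hidden states can't simply be precomputed and reused across passages. What you *can* reuse is the query's tokenization. A sketch of caching tokenized queries (the helper name and packing below are illustrative; the tokenizer is assumed to follow the repo's `tokenization.FullTokenizer` interface):

```python
# Sketch: tokenize each distinct query once and reuse the tokens across all
# passages it is paired with. Only tokenization is saved, not the forward pass.
import functools


def make_pair_features(tokenizer, max_seq_length=128):
    @functools.lru_cache(maxsize=None)
    def query_tokens(query):
        # Computed once per distinct query, then served from the cache.
        return tuple(tokenizer.tokenize(query))

    def features(query, passage):
        tokens_a = list(query_tokens(query))
        tokens_b = tokenizer.tokenize(passage)
        # Standard BERT packing: [CLS] query [SEP] passage [SEP]
        tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
        return tokens[:max_seq_length]

    return features
```

In practice tokenization is a small fraction of total cost, so the GPU-side sharding above is where most of the speedup comes from.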

hanxiao commented 5 years ago

Maybe you are looking for bert-as-service (https://github.com/hanxiao/bert-as-service/). It's a highly scalable feature-extraction service based on BERT.
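A minimal client sketch, assuming the server has already been started per the bert-as-service README (e.g. `bert-serving-start -model_dir /path/to/bert -num_worker=2` after `pip install bert-serving-server bert-serving-client`):

```python
# Sketch: encode sentences via a running bert-as-service server.
def encode_sentences(sentences):
    # Import kept local so this sketch reads without the package installed.
    from bert_serving.client import BertClient
    bc = BertClient()  # connects to localhost on the default ports
    # Returns a numpy array with one fixed-size vector per input sentence.
    return bc.encode(sentences)
```

The server handles batching and multi-GPU scheduling across workers, which addresses points 1 and 2 without touching the Estimator API.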

simra commented 5 years ago

Thanks, I will try this.