tensorflow / decision-forests

A collection of state-of-the-art algorithms for the training, serving and interpretation of Decision Forest models in Keras.
Apache License 2.0

Please support threads or processes #39

Closed Howard-ll closed 3 years ago

Howard-ll commented 3 years ago

Background: If GPU support is difficult (and will take a long time), multiple threads or processes could speed up inference as well.

Feature request: Could you support a parameter like n_jobs?


achoum commented 3 years ago

Hi,

Thanks for the ping and feature request.

Regarding multi-threading

The training code of the RF and GBT is already multi-threaded. The number of threads is 6 by default.

This number of threads can be configured with the num_threads field in the deployment spec of the advanced arguments.
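For reference, a minimal sketch of that path, assuming the AdvancedArguments wrapper and the Yggdrasil DeploymentConfig proto bundled with TF-DF (the exact field and module names below are assumptions, not taken from this thread):

import tensorflow_decision_forests as tfdf

# Assumption: the Yggdrasil learner proto shipped with TF-DF exposes
# DeploymentConfig.num_threads.
from yggdrasil_decision_forests.learner import abstract_learner_pb2

deployment = abstract_learner_pb2.DeploymentConfig(num_threads=12)

# Assumption: AdvancedArguments accepts a yggdrasil_deployment_config field
# and is passed to the model constructor through advanced_arguments.
model = tfdf.keras.RandomForestModel(
    advanced_arguments=tfdf.keras.AdvancedArguments(
        yggdrasil_deployment_config=deployment))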

However, this feels like an obscure parameter, so I'll make sure to surface it.

Feature request: Move the num_threads argument to the model's object constructor.

Multi-process

Multi-process training (e.g. training on hundreds of machines) will be released in mid Q3 (probably around the end of September).

Cheers,

achoum commented 3 years ago

Training multi-threading is now available with the num_threads model constructor argument, e.g.:

model = tfdf.keras.GradientBoostedTreesModel(num_threads=40)
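For context, a minimal end-to-end sketch of training with this argument; the DataFrame columns below are made up for illustration, and pd_dataframe_to_tf_dataset is the usual TF-DF conversion helper:

import pandas as pd
import tensorflow_decision_forests as tfdf

# Toy training data; replace with your own DataFrame.
train_df = pd.DataFrame({
    "f1": [0.1, 0.5, 0.3, 0.9],
    "f2": [1, 0, 1, 1],
    "label": [0, 1, 0, 1],
})

# Convert the DataFrame into a TensorFlow dataset.
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(train_df, label="label")

# Train with 40 threads via the new constructor argument.
model = tfdf.keras.GradientBoostedTreesModel(num_threads=40)
model.fit(train_ds)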

Inference is still done in a single thread.

Note that the inference code is thread safe. Therefore, you can call it from multiple threads at the same time.
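For example, a rough sketch of fanning out inference with a thread pool; the feature batches are placeholders reusing the toy columns from the sketch above, and model is assumed to be a trained TF-DF model:

from concurrent.futures import ThreadPoolExecutor

import numpy as np

# Placeholder feature batches; in practice these would be your real inputs.
shards = [
    {"f1": np.array([0.1, 0.5]), "f2": np.array([1, 0])},
    {"f1": np.array([0.3, 0.9]), "f2": np.array([1, 1])},
]

def predict_shard(shard):
    # Each call runs single-threaded inference, but the calls can run
    # concurrently because the inference code is thread safe.
    return model.predict(shard)

with ThreadPoolExecutor(max_workers=2) as pool:
    predictions = list(pool.map(predict_shard, shards))

all_predictions = np.concatenate(predictions, axis=0)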