The current style is to store the models as private attributes of a class in __init__ and reuse them in its methods. This causes trouble: the scheduler frequently locks up, and runs fail even for moderately sized datasets.
The current hack is to read the model from disk on every call, but it's inefficient and makes the program slow.
Dask's recommended strategy is to wrap the model in dask.delayed once and pass the resulting object to the function on each call. Will need to try that and properly benchmark compute.
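A minimal sketch of the delayed approach, not yet benchmarked. DummyModel, load_model and predict_chunk are hypothetical stand-ins for the real model class, loader and per-chunk function; only dask.delayed and dask.compute are actual Dask calls.

```python
import dask
from dask import delayed


class DummyModel:
    """Hypothetical stand-in for a large model normally read from disk."""

    def __init__(self, scale):
        self.scale = scale

    def predict(self, rows):
        return [self.scale * x for x in rows]


def load_model():
    # Hypothetical loader; a real one would deserialize the model from disk.
    return DummyModel(scale=2)


def predict_chunk(model, chunk):
    # Dask resolves the Delayed argument, so this receives the real model.
    return model.predict(chunk)


# Load once, then wrap once: the model becomes a single node in the task
# graph instead of being embedded separately in every call.
model = delayed(load_model())

chunks = [[1, 2], [3, 4], [5, 6]]
tasks = [delayed(predict_chunk)(model, chunk) for chunk in chunks]

# The synchronous scheduler keeps this sketch simple to run and debug.
results = dask.compute(*tasks, scheduler="synchronous")
print(results)  # ([2, 4], [6, 8], [10, 12])
```

On a distributed cluster the same graph should ship the model to each worker once rather than once per task, but that still needs to be confirmed when benchmarking.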