Closed mowoe closed 6 months ago
Hi, thank you for reporting - 30 minutes sounds way too long, so I want to investigate this further. While I look into this more closely, can you please set
ydf.verbose(2)
before loading the model?

Hi @rstz, thanks for the fast response. The hyperparameters used for training were
{
"categorical_algorithm": "CART",
"max_depth": "16",
"min_examples": "30",
"num_trees": "200",
"sparse_oblique_normalization": "MIN_MAX",
"sparse_oblique_projection_density_factor": "8.0",
"split_axis": "SPARSE_OBLIQUE",
}
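As a side note, these hyperparameters are all stored as strings (as TF-DF reports them), while a YDF learner expects typed values. A minimal, hypothetical sketch of converting them before re-training natively in YDF (the learner class and label column in the comment are assumptions, not from this thread):

```python
# Hyperparameters as reported above, with every value stored as a string.
RAW_HPARAMS = {
    "categorical_algorithm": "CART",
    "max_depth": "16",
    "min_examples": "30",
    "num_trees": "200",
    "sparse_oblique_normalization": "MIN_MAX",
    "sparse_oblique_projection_density_factor": "8.0",
    "split_axis": "SPARSE_OBLIQUE",
}

def typed_hparams(raw: dict) -> dict:
    """Convert string values to int/float where possible, else keep the string."""
    out = {}
    for key, value in raw.items():
        for cast in (int, float):
            try:
                out[key] = cast(value)
                break
            except ValueError:
                continue
        else:
            # Neither int nor float parsed; keep the categorical string value.
            out[key] = value
    return out

hp = typed_hparams(RAW_HPARAMS)
print(hp["max_depth"], hp["sparse_oblique_projection_density_factor"])  # → 16 8.0
# These could then be passed to a learner, e.g. (assumed usage):
# learner = ydf.RandomForestLearner(label="my_label", **hp)
```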
However, I trained this model as part of a grid search across a large number of hyperparameters, and from what I can see, all models take this long to load.
The features are floats in the range [0, 1], effectively only taking the values {0, 0.25, 0.5, 0.75, 1}; however, I suppose that doesn't make a difference.
The model dir looks like this size-wise:
306M ./assets
4,0K ./fingerprint.pb
496K ./keras_metadata.pb
6,6M ./saved_model.pb
8,0K ./variables
I haven't tried to train the model with YDF directly; I can try that if you think it might improve performance. Training the model with TF-DF took ~72h (596,400 examples and 1,191 features).
Thank you! Instead of re-training the model, you can also load it with YDF, then save it with YDF to a new directory and measure the time to re-load it:
import time

import ydf

# model_path: path to the TF-DF SavedModel directory
loaded_model = ydf.from_tensorflow_decision_forests(model_path)
loaded_model.save("/tmp/my_model")

start_time = time.time()
re_loaded_model = ydf.load_model("/tmp/my_model")
end_time = time.time()
elapsed_time = end_time - start_time
print("Elapsed time:", elapsed_time, "seconds")
That might help me narrow down whether the issue is with the importer (which is mostly written in Python and, presumably, not very fast) or somewhere in the C++ code.
Of course, if it's possible to share the model, feel free to do so - but I know that this is often not possible
Okay, thanks.
As I said, from_tensorflow_decision_forests doesn't seem to load the model fully, and neither does ydf.load_model.
The execution time for the script you provided is ~2 seconds. However, as soon as I try to use the loaded model (i.e. add a prediction to the script), it seems to actually load it and then run the prediction, which takes ~30 minutes. Subsequent predictions take less than 10 seconds.
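Given that the one-time cost is paid at the first predict() call, one workaround is to trigger it eagerly with a tiny warm-up batch right after loading, so later requests don't pay it. A hypothetical sketch; the helper name is an assumption, and the stub below stands in for a real ydf model (where load_fn would be something like lambda: ydf.load_model(path) and the warm-up input one row of sample_data.csv):

```python
from typing import Any, Callable

def load_and_warm(load_fn: Callable[[], Any], warmup_input: Any) -> Any:
    """Load a model and immediately run one prediction to force engine creation."""
    model = load_fn()
    model.predict(warmup_input)  # pays the one-time engine-creation cost up front
    return model

# Stub model standing in for a loaded ydf model:
class StubModel:
    def __init__(self):
        self.calls = 0

    def predict(self, batch):
        self.calls += 1
        return [0.5] * len(batch)

model = load_and_warm(StubModel, warmup_input=[[0.0]])
print(model.calls)  # → 1
```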
I am unable to share this exact model, but I can provide a different one with similar properties that suffers from the same issue:
https://www.comet.com/api/registry/model/item/download?modelItemId=bV0izwCSlrbc8ZpJJpiCfIrbp
Here is a CSV with some sample data which you can use to make a prediction: sample_data.csv
This should behave exactly the same; let me know if something doesn't work.
Okay, I just noticed that the provided model does, in fact, not suffer from the same issue. So here is the actual model I'm having the issue with instead: [removed]()
Thanks, I'll take a closer look and report back
Hi, quick update. I'm fairly sure I found the issue - the combination of hyperparameters you're using makes creation of the prediction engine (which, in Python, happens before the first call to predict) very slow.
Good news is that we can probably make this ~10 times faster fairly easily (on your model, I got the 15-minute loading time down to 1.6 minutes with a quick prototype). Now all I need to do is the usual software engineering - validate it, make sure the fix has a reasonable design, testing, releasing, ...
That sounds awesome, thank you so much!
Hi,
I am trying to deploy a trained YDF model (tfdf.keras.RandomForestModel) to a serverless FaaS-like inference service in order to save resources (only 2-3 invocations per day). For this, the model needs to be loaded essentially every time a prediction is made. However, loading the model takes more than 30 minutes, which is too long for this use case. I am not a C++ dev, so my debugging efforts have been relatively limited so far. Initially I only used the tfdf library; since then I have switched to the ydf Python library, with no performance increase (which is probably to be expected, since they use the same C++ backend). The only difference is that the ydf library does not seem to actually load the model fully when calling load_model, but rather only when the first predict call is made.
The code is producing:
As you can see, it takes ~30 minutes until the model is fully loaded. This happens on my local machine (M2 Air), but also on all other machines I tried (including x86 ones). If you think that a model with these parameters is supposed to take this long to load, feel free to close this issue, but to me this seems a bit too slow, judging by the fact that it is possible to load gigantic models like Llama in under 10 seconds.
Thanks in advance!