pavelzw closed 1 year ago
with a bunch of `isinstance` checks
The question is how many `isinstance` checks we should do... It should work with `RandomForestRegressor`, `DecisionTreeRegressor`, `Pipeline` but also custom regressors that contain a `Tree` somewhere inside them... I don't think that we will be able to catch all cases by just calling something on the outer object.
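To illustrate why a check on the outer object falls short, here is a minimal sketch of what a recursive search for nested trees could look like. `Tree`, `CustomRegressor`, and `find_trees` are hypothetical stand-ins written for this example, not scikit-learn or lightgbm classes, and the traversal only covers plain attributes and built-in containers.

```python
class Tree:
    """Hypothetical stand-in for a tree object (not sklearn's Tree)."""


class CustomRegressor:
    """Hypothetical wrapper that hides its trees a few levels down."""

    def __init__(self, trees):
        self.config = {"members": list(trees)}


def find_trees(obj, _seen=None):
    """Collect every Tree reachable through attributes and containers."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:  # guard against reference cycles
        return []
    seen.add(id(obj))

    if isinstance(obj, Tree):
        return [obj]
    if isinstance(obj, dict):
        children = list(obj.values())
    elif isinstance(obj, (list, tuple, set)):
        children = list(obj)
    elif hasattr(obj, "__dict__"):
        children = list(vars(obj).values())
    else:
        children = []

    found = []
    for child in children:
        found.extend(find_trees(child, seen))
    return found


model = CustomRegressor([Tree(), Tree()])
print(isinstance(model, Tree))  # False: the outer check misses the trees
print(len(find_trees(model)))   # 2
```

Even this sketch misses objects that keep their state in `__slots__` or in extension types, which is the kind of case a fixed list of `isinstance` checks cannot anticipate.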
Another option would be to just edit `Tree` and `Booster` in the dispatch table so that the `pickler` can handle everything, but this comes with the drawback of `scikit-learn` and `lightgbm` as necessary dependencies.
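A rough sketch of the dispatch-table idea, using only the standard library. The `Tree` class, the reducer pair, and `dumps_with_custom_trees` are hypothetical names for illustration; the tuple conversion is just a placeholder for a real compression step.

```python
import copyreg
import io
import pickle


class Tree:
    """Hypothetical stand-in for a tree nested inside an estimator."""

    def __init__(self, thresholds):
        self.thresholds = thresholds


def _rebuild_tree(packed):
    # Inverse of _reduce_tree: unpack the compact form into a Tree.
    return Tree(list(packed))


def _reduce_tree(tree):
    # Placeholder for a real compression step: store a tuple instead of
    # the instance __dict__.
    return _rebuild_tree, (tuple(tree.thresholds),)


def dumps_with_custom_trees(obj):
    buffer = io.BytesIO()
    pickler = pickle.Pickler(buffer)
    # Private dispatch table, so the global copyreg table stays untouched.
    pickler.dispatch_table = copyreg.dispatch_table.copy()
    pickler.dispatch_table[Tree] = _reduce_tree
    pickler.dump(obj)
    return buffer.getvalue()


# The Tree is picked up wherever it sits in the object graph, with no
# isinstance checks on the outer object.
nested = {"steps": [("model", {"estimators": [Tree([0.5, 1.5])]})]}
restored = pickle.loads(dumps_with_custom_trees(nested))
print(restored["steps"][0][1]["estimators"][0].thresholds)  # [0.5, 1.5]
```

Registering the real tree and booster types this way means both classes have to be importable when the table is built, which is where the hard dependency on both libraries comes from.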
(benchmark 5091527615 / attempt 1) Base results / Our results / Change

| Model | Size | Dump Time | Load Time |
|---|---|---|---|
| sklearn rf 20M | 20.8 MiB / 3.0 MiB / 6.87 x | 0.03 s / 0.04 s / 1.33 x | 0.03 s / 0.03 s / 0.91 x |
| sklearn rf 20M lzma | 6.5 MiB / 2.0 MiB / 3.26 x | 12.35 s / 1.23 s / 0.10 x | 0.56 s / 0.19 s / 0.33 x |
| sklearn rf 200M | 212.3 MiB / 30.6 MiB / 6.94 x | 0.22 s / 0.33 s / 1.53 x | 0.26 s / 0.36 s / 1.38 x |
| sklearn rf 200M lzma | 47.5 MiB / 14.6 MiB / 3.25 x | 100.38 s / 19.79 s / 0.20 x | 4.48 s / 1.45 s / 0.32 x |
| sklearn rf 1G | 1157.5 MiB / 166.8 MiB / 6.94 x | 1.35 s / 1.68 s / 1.24 x | 1.63 s / 1.78 s / 1.09 x |
| sklearn rf 1G lzma | 257.1 MiB / 98.2 MiB / 2.62 x | 507.21 s / 108.81 s / 0.21 x | 23.86 s / 8.73 s / 0.37 x |
| sklearn gb 2M | 2.2 MiB / 1.1 MiB / 2.08 x | 0.03 s / 0.24 s / 7.50 x | 0.08 s / 0.17 s / 2.24 x |
| sklearn gb 2M lzma | 0.6 MiB / 0.2 MiB / 3.82 x | 0.97 s / 0.44 s / 0.45 x | 0.09 s / 0.14 s / 1.60 x |
| lgbm gbdt 2M | 2.6 MiB / 1.0 MiB / 2.78 x | 0.09 s / 0.26 s / 2.81 x | 0.01 s / 0.15 s / 12.03 x |
| lgbm gbdt 2M lzma | 0.9 MiB / 0.5 MiB / 1.90 x | 1.40 s / 0.53 s / 0.38 x | 0.08 s / 0.23 s / 2.81 x |
| lgbm gbdt 5M | 5.3 MiB / 1.9 MiB / 2.81 x | 0.19 s / 0.51 s / 2.77 x | 0.03 s / 0.32 s / 12.18 x |
| lgbm gbdt 5M lzma | 1.7 MiB / 0.8 MiB / 1.96 x | 3.85 s / 1.10 s / 0.29 x | 0.15 s / 0.39 s / 2.55 x |
| lgbm gbdt 20M | 22.7 MiB / 7.6 MiB / 3.00 x | 0.75 s / 2.14 s / 2.85 x | 0.12 s / 1.30 s / 11.20 x |
| lgbm gbdt 20M lzma | 6.3 MiB / 3.0 MiB / 2.09 x | 19.14 s / 5.23 s / 0.27 x | 0.60 s / 1.55 s / 2.57 x |
| lgbm gbdt 100M | 101.1 MiB / 33.0 MiB / 3.06 x | 3.27 s / 9.58 s / 2.93 x | 0.53 s / 55.34 s / 105.15 x |
| lgbm gbdt 100M lzma | 25.6 MiB / 10.6 MiB / 2.41 x | 99.62 s / 26.12 s / 0.26 x | 2.58 s / 7.01 s / 2.71 x |
| lgbm rf 10M | 10.9 MiB / 3.2 MiB / 3.46 x | 0.37 s / 0.70 s / 1.86 x | 0.05 s / 0.55 s / 12.04 x |
| lgbm rf 10M lzma | 0.7 MiB / 0.4 MiB / 1.85 x | 2.11 s / 0.98 s / 0.47 x | 0.12 s / 0.62 s / 4.93 x |
I agree. It’s not worth the effort. The current interface is easy enough to use.
Fixes #48