Closed (samFarrellDay closed this 2 years ago)
I agree! This would be useful. How would you currently save/load kernels?
Kernels can be saved and loaded with the dill package. I have found that the pickle package doesn't work. Dill does a much better job of keeping track of object definitions and import requirements, even if they are nested and hidden away inside methods.
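For anyone landing here, a minimal sketch of a dill round trip might look like the following. It assumes dill is installed. The `save_object` and `load_object` helpers and the `StandInKernel` class are hypothetical names used only for illustration. The stand-in holds a lambda, which is the kind of attribute plain pickle generally cannot serialize; a real kernel object would be passed in its place.

```python
import os
import tempfile

import dill  # third-party package: pip install dill


def save_object(obj, path):
    """Serialize obj to path using dill (hypothetical helper)."""
    with open(path, "wb") as f:
        dill.dump(obj, f)


def load_object(path):
    """Load an object previously written by save_object (hypothetical helper)."""
    with open(path, "rb") as f:
        return dill.load(f)


class StandInKernel:
    """Hypothetical stand-in for a fitted kernel, not the library's class."""

    def __init__(self):
        # A lambda stored on the instance: dill serializes it by value.
        self.transform = lambda x: x * 2
        self.iterations = 3


def roundtrip_demo():
    """Save the stand-in to a temporary file and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "kernel.dill")
        save_object(StandInKernel(), path)
        return load_object(path)
```

As with pickle, only load dill files from sources you trust, since loading can execute arbitrary code.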
Thanks for the quick response Sam! That really helps.
Might it be possible to add this to the introduction of the package description? I went through a lot of documentation because of that sentence, and the examples didn't show a save/load step. (e.g. "Kernels can be saved (recommended using the dill package) and impute new, unseen datasets. Imputing new data is often orders of magnitude faster than including the new data in a new mice procedure. Imputation models can be built off of a kernel dataset, even if there are no missing values. New data can also be imputed in place.")
@SjoerdBraaksma You might be pleased to see a save_kernel() method has been added in 5.4.0. It uses parquet and byte compression to make the save file as small as possible.
It might be possible to take advantage of parquet/feather or joblib to compress the working_data well beyond what normal byte compression is capable of.