Closed robertfeldt closed 2 years ago
Hi, yes, we do have options for learning from data with missing values. There are two main options:
We have learn_circuit_miss as a counterpart to learn_circuit; both do structure learning. Some examples here.
HCLT structures. This is not fully documented yet, but here's a quick example. The learning is done in two steps.
Step 1: Learn a hidden Chow-Liu tree (HCLT) structure (more detail on HCLTs). For this part we need to impute the data at the moment:
# X_train = training data (may contain missing values)
# train_imputed = train data with missing values imputed
num_hidden_cats = 32
num_clt_trees = 1
circuit = hclt(num_features(X_train); data = train_imputed, num_hidden_cats = num_hidden_cats, num_trees = num_clt_trees)
uniform_parameters(circuit; perturbation = 0.4)
Step 2: Learn parameters with EM, using estimate_parameters_em_multi_epochs!
More info here.
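For the imputation that step 1 relies on, here is a minimal sketch of one simple option: filling each missing entry with the most frequent observed value in its column (mode imputation). It is written in plain Julia and does not use the library. The helper names `most_frequent` and `impute_mode` are hypothetical, and the example assumes the data is a matrix that uses `missing` for absent values, so it may need converting to whatever data format `hclt` expects.

```julia
# Hypothetical helper: most frequent value in a non-empty vector.
# Ties are broken in favor of the value that appears first.
function most_frequent(values)
    best_value, best_count = first(values), 0
    for v in unique(values)
        c = count(==(v), values)
        if c > best_count
            best_value, best_count = v, c
        end
    end
    return best_value
end

# Hypothetical helper: return a copy of X in which every `missing` entry is
# replaced by the most frequent observed value of its column.
function impute_mode(X::AbstractMatrix)
    T = nonmissingtype(eltype(X))          # element type without Missing
    X_imputed = Matrix{T}(undef, size(X))
    for j in axes(X, 2)
        observed = collect(skipmissing(X[:, j]))
        isempty(observed) && error("column $j has no observed values")
        fill_value = most_frequent(observed)
        for i in axes(X, 1)
            X_imputed[i, j] = ismissing(X[i, j]) ? fill_value : X[i, j]
        end
    end
    return X_imputed
end

# Toy binary data with two missing entries (illustrative only).
X_train = [true missing; false true; true true; missing false]
train_imputed = impute_mode(X_train)
```

Mode imputation is only a simple baseline for getting a complete dataset to build the structure from; the EM step afterwards is run with the parameters learned from the data as described in step 2.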
This is great, thanks. Sorry, I had missed that part of the documentation (learn_circuit_miss). Will check and experiment.
Again, thanks for Juice, it's already very useful and the potential is fantastic.
Is there some way to also learn from incomplete data, i.e. where the values of a few features are missing for some entries? I tried a few different input types/formats where some value is missing, but it seems the call to learn_circuit invariably fails.