Open biotech25 opened 7 years ago
Hi Sanghoon.
Doing `tnh <- tan_hc`, as you correctly say, uses cross-validation to learn its structure, by evaluating the accuracy of different candidate structures. On the other hand, `tnc <- tan_cl` learns its structure and parameters using the full data set, with an option to use Bayesian parameter estimation, which is more 'robust' than maximum likelihood.
Yet, when you call `predict(tnh, bndata)` or `predict(tnc, bndata)` there is no cross-validation involved: you get the probabilities for `bndata` using a model (`tnh` or `tnc`) that has already been learned. If what you want is to learn `k` models, each from a training subset of your data, and use each one to predict the remaining test data, then neither `tan_hc` nor `tan_cl` does the trick. In both cases, when you use `predict`, the model has already been learned.
To achieve that, as you say, you would have to start from the `cv` function. In particular, I think that the function you should look at is `update_assess_fold`.
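As a rough illustration of the idea (learn a model on each training subset, then predict only the held-out rows), here is a minimal sketch of a manual k-fold loop. It is not the code inside `cv` or `update_assess_fold`. The helper name `cv_predict_tan_cl` and its arguments are made up for this example, and it assumes that `predict(..., prob = TRUE)` returns one column per class level, in the order of the factor levels.

```r
library(bnclassify)

data(car)  # example data set shipped with bnclassify; the class variable is "class"

# Hypothetical helper (not part of bnclassify): out-of-fold predictions for tan_cl
cv_predict_tan_cl <- function(class, dataset, k = 10, score = "aic", smooth = 1) {
  n <- nrow(dataset)
  # Randomly assign each row to one of k folds of (nearly) equal size
  fold <- sample(rep(seq_len(k), length.out = n))
  class_levels <- levels(dataset[[class]])
  probs <- matrix(NA_real_, nrow = n, ncol = length(class_levels),
                  dimnames = list(NULL, class_levels))
  for (i in seq_len(k)) {
    test <- fold == i
    train <- dataset[!test, , drop = FALSE]
    # Learn structure and parameters from the training rows only
    model <- tan_cl(class, train, score = score)
    model <- lp(model, train, smooth = smooth)
    # Class probabilities for the held-out rows
    # (assumption: columns follow the order of class_levels)
    probs[test, ] <- predict(model, dataset[test, , drop = FALSE], prob = TRUE)
  }
  # Most probable class per row; ties go to the first level
  labels <- factor(class_levels[max.col(probs, ties.method = "first")],
                   levels = class_levels)
  list(fold = fold, prob = probs, class = labels)
}

set.seed(1)
res <- cv_predict_tan_cl("class", car, k = 10)
head(res$prob)                  # out-of-fold class probabilities
mean(res$class == car$class)    # cross-validated accuracy
```

Each row of `res$prob` comes from a model that never saw that row during learning, which is the difference from calling `predict` on a model fitted to the full data set.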
Best, Bojan
Thank you so much for your explanation. Reading it, I realized I had wrongly assumed that cross-validation was involved in 'predict(tnh, bndata)'.
As you recommended, I am looking at the source code of the 'cv' and 'update_assess_fold' functions. I found that even 'cv' calls many other functions. For example, 'ensure_multi_list' and 'get_common_class' are called at the beginning of 'cv', and 'ensure_multi_list' in turn calls 'is_just', so I am collecting the source code of all these functions from GitHub. (I am not sure whether I am doing this correctly.)
I will keep working on it and keep you updated. Thank you so much.
Sanghoon
Hi,
Could you help me one more time?
I want to run cross-validation with the 'tan_cl' function to learn the structure and to get the predicted class label and the probability of each label.
As we know, the 'tan_hc' function runs cross-validation: tn <- tan_hc("class", car, k = 10, epsilon = 0, smooth = 1). From this output, I can use the 'predict' function to get the predicted class labels and probabilities: predict(tn, bndata, prob = TRUE).
However, the 'tan_cl' function has no cross-validation: tn <- tan_cl('class', car, score = 'aic'). If I call 'predict' after 'tan_cl', the predicted class labels and probabilities do not come from cross-validation. The best way would be to tweak the 'tan_cl' source code to add cross-validation, as in the 'tan_hc' source code.
Yes, there is the 'cv' function, cv(tn, car, k=10), but it only returns the prediction accuracy, so it doesn't solve my problem. I looked at the source code of 'cv' to see whether I could add cross-validation there and obtain the predicted class labels and probabilities. However, the source code is not fully provided. For example, I can't run the 'ensure_multi_list' function that 'cv' calls.
Could you help me with this? Thank you,
Sanghoon