bmihaljevic / bnclassify

Learning Discrete Bayesian Network Classifiers from Data

tan_hc, tan_hcsp - output for prediction probability and predicted class label #25

Open biotech25 opened 7 years ago

biotech25 commented 7 years ago

Hi,

I appreciate your work and that you published your code. I am using tan_cl, tan_hc, and tan_hcsp. "tan_cl" gives me the prediction probabilities and the predicted class labels through

tn <- bnc('tan_cl', "class", car, smooth=parameter, dag_args=list('aic'))
predict(tn, car, prob = TRUE)
predict(tn, car, prob = FALSE)

However, it seems like "tan_hc" and "tan_hcsp" don't have a function to output prediction probability and predicted class label. Do you have a function to output the probability and class label?

Otherwise, could you provide the source code of "tan_hc" and "tan_hcsp"? (I couldn't find the source code on GitHub.) Then I will try to tweak the code to produce the prediction probability and predicted class label. My email address is data.biotech25@gmail.com

Thank you, Sanghoon

ghost commented 7 years ago

Hi Sanghoon, thanks for your input. Both "tan_hc" and "tan_hcsp" should work in the same way as "tan_cl", with the predict function. Are you getting some error when trying that? You can find the function for tan_hc in learn-struct.R and go from there, but you should not need to modify anything; it should be working already.

biotech25 commented 7 years ago

Thank you for your reply. Yes, I expected 'tan_hc' to work in the same way as 'tan_cl', but I got an error message. Please see below.

inFilePath="TestInput.txt"
targetVariable="metastatic_tumor_indicator"
MetabricData=read.table(inFilePath, stringsAsFactors=T, header=T, row.names=1, sep="\t", check.names=F)

numCol=1:ncol(MetabricData)
MetabricData[,numCol]=data.frame(apply(MetabricData[numCol],2,as.factor))

# interesting thing is that I had to factorize the input data to run 'tan_cl' and 'tan_hc'
tn<-tan_hc(targetVariable, MetabricData, k = 5, epsilon = 0, smooth = 1)
tn

Bayesian network classifier (only structure, no parameters)

class variable: metastatic_tumor_indicator
num. features: 24
num. arcs: 47
learning algorithm: tan_hc

prob <- predict(tn, MetabricData, prob = TRUE)

Error in UseMethod("predict") :
  no applicable method for 'predict' applied to an object of class "bnc_dag"

prob <- predict(tn, MetabricData, prob = FALSE)

Error in UseMethod("predict") :
  no applicable method for 'predict' applied to an object of class "bnc_dag"

Thank you so much, Sanghoon

(Sent by email in reply to Bojan Mihaljevic's comment of Tue, Nov 15, 2016, 6:53 AM.)

ghost commented 7 years ago

This is because functions such as tan_hc and tan_cl only return the network structure, without parameters. bnc(), on the other hand, learns both structure and parameters, so you can predict with the object it returns. So, one option is to get 'tan_hc' via bnc: call bnc('tan_hc', ...). Another is to call tan_hc and then use lp to learn the parameters. In your case, it could be something like: t <- lp(t, MetabricData, smooth = 0.01).

Let me know how it goes.
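The distinction between a structure-only object and a fitted classifier can be sketched on the car data set that ships with bnclassify. This is only an illustration under a few assumptions: that bnclassify is installed, and that a fitted classifier carries the class "bnc_bn" in addition to "bnc_dag" (the class name "bnc_bn" does not appear in this thread). The seed is set because tan_hc scores candidate arcs by cross-validation.

```r
library(bnclassify)  # assumed to be installed
data(car)            # example data set shipped with bnclassify

set.seed(1)  # tan_hc uses k-fold cross-validation, so fix the seed

# Structure only: the learner returns a DAG without parameters.
tn <- tan_hc("class", car, k = 5, epsilon = 0, smooth = 1)
has_params_before <- inherits(tn, "bnc_bn")  # assumed class name of fitted models

# Fit the parameters on top of the learned structure.
tn_lp <- lp(tn, car, smooth = 1)
has_params_after <- inherits(tn_lp, "bnc_bn")

# predict() now has an object it can work with:
# class posteriors as a matrix, predicted labels as a factor.
post   <- predict(tn_lp, car, prob = TRUE)
labels <- predict(tn_lp, car, prob = FALSE)
```

If the assumption about class names holds, has_params_before is FALSE and has_params_after is TRUE, which matches the "only structure, no parameters" line printed for tn.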

ghost commented 7 years ago

And yes, indeed, bnclassify requires the input variables to be coded as factors.
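A minimal base R sketch of converting every column of a data frame to a factor before handing it to bnclassify. The data frame df and its columns are hypothetical and stand in for the real input.

```r
# Hypothetical toy data: character, numeric and logical columns.
df <- data.frame(
  grade = c("low", "high", "low", "mid"),
  stage = c(1, 2, 2, 3),
  tumor = c(TRUE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Convert each column to a factor. Assigning into df[] keeps the
# data.frame structure, including row and column names.
df[] <- lapply(df, factor)

# Check that every column is now a factor.
all_factors <- all(vapply(df, is.factor, logical(1)))
print(all_factors)
```

Unlike data.frame(apply(x, 2, as.factor)), this does not pass through a character matrix, so the result does not depend on the stringsAsFactors default of data.frame(), which changed to FALSE in R 4.0.0.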

biotech25 commented 7 years ago

Thank you so much for your help. " t <- lp(t, MetabricData.... )" works. As you explained, the 'lp' function learns the parameters. Please see below the code that I ran.

tn<-tan_hc(targetVariable, MetabricData, k = 5, epsilon = 0, smooth = 1)
tn_lp <- lp(tn, MetabricData, smooth=1)
prob <- predict(tn_lp, MetabricData, prob = FALSE)

However, when I used the 'bnc' function directly, I got an error. It is strange. The 'bnc' function works for 'tan_cl', but it doesn't work for 'tan_hc' and 'tan_hcsp'. Please see below.

tn <- bnc('tan_hc', targetVariable, MetabricData, k = 5, smooth = 1)
Error in bnc("tan_hc", targetVariable, MetabricData, k = 5, smooth = 1) :
  unused argument (k = 5)

I appreciate your help. Sanghoon
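The "unused argument (k = 5)" message suggests that bnc() does not accept learner-specific arguments directly. In the tan_cl call at the top of the thread they are passed through dag_args, so a possible fix, not confirmed anywhere in this thread, is to do the same for tan_hc. The sketch below uses the car data set and assumes that bnc() forwards the whole dag_args list to tan_hc, including its own smooth argument.

```r
library(bnclassify)  # assumed to be installed
data(car)            # example data set shipped with bnclassify

set.seed(1)  # tan_hc uses k-fold cross-validation, so fix the seed

# Assumption: bnc() hands dag_args to the structure learner, so the
# arguments of tan_hc (k, epsilon, smooth) go inside the list.
# The top-level smooth is the one used when fitting the parameters.
tn <- bnc("tan_hc", "class", car, smooth = 1,
          dag_args = list(k = 5, epsilon = 0, smooth = 1))

# The returned object has parameters, so predict() can be applied.
labels <- predict(tn, car, prob = FALSE)
```

If this call fails on the installed version, calling tan_hc and then lp, as above, is the route already shown to work.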

(Sent by email in reply to Bojan Mihaljevic's comment of Tue, Nov 15, 2016, 11:41 AM.)