chim3y commented 4 years ago

I am trying to use mlr with batchtools to run a benchmark. To plot ROC curves for the benchmarked models, I need to retrieve the benchmark results from the batchtools registry. The list returned by reduceResultsList() is not accepted by mlr's generateThreshVsPerfData(), so I cannot generate a ROC curve from it. Is there another function to retrieve the results?

retrieve benchmark result

result = reduceResultsList(ids = c(c(1:284), c(286:318)), reg = regis, missing.val = NA)
Reducing [==================================================>] 100% eta: 0s

df = generateThreshVsPerfData(result, measures = list(fpr, tpr, mmce))
Error in generateThreshVsPerfData.list(result, measures = list(fpr, tpr, :
  Assertion on 'obj' failed: May only contain the following types: Prediction,ResampleResult.

mllg commented 4 years ago

If you want to do parallel benchmarking, there is the batchmark() function in mlr. You can then call reduceBatchmarkResults() to combine the results of the individual jobs into a regular benchmark object.
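A minimal sketch of that workflow, assuming mlr's built-in example tasks (sonar.task, pid.task) stand in for the real datasets and a temporary registry (file.dir = NA) stands in for the real one. The learners are set to predict probabilities, which the threshold-based measures need. Treat it as an illustration under those assumptions, not as the exact setup from this issue:

```r
library(mlr)
library(batchtools)

# Assumption: a temporary registry for illustration; a real run would
# pass a directory via file.dir instead of NA.
reg = makeExperimentRegistry(file.dir = NA, seed = 1)

# Probability predictions are needed to vary the classification threshold.
learners = list(
  makeLearner("classif.logreg", predict.type = "prob"),
  makeLearner("classif.rpart", predict.type = "prob")
)

# Assumption: built-in binary classification tasks as placeholders.
tasks = list(sonar.task, pid.task)
rdesc = makeResampleDesc("CV", iters = 3)

# batchmark() defines the jobs in the registry, replacing manual
# addProblem()/addAlgorithm()/addExperiments() calls.
batchmark(learners, tasks, rdesc, measures = list(mmce), reg = reg)

submitJobs(reg = reg)
waitForJobs(reg = reg)

# Combine all finished jobs into a regular BenchmarkResult.
bmr = reduceBatchmarkResults(reg = reg)

# For a BenchmarkResult, one task is used per call (the first by default;
# task.id selects another), with one curve per learner.
df = generateThreshVsPerfData(bmr, measures = list(fpr, tpr, mmce))
plotROCCurves(df)
```

Note that reduceBatchmarkResults() expects a registry whose jobs were defined by batchmark(); whether it can read a registry that was populated by hand is not something this sketch covers.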

Does that help?

chim3y commented 4 years ago

Thank you so much for your time and suggestion. Unfortunately, it generated the following error:

r = submitJobs(ids = 1:208, reg = regis) # small datasets
Submitting 208 jobs in 208 chunks using cluster functions 'Interactive' ...

result = reduceBatchmarkResults()
Error: object of type 'closure' is not subsettable
In addition: Warning message:
In reduceBatchmarkResults() :
  Collecting results for a subset of jobs. The resulting BenchmarkResult may be misleading.

result = reduceBatchmarkResults(reg = regis)
Error: object of type 'closure' is not subsettable
In addition: Warning message:
In reduceBatchmarkResults(reg = regis) :
  Collecting results for a subset of jobs. The resulting BenchmarkResult may be misleading.

I need the benchmark results to retrieve the predictions, pass them to generateThreshVsPerfData(), and then generate ROC curves. In my scenario, I have 318 datasets. I tried the following approach to retrieve the predictions from the benchmark results and plot the ROC curves:

2.1 Conversion of the benchmark results

regis = loadRegistry("Data/Results/Batchtools/batchtool_experiment", writeable = TRUE)
Reading registry in read-write mode
No readable configuration file found

result = reduceResultsList(ids = c(c(1:284), c(286:318)), reg = regis, missing.val = NA)
Reducing [==================================================>] 100% eta: 0s

df = generateThreshVsPerfData(result[[1]][["results"]][["rabe_166"]][["classif.logreg"]]$pred, measures = list(fpr, fnr, mmce))
plotROCCurves(df)

df1 = generateThreshVsPerfData(result[[1]][["results"]][["elusage"]][["classif.logreg"]]$pred, measures = list(fpr, fnr, mmce))
plotROCCurves(df1)

However, this only gives me one plot per dataset; I cannot plot the predictions for all datasets in a single figure. Could you suggest another approach that lets me retrieve all the predictions from the benchmark results and plot their ROC curves together?
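One possible approach, going by the assertion message earlier in the thread (generateThreshVsPerfData() accepts a list of Prediction or ResampleResult objects): collect the per-dataset predictions into a named list and pass that list in a single call. The sketch below is only an illustration. It assumes mlr's built-in sonar.task and pid.task as stand-ins for the real datasets and produces the predictions with resample() rather than extracting them from a registry. It uses fpr and tpr as the first two measures, since plotROCCurves() plots the first two measures by default:

```r
library(mlr)

lrn = makeLearner("classif.logreg", predict.type = "prob")
rdesc = makeResampleDesc("CV", iters = 3)

# Assumption: placeholder datasets; in the thread these would be the
# predictions pulled out of the reduced results, one per dataset.
tasks = list(sonar = sonar.task, pid = pid.task)

# Named list of Prediction objects; the list names label the curves.
preds = lapply(tasks, function(task) {
  resample(lrn, task, rdesc, show.info = FALSE)$pred
})

# A single call over the whole list yields one data set for plotting.
df = generateThreshVsPerfData(preds, measures = list(fpr, tpr, mmce))

# Draws one ROC curve per list element in the same plot.
plotROCCurves(df)
```

With several hundred datasets a single panel will be crowded, so plotting a selected subset of the list may be more readable.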

mllg / batchtools

How to retrieve benchmark results from makeExperimentRegistry() #249
