This R package provides scoring mechanisms for computational challenges and implements the bayesBootLadderBoot approach for avoiding test data leakage.
Apache License 2.0
refactor package to make it easier to reuse scoring functions directly in challenge pipeline #9
The challengescoring functions were written with post-challenge analysis in mind:
Most challengescoring score functions have the form score_fun(prediction_vector, gold_vector).
Most challenge pipeline scoring functions have the form score_fun(path_to_pred, path_to_gold).
It would be much easier if the same package scoring functions could be both bootstrapped and called directly from the challenge pipeline.
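One possible shape for this, sketched under assumptions: keep the vector-based scorer as the core and add a thin path-based wrapper for the pipeline. The names score_rmse() and score_rmse_from_path(), the choice of RMSE as the metric, and the "prediction" and "gold" column names are all hypothetical and not taken from the package. The sketch also assumes the rows of the two files are already in the same order.

```r
# Hypothetical names throughout: score_rmse(), score_rmse_from_path(), and the
# "prediction"/"gold" column names are illustrative, not the package's API.

# Core scorer: works on in-memory vectors, so it is cheap to call repeatedly
# (for example, once per bootstrap replicate).
score_rmse <- function(prediction_vector, gold_vector) {
  stopifnot(length(prediction_vector) == length(gold_vector))
  sqrt(mean((prediction_vector - gold_vector)^2))
}

# Thin pipeline wrapper: reads each file exactly once, then delegates to the
# vector-based core. Assumes rows in both files are already aligned.
score_rmse_from_path <- function(path_to_pred, path_to_gold,
                                 pred_col = "prediction", gold_col = "gold") {
  pred <- utils::read.csv(path_to_pred)
  gold <- utils::read.csv(path_to_gold)
  score_rmse(pred[[pred_col]], gold[[gold_col]])
}
```

With this split, the pipeline would call the wrapper once per submission, while post-challenge analysis would keep calling the vector-based function on data already in memory.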
One complication is that the scoring functions are called thousands of times when bootstrapping, so reading the CSV on every call could be a performance problem. In addition, the resampling must be paired across all prediction files being compared, so the resampling step has to happen outside the scoring function, not inside it. That is how it is currently implemented, but it would be harder to do if the scoring function read from a path rather than taking a list of resampled prediction data frames.
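The pairing constraint can be illustrated with a minimal sketch. This is not the package's current implementation: paired_bootstrap_scores() is a hypothetical helper, and it assumes every submission is a numeric vector aligned row-for-row with the gold standard. The point it shows is that the resampling indices are drawn once per replicate, outside the scoring function, and then reused for every submission.

```r
# Hypothetical helper, not part of the package: scores several submissions on
# the same bootstrap resamples so that replicates are comparable across them.
paired_bootstrap_scores <- function(predictions, gold, score_fun, n_boot = 1000) {
  n <- length(gold)
  # Every submission must align row-for-row with the gold standard.
  stopifnot(all(vapply(predictions, length, integer(1)) == n))

  # One row per bootstrap replicate, one column per submission.
  scores <- matrix(NA_real_, nrow = n_boot, ncol = length(predictions),
                   dimnames = list(NULL, names(predictions)))

  for (b in seq_len(n_boot)) {
    # Draw the indices once per replicate and reuse them for all submissions;
    # this is what keeps the resampling paired.
    idx <- sample.int(n, n, replace = TRUE)
    for (j in seq_along(predictions)) {
      scores[b, j] <- score_fun(predictions[[j]][idx], gold[idx])
    }
  }
  scores
}

# Example usage with simulated data (team names and values are made up).
set.seed(1)
gold <- rnorm(50)
predictions <- list(
  team_a = gold + rnorm(50, sd = 0.1),
  team_b = gold + rnorm(50, sd = 1)
)
rmse <- function(p, g) sqrt(mean((p - g)^2))
boot_scores <- paired_bootstrap_scores(predictions, gold, rmse, n_boot = 200)
```

Because the files are read once before the loop and only index vectors change between replicates, a scoring function that takes vectors fits this pattern directly, whereas one that takes paths would force either repeated reads or writing resampled files to disk.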