Saber is a deep-learning-based tool for information extraction in the biomedical domain. Pull requests are welcome! Note: this is a work in progress. Many things are broken, and the codebase is not stable.
Previously, only one of k_folds or validation_split could be used, with k_folds taking precedence over validation_split. Now, they can be used together.
To use k-fold cross-validation, provide only a train.* file under dataset_folder. Choose the number of folds with the k_folds argument. Optionally, use validation_split to specify a proportion of examples to hold out at random in each fold as a validation set.
To create a validation split from the train set, provide a train.* file (and optionally, a test.* file) under dataset_folder and use validation_split to specify the proportion of training examples to hold out as a validation set.
Otherwise, provide the partitions yourself with the files train.*, valid.* and test.* under dataset_folder and leave k_folds and validation_split equal to 0.
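To illustrate how k_folds and validation_split can combine, here is a minimal sketch in plain Python. It is not Saber's implementation: the function name k_fold_with_validation, the seed parameter, and the choice to treat the held-out fold as the test set and draw the validation set from the remaining examples are all assumptions made for illustration.

```python
import random


def k_fold_with_validation(n_examples, k_folds, validation_split=0.0, seed=0):
    """Yield (train, valid, test) index lists, one triple per fold.

    Hypothetical helper for illustration only; not part of Saber.
    """
    rng = random.Random(seed)
    indices = list(range(n_examples))
    rng.shuffle(indices)

    # Deal the shuffled indices into k roughly equal folds.
    folds = [indices[i::k_folds] for i in range(k_folds)]

    for i, test_idx in enumerate(folds):
        # Everything outside the current fold is available for training.
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]

        # Assumption: the validation set is a random proportion of the
        # examples left over after the test fold has been removed.
        rng.shuffle(train_idx)
        n_valid = int(len(train_idx) * validation_split)
        valid_idx, train_idx = train_idx[:n_valid], train_idx[n_valid:]

        yield train_idx, valid_idx, test_idx


if __name__ == "__main__":
    for train, valid, test in k_fold_with_validation(100, k_folds=5, validation_split=0.1):
        print(len(train), len(valid), len(test))
```

Under these assumptions, with 100 examples, k_folds=5 and validation_split=0.1, each fold has 20 test examples, and the remaining 80 are divided into 8 validation and 72 training examples.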
k_folds is ignored if either a valid.* or a test.* file is found under dataset_folder. Both k_folds and validation_split are ignored if a valid.* file is found under dataset_folder.
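The precedence rules above can be sketched as a small decision function. This is a hedged illustration, not the code in this PR: resolve_partitioning, the found_files set, and the strategy labels are hypothetical names, and raising an error when no train.* file is present is an assumption.

```python
def resolve_partitioning(found_files, k_folds=0, validation_split=0.0):
    """Decide which partitioning arguments take effect.

    found_files is a set of partition names found under dataset_folder,
    e.g. {"train", "test"}. Hypothetical helper for illustration only.
    """
    if "train" not in found_files:
        # Assumption: every documented mode needs a train.* file.
        raise ValueError("a train.* file is required under dataset_folder")

    if "valid" in found_files:
        # A valid.* file means the user supplied the partitions,
        # so both k_folds and validation_split are ignored.
        return {"strategy": "user_provided", "k_folds": 0, "validation_split": 0.0}

    if "test" in found_files:
        # A test.* file rules out cross-validation; k_folds is ignored.
        k_folds = 0

    if k_folds > 0:
        strategy = "k_fold"
    elif validation_split > 0:
        strategy = "validation_split"
    else:
        strategy = "train_only"

    return {"strategy": strategy, "k_folds": k_folds, "validation_split": validation_split}


if __name__ == "__main__":
    print(resolve_partitioning({"train"}, k_folds=5, validation_split=0.1))
    print(resolve_partitioning({"train", "test"}, k_folds=5, validation_split=0.1))
    print(resolve_partitioning({"train", "valid", "test"}, k_folds=5, validation_split=0.1))
```

In this sketch, the first call keeps both arguments, the second drops k_folds because a test.* file is present, and the third drops both because a valid.* file is present.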
Overview
TODOs
Closes
Closes #154.