Major Changes
A metrics.semantics.SemanticMetric type is added, with two implementations: BertScore and BartScore.
Two new basic metrics are added:
metrics.basics.ExactMatchMetric: Flattens all the references (in the multi-reference scenario) and performs an exact match.
metrics.basics.ConfusionMatrix: Generates a confusion matrix for the provided set of labels (obtained from both references and predictions) using scikit-learn.
Initial unit tests (based on pytest) are added under tests/. For now they are very basic and only cover the format_to_jury function.
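To make the two basic metrics above more concrete, here is a minimal, dependency-free sketch of the idea. It is not evalem's implementation: the helper names (flatten_pairs, exact_match, confusion_counts) are hypothetical, the flattening shown (pairing each prediction with every one of its references) is one plausible reading of "flattens all the references", and the confusion counting is a plain-Python stand-in for scikit-learn's confusion_matrix that follows the same convention (rows are reference labels, columns are predicted labels).

```python
from collections import Counter
from typing import List, Sequence, Tuple, Union

Reference = Union[str, Sequence[str]]


def flatten_pairs(
    predictions: Sequence[str], references: Sequence[Reference]
) -> List[Tuple[str, str]]:
    """Pair each prediction with every one of its references (hypothetical helper)."""
    pairs: List[Tuple[str, str]] = []
    for pred, refs in zip(predictions, references):
        if isinstance(refs, str):
            refs = [refs]  # single-reference case
        pairs.extend((pred, ref) for ref in refs)
    return pairs


def exact_match(predictions: Sequence[str], references: Sequence[Reference]) -> float:
    """Fraction of flattened (prediction, reference) pairs that are identical."""
    pairs = flatten_pairs(predictions, references)
    if not pairs:
        return 0.0
    return sum(pred == ref for pred, ref in pairs) / len(pairs)


def confusion_counts(predictions: Sequence[str], references: Sequence[Reference]):
    """Plain-Python stand-in for sklearn.metrics.confusion_matrix.

    The label set is the sorted union of labels seen in references and
    predictions; rows are reference labels, columns are predicted labels.
    """
    pairs = flatten_pairs(predictions, references)
    labels = sorted({label for pair in pairs for label in pair})
    counts = Counter((ref, pred) for pred, ref in pairs)
    matrix = [[counts[(row, col)] for col in labels] for row in labels]
    return labels, matrix


predictions = ["Paris", "blue"]
references = [["Paris", "paris"], "red"]

print(exact_match(predictions, references))
labels, matrix = confusion_counts(predictions, references)
print(labels)
print(matrix)
```

Under this reading, a prediction with several references contributes one pair per reference, so a prediction that matches only one of its two references counts as one hit and one miss.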
Minor Changes
models.defaults.DefaultQAModelWrapper now accepts a device parameter to run on CPU or GPU.
models._base.HFPipelineWrapper now has a pipeline property that returns the wrapped pipeline.
An evalem.misc.datasets.get_squad_v2(...) utility function is added to load the squad-v2 dataset.
Usages