Closed nicodv closed 4 years ago
Merging #496 into master will increase coverage by 8.20%. The diff coverage is 100.00%.
@@ Coverage Diff @@
## master #496 +/- ##
==========================================
+ Coverage 78.83% 87.04% +8.20%
==========================================
Files 345 345
Lines 11706 11741 +35
Branches 371 392 +21
==========================================
+ Hits 9229 10220 +991
+ Misses 2477 1521 -956
Impacted Files | Coverage Δ | |
---|---|---|
...m/salesforce/op/evaluators/EvaluationMetrics.scala | 87.50% <ø> (+18.75%) | :arrow_up: |
...lesforce/op/evaluators/OpRegressionEvaluator.scala | 97.87% <100.00%> (+6.20%) | :arrow_up: |
...tages/impl/preparators/SanityCheckerMetadata.scala | 89.86% <0.00%> (+0.67%) | :arrow_up: |
...rce/op/stages/impl/preparators/SanityChecker.scala | 90.57% <0.00%> (+1.22%) | :arrow_up: |
...p/stages/impl/selector/SelectedModelCombiner.scala | 93.50% <0.00%> (+1.29%) | :arrow_up: |
...com/salesforce/op/utils/stages/FitStagesUtil.scala | 94.73% <0.00%> (+1.31%) | :arrow_up: |
...n/scala/com/salesforce/op/dsl/RichMapFeature.scala | 67.64% <0.00%> (+1.47%) | :arrow_up: |
.../main/scala/com/salesforce/op/OpWorkflowCore.scala | 95.45% <0.00%> (+1.51%) | :arrow_up: |
...n/scala/com/salesforce/op/readers/DataReader.scala | 95.23% <0.00%> (+1.58%) | :arrow_up: |
...orce/op/aggregators/MonoidAggregatorDefaults.scala | 100.00% <0.00%> (+1.78%) | :arrow_up: |
... and 71 more | | |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 964b58e...04e0809.
@leahmcguire, I considered that, but `OpWorkflowModel.scoreAndEvaluate()` currently takes a single evaluator. I chose to add it to the existing `OpRegressionEvaluator` so that the metrics are broadly available (e.g. in our evaluation), not just in the `RegressionModelSelector`.

Let me know if you think we should reconsider that trade-off.
Related issues
N/A

Describe the proposed solution
Adds a histogram of signed percentage errors (as opposed to the more common absolute definition) to `OpRegressionEvaluator`. A few parameters are exposed to control how the histogram is calculated. They mainly concern label values around 0, which would otherwise make the percentages explode:

- `signedPercentageErrorHistogramBins`: an array determining the histogram bins
- `scaledErrorCutoff`: a label value cutoff below which the signed percentage error is implemented as a scaled error with a fixed denominator, to avoid problems with label values around 0
- `smartCutoffRatio`: if set, `scaledErrorCutoff` is determined smartly by taking the average absolute magnitude of the data multiplied by this ratio

Describe alternatives you've considered

- `RegressionEvaluator` is based on summary statistics (e.g., SSE) that are computed in private fields, so there is no neat way to avoid going over the data once more for the percentage errors.
- `OpWorkflowModel.scoreAndEvaluate` takes a single evaluator, so putting the histogram in a separate evaluator would mean these metrics wouldn't always be available.

Additional context
Can be used to express error bands around a regression (e.g., "given an error tolerance of 10%, you'd have this many correct, over- and under-predictions"). The need for this is the driving force behind this PR.
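
To make the cutoff behavior and the error-band use case concrete, here is a minimal plain-Scala sketch of the calculation described above. It is not the code from this PR: the object and method names are hypothetical, and two details are assumptions, namely that the error is taken as prediction minus label (so positive means over-prediction) and that the "fixed denominator" used below the cutoff is the cutoff itself.

```scala
// Illustrative sketch only; names and sign convention are assumptions,
// not the OpRegressionEvaluator implementation.
object SignedPercentageErrorSketch {

  /** Signed percentage error of one prediction.
    * For |label| below `cutoff`, the denominator is fixed at `cutoff`,
    * so labels near 0 cannot blow up the percentage.
    */
  def signedPercentageError(label: Double, prediction: Double, cutoff: Double): Double = {
    val denominator = math.max(math.abs(label), cutoff)
    100.0 * (prediction - label) / denominator
  }

  /** Data-driven cutoff: `ratio` times the mean absolute label value. */
  def smartCutoff(labels: Seq[Double], ratio: Double): Double =
    if (labels.isEmpty) 0.0
    else ratio * labels.map(l => math.abs(l)).sum / labels.size

  /** Counts errors per bin. `bins` holds ascending bin edges; bin i covers
    * [bins(i), bins(i + 1)), and the last bin also includes its right edge.
    * Values outside all bins (including NaN) are ignored.
    */
  def histogram(errors: Seq[Double], bins: Array[Double]): Array[Long] = {
    val counts = Array.fill(bins.length - 1)(0L)
    errors.foreach { e =>
      val i = bins.lastIndexWhere(_ <= e)
      if (i >= 0 && i < counts.length) counts(i) += 1
      else if (i == counts.length && e == bins.last) counts(i - 1) += 1
    }
    counts
  }
}
```

With edges `Array(Double.NegativeInfinity, -10.0, 10.0, Double.PositiveInfinity)`, the three counts can be read as under-predictions, predictions within a 10% tolerance, and over-predictions. For example, labels `(100.0, 200.0, 0.5, -50.0)` with predictions `(105.0, 150.0, 1.5, -50.0)` and a ratio of 0.1 give a cutoff of about 8.76; the label 0.5 then yields an error of roughly 11.4% instead of 200%, and the histogram comes out as 1 under, 2 within tolerance, 1 over.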