salesforce / TransmogrifAI

TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
https://transmogrif.ai
BSD 3-Clause "New" or "Revised" License

Replace empty XGBoost feature importance with a vector of zeros #487

Closed TuanNguyen27 closed 4 years ago

TuanNguyen27 commented 4 years ago

Related issues

A successfully trained XGBoost model can return an empty feature importance vector when the features carry no signal with respect to the label. When that happens, getModelContributions fails on this line:

require(featureScore.nonEmpty, "Feature score map is empty")

Describe the proposed solution

In this case, return a feature contribution vector of zeros, which matches the behavior of other models.
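The idea can be illustrated with a minimal, self-contained sketch. This is not the code from the PR: the object `FeatureContributions`, the helper `contributionsOrZeros`, and the assumption that importances arrive as a map from feature index to score are all hypothetical and chosen only for illustration. Instead of requiring a non-empty map, the sketch starts from an all-zero vector and overwrites only the positions the model reported.

```scala
// Hypothetical helper for illustration; not TransmogrifAI's actual implementation.
object FeatureContributions {

  /**
   * Converts a sparse importance map (feature index -> score) into a dense
   * vector of length numFeatures. Indices absent from the map stay at 0.0,
   * so an empty map yields an all-zero vector rather than an error.
   * Indices outside [0, numFeatures) are ignored.
   */
  def contributionsOrZeros(featureScore: Map[Int, Double], numFeatures: Int): Array[Double] = {
    require(numFeatures >= 0, "numFeatures must be non-negative")
    val contributions = Array.fill(numFeatures)(0.0)
    for ((index, score) <- featureScore if index >= 0 && index < numFeatures) {
      contributions(index) = score
    }
    contributions
  }
}
```

Under these assumptions, calling `FeatureContributions.contributionsOrZeros(Map.empty[Int, Double], 3)` produces a three-element vector of zeros, so downstream code sees the same shape it would get from any other model.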

codecov[bot] commented 4 years ago

Codecov Report

Merging #487 into master will decrease coverage by 5.59%. The diff coverage is 33.33%.


@@            Coverage Diff             @@
##           master     #487      +/-   ##
==========================================
- Coverage   87.02%   81.43%   -5.60%     
==========================================
  Files         345      345              
  Lines       11683    11687       +4     
  Branches      387      377      -10     
==========================================
- Hits        10167     9517     -650     
- Misses       1516     2170     +654     
| Impacted Files | Coverage Δ |
| --- | --- |
| ...c/main/scala/com/salesforce/op/ModelInsights.scala | 91.88% <33.33%> (-1.21%) |
| ...ala/com/salesforce/op/utils/tuples/RichTuple.scala | 0.00% <0.00%> (-100.00%) |
| ...alesforce/op/aggregators/TimeBasedAggregator.scala | 0.00% <0.00%> (-100.00%) |
| ...stages/impl/feature/TimePeriodMapTransformer.scala | 0.00% <0.00%> (-100.00%) |
| ...e/op/stages/impl/insights/RecordInsightsCorr.scala | 0.00% <0.00%> (-98.25%) |
| utils/src/main/scala/com/salesforce/op/UID.scala | 0.00% <0.00%> (-91.67%) |
| ...op/stages/impl/preparators/MinVarianceFilter.scala | 0.00% <0.00%> (-91.31%) |
| ...es/src/main/scala/com/salesforce/op/OpParams.scala | 0.00% <0.00%> (-89.80%) |
| ...ala/com/salesforce/op/stages/SparkStageParam.scala | 0.00% <0.00%> (-77.42%) |
| ...a/com/salesforce/op/utils/spark/RichMetadata.scala | 15.78% <0.00%> (-73.69%) |

... and 47 more

Δ = absolute <relative> (impact). Last update 5a2449e...7f58e4f.