salesforce / TransmogrifAI

TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
https://transmogrif.ai
BSD 3-Clause "New" or "Revised" License
2.24k stars, 392 forks

Put feature-feature correlations behind a flag #479

Closed leahmcguire closed 4 years ago

leahmcguire commented 4 years ago

Related issues: Feature-feature correlations stored in the sanity checker metadata can make the model too large to load.
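
To illustrate why this metadata can grow so large, here is a rough sketch of how the number of stored correlation values scales with the feature count. Everything in it is hypothetical: the object name, the method names, and the `includeFeatureFeature` flag are made up for illustration and are not TransmogrifAI's API. It also assumes a full n x n feature-feature matrix is kept, which this PR does not confirm.

```scala
// Hypothetical sketch: none of these names come from TransmogrifAI.
object CorrelationMetadataSize {

  /** One label-feature correlation per feature column. */
  def labelCorrelationCount(numFeatures: Int): Long = numFeatures.toLong

  /** Entries in a full n x n feature-feature correlation matrix (assumed layout). */
  def featureFeatureCorrelationCount(numFeatures: Int): Long =
    numFeatures.toLong * numFeatures.toLong

  /** Correlation values kept in metadata, depending on a hypothetical on/off flag. */
  def storedValues(numFeatures: Int, includeFeatureFeature: Boolean): Long = {
    val base = labelCorrelationCount(numFeatures)
    if (includeFeatureFeature) base + featureFeatureCorrelationCount(numFeatures)
    else base
  }

  def main(args: Array[String]): Unit = {
    for (n <- Seq(100, 1000, 10000)) {
      val flagOff = storedValues(n, includeFeatureFeature = false)
      val flagOn = storedValues(n, includeFeatureFeature = true)
      println(s"features=$n flagOff=$flagOff flagOn=$flagOn")
    }
  }
}
```

Under these assumptions the count grows linearly with the flag off and quadratically with it on, which is why leaving feature-feature correlations out by default keeps the saved model smaller.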

codecov[bot] commented 4 years ago

Codecov Report

Merging #479 into master will increase coverage by 5.60%. The diff coverage is 81.25%.

@@            Coverage Diff             @@
##           master     #479      +/-   ##
==========================================
+ Coverage   81.41%   87.02%   +5.60%     
==========================================
  Files         345      345              
  Lines       11667    11673       +6     
  Branches      384      388       +4     
==========================================
+ Hits         9499    10158     +659     
+ Misses       2168     1515     -653     

| Impacted Files | Coverage Δ |
|---|---|
| ...ala/com/salesforce/op/dsl/RichNumericFeature.scala | 100.00% <ø> (+3.63%) :arrow_up: |
| ...rce/op/stages/impl/preparators/SanityChecker.scala | 90.57% <75.00%> (-0.68%) :arrow_down: |
| ...tages/impl/preparators/SanityCheckerMetadata.scala | 89.86% <100.00%> (+0.13%) :arrow_up: |
| ...s/impl/preparators/DerivedFeatureFilterUtils.scala | 93.08% <0.00%> (+0.62%) :arrow_up: |
| ...sforce/op/stages/impl/feature/Transmogrifier.scala | 98.05% <0.00%> (+0.83%) :arrow_up: |
| ...com/salesforce/op/features/FeatureSparkTypes.scala | 99.14% <0.00%> (+0.85%) :arrow_up: |
| ...esforce/op/stages/impl/feature/TextTokenizer.scala | 97.22% <0.00%> (+1.38%) :arrow_up: |
| ...a/com/salesforce/op/readers/JoinedDataReader.scala | 94.23% <0.00%> (+1.92%) :arrow_up: |
| ...op/stages/impl/selector/ModelSelectorSummary.scala | 91.83% <0.00%> (+2.04%) :arrow_up: |
| ...p/stages/impl/feature/SmartTextMapVectorizer.scala | 100.00% <0.00%> (+2.14%) :arrow_up: |
| ... and 49 more | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 93e1fde...2a8f128.