Closed echeipesh closed 3 years ago
Merging #113 (f681de5) into develop (2f396c4) will increase coverage by
0.10%
. The diff coverage is0.00%
.:exclamation: Current head f681de5 differs from pull request most recent head 73a763d. Consider uploading reports for the commit 73a763d to get more accurate results
@@ Coverage Diff @@
## develop #113 +/- ##
==========================================
+ Coverage 3.51% 3.62% +0.10%
==========================================
Files 258 256 -2
Lines 5257 5110 -147
==========================================
Hits 185 185
+ Misses 5072 4925 -147
Flag | Coverage Δ | |
---|---|---|
unittests | 3.62% <0.00%> (+0.10%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
Impacted Files | Coverage Δ | |
---|---|---|
...scala/org/globalforestwatch/features/Feature.scala | 0.00% <0.00%> (ø) |
|
...ala/org/globalforestwatch/features/FeatureDF.scala | 0.00% <ø> (ø) |
|
...org/globalforestwatch/features/FeatureFilter.scala | 0.00% <0.00%> (ø) |
|
...ala/org/globalforestwatch/features/FeatureId.scala | 0.00% <ø> (ø) |
|
...la/org/globalforestwatch/features/FeatureRDD.scala | 0.00% <0.00%> (ø) |
|
.../org/globalforestwatch/features/FireAlertRDD.scala | 0.00% <0.00%> (ø) |
|
...a/org/globalforestwatch/features/GadmFeature.scala | 0.00% <0.00%> (ø) |
|
...org/globalforestwatch/features/GfwProFeature.scala | 0.00% <0.00%> (ø) |
|
...alforestwatch/features/PolygonIntersectionDF.scala | 0.00% <0.00%> (ø) |
|
...org/globalforestwatch/features/SimpleFeature.scala | 0.00% <0.00%> (ø) |
|
... and 18 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 2f396c4...73a763d. Read the comment docs.
Pull request type
Please check the type of change your PR introduces: - [ ] Bugfix - [ ] Feature - [ ] Code style update (formatting, renaming) - [x] Refactoring (no functional changes, no api changes) - [ ] Build related changes - [ ] Documentation content changes - [ ] Other (please describe): ## What is the current behavior? Using ForestChangeDiagnosticCommand as an exmaple: ```scala val forestChangeDiagnosticCommand: Opts[Unit] = Opts.subcommand( name = "forest_change_diagnostic", help = "Compute summary statistics for GFW Pro Forest Change Diagnostic." ) { ( defaultOptions, intermediateListSourceOpt, fireAlertOptions, defaultFilterOptions, featureFilterOptions, ).mapN { (default, intermediateListSource, fireAlert, defaultFilter, featureFilter) => val kwargs = Map( "featureUris" -> default._2, "outputUrl" -> default._3, "splitFeatures" -> true, // force to split features "intermediateListSource" -> intermediateListSource, "fireAlertType" -> fireAlert._1, "fireAlertSource" -> fireAlert._2, "idStart" -> featureFilter._1, "idEnd" -> featureFilter._2, "limit" -> defaultFilter._1, "tcl" -> defaultFilter._2, "glad" -> defaultFilter._3 ) runAnalysis("forest_change_diagnostic", default._1, default._2, kwargs) } ``` ## What is the new behavior? ```scala val forestChangeDiagnosticCommand: Opts[Unit] = Opts.subcommand( name = ForestChangeDiagnosticAnalysis.name, help = "Compute summary statistics for GFW Pro Forest Change Diagnostic." ) { ( defaultOptions, intermediateListSourceOpt, fireAlertOptions, featureFilterOptions ).mapN { (default, intermediateListSource, fireAlert, filterOptions) => val kwargs = Map( "outputUrl" -> default.outputUrl, "noOutputPathSuffix" -> default.noOutputPathSuffix ) val featureFilter = FeatureFilter.fromOptions(default.featureType, filterOptions) runAnalysis { spark => val featureRDD = FeatureRDD(default.featureUris, default.featureType, featureFilter, default.splitFeatures, spark) val fireAlertRDD = FireAlertRDD(spark, fireAlert.alertType, fireAlert.alertSource, FeatureFilter.empty) ForestChangeDiagnosticAnalysis(featureRDD, default.featureType, intermediateListSource, fireAlertRDD, spark, kwargs) } } ``` This change involves several refactors: - All command line options are parsed into case classes instead of tuples, minimizing mistakes from using `._1` instead of `._2` - All filter related options are aggregated together into `featureFilterOptions` which are used directly instead of `kwargs` - FeatureRDD and FireRDD are brought to top level to minimize use of `kwargs` - `runAnalysis` invokes the analysis directly without going through the `SummaryAnalysisFactory` Discussion on intention of the refactor: ### Minimize use of `kwargs` for filters This has two goals, increase readability, navigability of the code. Ideally one able to trace usage of each parameter to its source using code navigation "Go To Definition". This is easier to read but also allows the compiler to enforce type checking and support future refactors. The main usage of kwargs here was for selecting filters on feature inputs. Potentially all filters are available to the user, but only some of them are applicable for specific feature type. This logic is no captured in the interaction of `FeatureFilter.fromOptions` and `AllFilterOptions` (which aggregates all possible command line argument input). Additional upshot is that this logic is now applied uniformly to all commands without having to add all supported filters to kwargs. The instantiated filters are passed through as instance of `FeatureFilter` class that matched the `featureType` value. ### Remove Analysis Factory The analysis factory has two major drawbacks in this code base: 1. Additional and unnecessary indirection, each command has to specify the analysis name anyway. 2. Unnecessary interface restriction. It forces uniformity for the Analysis class that is not needed and requires use of `kwargs` to get around The preferred situation is that each command can use the parsed arguments to instantiate the corresponding analysis as needed. ## Does this introduce a breaking change? - [ ] Yes - [x] No ## Other notes There are additional clean-ups that are implied here, but this PR should allow future refactors to happen per analysis command.