wri / gfw_forest_loss_geotrellis

Global Tree Cover Loss Analysis using Geotrellis and SPARK
MIT License
10 stars 8 forks source link

Feature/dec23 layer updates #207

Closed manukala6 closed 10 months ago

manukala6 commented 10 months ago

Pull request checklist

Please check if your PR fulfills the following requirements:

Pull request type

Please check the type of change your PR introduces: - [ ] Bugfix - [x] Feature - [ ] Code style update (formatting, renaming) - [ ] Refactoring (no functional changes, no api changes) - [ ] Build related changes - [ ] Documentation content changes - [ ] Other (please describe): ## What is the current behavior?

Issue Number: GTC-2573

What is the new behavior?

Does this introduce a breaking change?

Other information

danscales commented 10 months ago

The only results-changing issue I see is that the SDPT type pixel meaning doesn't exist, so that raster catalog entry won't work. Also, there's some duplication of the gross emissions (all gases) outputs in annualupdate_minimal, which it would be nice to clean up. Also, I tried running this branch on GMB on my computer and got the error Exception in thread "main" org.apache.spark.sql.AnalysisException: No such struct field mangroves in lossYear, threshold, drivers, primaryForest, wdpa, aze, plantedForests, mangroves1996, mangrovesLatest, tigerLandscapes, landmark, keyBiodiversityAreas, mining, peatlands, oilPalm, idnForestMoratorium, woodFiber, resourceRights, logging, isGain, intactForestLandscapes2000, landCover, tmlDensity. I think this'd be pretty easy to fix but would rather leave it to the engineering team.

In case Maanas doesn't already have it, @dagibbs22 can you provide a command line to run the annualupdate_minimal analysis locally if possible. That way, he can test that he fixed the struct field mangroves error. It would be nice to eventually add a local test for annualupdate_minimal (which I can likely do if you provide a sample command that will run locally).

dagibbs22 commented 10 months ago

@manukala6, in response to Dan's request for the local annualupdate_minimal test commands: Here are the commands I use in the IntelliJ sbt shell to test annualupdate_minimal locally (for GMB and for IDN.14.13):

test:runMain org.globalforestwatch.summarystats.SummaryMain annualupdate_minimal --feature_type gadm --features file:/C:/GIS/Multi_project/Geotrellis_input_GADM/gadm36_adm2_1_1.csv --tcl --output file:/C:/GIS/Carbon_model/zonal_stats/ --iso GMB
test:runMain org.globalforestwatch.summarystats.SummaryMain annualupdate_minimal --feature_type gadm --features file:/C:/GIS/Multi_project/Geotrellis_input_GADM/gadm36_adm2_1_1.csv --tcl --output file:/C:/GIS/Carbon_model/zonal_stats/ 
--admin2 IDN.14.13_1

@danscales , in case it's helpful, here are the equivalent local tests for the carbonflux package:

test:runMain org.globalforestwatch.summarystats.SummaryMain carbonflux --feature_type gadm --features file:/C:/GIS/Multi_project/Geotrellis_input_GADM/gadm36_adm2_1_1.csv --tcl --output file:/C:/GIS/Carbon_model/zonal_stats/ --iso GMB
test:runMain org.globalforestwatch.summarystats.SummaryMain carbonflux --feature_type gadm --features file:/C:/GIS/Multi_project/Geotrellis_input_GADM/gadm36_adm2_1_1.csv --tcl --output file:/C:/GIS/Carbon_model/zonal_stats/ -admin2 IDN.14.13_1
jterry64 commented 10 months ago

Looks good, but it'd be good to check with the frontend on what I mentioned above before changing that field name.

codecov-commenter commented 10 months ago

Codecov Report

Attention: 139 lines in your changes are missing coverage. Please review.

Comparison is base (054a820) 14.69% compared to head (d7aad80) 18.15%. Report is 10 commits behind head on master.

Files Patch % Lines
.../org/globalforestwatch/layers/PlantedForests.scala 0.00% 10 Missing :warning:
...s/annualupdate_minimal/AnnualUpdateMinimalDF.scala 0.00% 8 Missing :warning:
...orestwatch/layers/ForestFluxModelAgeCategory.scala 0.00% 7 Missing :warning:
...atch/layers/ForestFluxModelPlantedForestType.scala 0.00% 7 Missing :warning:
...tats/gfwpro_dashboard/GfwProDashboardCommand.scala 0.00% 7 Missing :warning:
...stwatch/summarystats/firealerts/FireAlertsDF.scala 0.00% 6 Missing :warning:
...stwatch/summarystats/gladalerts/GladAlertsDF.scala 0.00% 6 Missing :warning:
...rystats/integrated_alerts/IntegratedAlertsDF.scala 0.00% 6 Missing :warning:
...bon_sensitivity/CarbonSensitivityGridSources.scala 0.00% 5 Missing :warning:
...ummarystats/carbonflux/CarbonFluxGridSources.scala 0.00% 5 Missing :warning:
... and 47 more

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #207 +/- ## ========================================== + Coverage 14.69% 18.15% +3.45% ========================================== Files 291 291 Lines 6484 6473 -11 ========================================== + Hits 953 1175 +222 + Misses 5531 5298 -233 ``` | [Flag](https://app.codecov.io/gh/wri/gfw_forest_loss_geotrellis/pull/207/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=wri) | Coverage Δ | | |---|---|---| | [unittests](https://app.codecov.io/gh/wri/gfw_forest_loss_geotrellis/pull/207/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=wri) | `18.15% <6.08%> (+3.45%)` | :arrow_up: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=wri#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

danscales commented 10 months ago

FYI, I ran locally the annualupdate_minimal command that David gave above with your latest changes, and it ran through to completion fine. I'm assuming the results were fine, but I don't know any particular way to verify the results.

manukala6 commented 10 months ago

@dagibbs22 The output is tab-delimited. Excel doesn't recognize this automatically, you'll need to modify the file type from .csv to .tsv before opening. Though, I am only seeing true/false values in the is__birdlife_alliance_for_zero_extinction_sites column from the two output files provided.

@danscales We usually perform QC against the dashboard tables on Flagship staging, but I think it should be pre-handoff in future updates.

dagibbs22 commented 10 months ago

@danscales Thanks for running the command locally. It also finished for me in the two test areas (GMB and IDN.14.13). I checked that the right carbon rasters were being read and that looked good.

Regarding determining if the outputs are correct, I like to run annualupdate_minimal locally to check that the emissions, removals, and net flux test areas for GMB and IDN14.13 match between annualupdate_minimal, carbonflux, and statistics I calculate in ArcMap (the gold standard, so to speak). It's not a thorough or full QC process and it only looks at carbon statistics, not all stats. Most of the carbon stats matched but I did notice that the AGB stock in the download table didn't match the AGB stock in the summary table by a surprisingly large margin.

My comparison spreadsheet is https://onewri-my.sharepoint.com/:x:/g/personal/david_gibbs_wri_org/EeLZGJBxr7ZElgxDJ3oCPwABXAd6rKSvknANDlF-0QmIdA?e=bwUsYd. I've filled in rows 14-22 with annualupdate_minimal results in the "GMB comparison" and "IDN14.13 comparison" tabs. The part that's not matching is E20:E22 in "IDN14.13 comparison". They should be very close to E14:E16. Any idea why they're so different? Other outputs I checked are much closer. This difference occurred in the Geotrellis run this summer, so it's not new. I don't think it's a large enough issue that it's worth delaying the annualupdate_minimal run but I wanted to flag it.