Closed manukala6 closed 10 months ago
The only results-changing issue I see is that the SDPT
type
pixel meaning doesn't exist, so that raster catalog entry won't work. Also, there's some duplication of the gross emissions (all gases) outputs inannualupdate_minimal
, which it would be nice to clean up. Also, I tried running this branch on GMB on my computer and got the errorException in thread "main" org.apache.spark.sql.AnalysisException: No such struct field mangroves in lossYear, threshold, drivers, primaryForest, wdpa, aze, plantedForests, mangroves1996, mangrovesLatest, tigerLandscapes, landmark, keyBiodiversityAreas, mining, peatlands, oilPalm, idnForestMoratorium, woodFiber, resourceRights, logging, isGain, intactForestLandscapes2000, landCover, tmlDensity
. I think this'd be pretty easy to fix but would rather leave it to the engineering team.
In case Maanas doesn't already have it, @dagibbs22 can you provide a command line to run the annualupdate_minimal
analysis locally if possible. That way, he can test that he fixed the struct field mangroves
error. It would be nice to eventually add a local test for annualupdate_minimal
(which I can likely do if you provide a sample command that will run locally).
@manukala6, in response to Dan's request for the local annualupdate_minimal
test commands: Here are the commands I use in the IntelliJ sbt shell to test annualupdate_minimal locally (for GMB and for IDN.14.13):
test:runMain org.globalforestwatch.summarystats.SummaryMain annualupdate_minimal --feature_type gadm --features file:/C:/GIS/Multi_project/Geotrellis_input_GADM/gadm36_adm2_1_1.csv --tcl --output file:/C:/GIS/Carbon_model/zonal_stats/ --iso GMB
test:runMain org.globalforestwatch.summarystats.SummaryMain annualupdate_minimal --feature_type gadm --features file:/C:/GIS/Multi_project/Geotrellis_input_GADM/gadm36_adm2_1_1.csv --tcl --output file:/C:/GIS/Carbon_model/zonal_stats/
--admin2 IDN.14.13_1
@danscales , in case it's helpful, here are the equivalent local tests for the carbonflux
package:
test:runMain org.globalforestwatch.summarystats.SummaryMain carbonflux --feature_type gadm --features file:/C:/GIS/Multi_project/Geotrellis_input_GADM/gadm36_adm2_1_1.csv --tcl --output file:/C:/GIS/Carbon_model/zonal_stats/ --iso GMB
test:runMain org.globalforestwatch.summarystats.SummaryMain carbonflux --feature_type gadm --features file:/C:/GIS/Multi_project/Geotrellis_input_GADM/gadm36_adm2_1_1.csv --tcl --output file:/C:/GIS/Carbon_model/zonal_stats/ -admin2 IDN.14.13_1
Looks good, but it'd be good to check with the frontend on what I mentioned above before changing that field name.
Attention: 139 lines
in your changes are missing coverage. Please review.
Comparison is base (
054a820
) 14.69% compared to head (d7aad80
) 18.15%. Report is 10 commits behind head on master.
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
FYI, I ran locally the annualupdate_minimal
command that David gave above with your latest changes, and it ran through to completion fine. I'm assuming the results were fine, but I don't know any particular way to verify the results.
@dagibbs22 The output is tab-delimited. Excel doesn't recognize this automatically, you'll need to modify the file type from .csv
to .tsv
before opening. Though, I am only seeing true/false values in the is__birdlife_alliance_for_zero_extinction_sites
column from the two output files provided.
@danscales We usually perform QC against the dashboard tables on Flagship staging, but I think it should be pre-handoff in future updates.
@danscales Thanks for running the command locally. It also finished for me in the two test areas (GMB and IDN.14.13). I checked that the right carbon rasters were being read and that looked good.
Regarding determining if the outputs are correct, I like to run annualupdate_minimal
locally to check that the emissions, removals, and net flux test areas for GMB and IDN14.13 match between annualupdate_minimal
, carbonflux
, and statistics I calculate in ArcMap (the gold standard, so to speak). It's not a thorough or full QC process and it only looks at carbon statistics, not all stats. Most of the carbon stats matched but I did notice that the AGB stock in the download table didn't match the AGB stock in the summary table by a surprisingly large margin.
My comparison spreadsheet is https://onewri-my.sharepoint.com/:x:/g/personal/david_gibbs_wri_org/EeLZGJBxr7ZElgxDJ3oCPwABXAd6rKSvknANDlF-0QmIdA?e=bwUsYd. I've filled in rows 14-22 with annualupdate_minimal
results in the "GMB comparison" and "IDN14.13 comparison" tabs. The part that's not matching is E20:E22 in "IDN14.13 comparison". They should be very close to E14:E16. Any idea why they're so different? Other outputs I checked are much closer. This difference occurred in the Geotrellis run this summer, so it's not new. I don't think it's a large enough issue that it's worth delaying the annualupdate_minimal
run but I wanted to flag it.
Pull request checklist
Please check if your PR fulfills the following requirements:
Pull request type
Please check the type of change your PR introduces: - [ ] Bugfix - [x] Feature - [ ] Code style update (formatting, renaming) - [ ] Refactoring (no functional changes, no api changes) - [ ] Build related changes - [ ] Documentation content changes - [ ] Other (please describe): ## What is the current behavior?Issue Number: GTC-2573
What is the new behavior?
Does this introduce a breaking change?
Other information