wri / gfw_forest_loss_geotrellis

Global Tree Cover Loss Analysis using Geotrellis and SPARK
MIT License

GTC-2826 Colombia-specific analysis in forest_change_diagnostic #238

Closed by danscales 4 months ago

danscales commented 4 months ago

We now do country-specific analyses for Brazil (BRA), Argentina (ARG), and Colombia (COL). Updated the gfwpro_forest_change_regions dataset to the new v20240529 version, which includes a bit indicating that a pixel is located in COL, so we can do the necessary Colombia-specific analysis.

Added a Colombia-specific classification based on the new col_frontera_agricola/v2024 dataset that I added. For Argentina, the analogous country-specific classification is OTBN.

For the GFWProCoverage layer, I changed the code to return the actual set of bits, rather than a map corresponding to all the 1 bits. Creating a new Map object for every single pixel being scanned seems very inefficient, whereas a bitmap with access functions is very efficient.
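
A rough sketch of the difference, not the actual layer code: the flag names, bit positions, and the `CoverageBits` wrapper below are hypothetical, since the real bit layout of GFWProCoverage is not shown in this description.

```scala
// Hypothetical bit positions; the real GFWProCoverage layout may differ.
object CoverageFlags {
  val FlagA: Int = 1 << 0
  val FlagB: Int = 1 << 1
  val FlagC: Int = 1 << 2
}

// Wraps the raw pixel value. Each accessor is a single mask test,
// so no Map is built for the pixel.
final case class CoverageBits(raw: Int) {
  private def has(mask: Int): Boolean = (raw & mask) != 0
  def isFlagA: Boolean = has(CoverageFlags.FlagA)
  def isFlagB: Boolean = has(CoverageFlags.FlagB)
  def isFlagC: Boolean = has(CoverageFlags.FlagC)
}

object CoverageBits {
  // The Map-per-pixel alternative, kept here only for contrast.
  def toMap(raw: Int): Map[String, Boolean] = {
    val bits = CoverageBits(raw)
    Map("flagA" -> bits.isFlagA, "flagB" -> bits.isFlagB, "flagC" -> bits.isFlagC)
  }
}
```

With this shape, a caller asks `CoverageBits(pixelValue).isFlagC` instead of looking up a key in a freshly allocated Map.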

New columns:

Existing columns that are extended to COL

To support classified_region_area, I had to add a new two-level categorization type, ForestChangeDiagnosticDataDoubleTwoCategory (i.e. values are keyed by two levels of string categories, and the final value is a Double). Added extra include checks in the fill operations of several of these data types, so we don't get an empty country entry when there is no forest loss, protected area, landmark, etc. for a particular country.
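
To illustrate the shape of such a type, here is a minimal sketch. `DoubleTwoCategory`, its `fill` signature, and the category strings used with it are hypothetical stand-ins, not the real ForestChangeDiagnosticDataDoubleTwoCategory API.

```scala
// Hypothetical stand-in: outer key = first category (e.g. country),
// inner key = second category (e.g. classification), value = area.
final case class DoubleTwoCategory(value: Map[String, Map[String, Double]]) {
  // Combine two results by summing values that share both keys.
  def merge(other: DoubleTwoCategory): DoubleTwoCategory = {
    val outerKeys = value.keySet ++ other.value.keySet
    DoubleTwoCategory(outerKeys.map { k1 =>
      val a = value.getOrElse(k1, Map.empty[String, Double])
      val b = other.value.getOrElse(k1, Map.empty[String, Double])
      val inner = (a.keySet ++ b.keySet).map { k2 =>
        k2 -> (a.getOrElse(k2, 0.0) + b.getOrElse(k2, 0.0))
      }.toMap
      k1 -> inner
    }.toMap)
  }
}

object DoubleTwoCategory {
  val empty: DoubleTwoCategory = DoubleTwoCategory(Map.empty)

  // The include check: when it fails, return the empty value instead of
  // creating an entry, so no empty country key appears in the output.
  def fill(cat1: String, cat2: String, area: Double,
           include: Boolean = true): DoubleTwoCategory =
    if (include && cat1.nonEmpty && cat2.nonEmpty)
      DoubleTwoCategory(Map(cat1 -> Map(cat2 -> area)))
    else empty
}
```

Merging two per-pixel results for the same pair of categories sums their areas, while a pixel that fails the include check contributes nothing.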

Added a new Colombia-specific test. Also removed "*.tsv" from .gitignore, since ignoring it makes it confusing when you want to add new TSV input files.

danscales commented 4 months ago

> No comments really, looks good to me! Although curious if you've noticed any performance implication as we've added all these regional datasets.

Because of the lazy tile loading that I put in for FCD (forest_change_diagnostic), and because the datasets are regional, we only load tiles for the regional datasets when analyzing COL, ARG, or BRA. For example, we were already loading PRODES deforestation info only for Brazil. So, I don't expect any performance implications for locations that are not in COL, ARG, or BRA.
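
The general pattern can be sketched as follows. This is an illustration under assumptions, not the FCD code itself: `RegionalTiles`, `RegionalAnalysis`, the `fetch` function, and the layer keys passed to it are all hypothetical.

```scala
// Hypothetical tile source; `fetch` stands in for whatever reads raster tiles.
final class RegionalTiles(fetch: String => Array[Double]) {
  // `lazy val` defers each fetch until first access and caches the result.
  lazy val prodes: Array[Double] = fetch("prodes")
  lazy val fronteraAgricola: Array[Double] = fetch("col_frontera_agricola")
}

object RegionalAnalysis {
  // Touches a regional layer only when the location is in that country,
  // so locations elsewhere never trigger a fetch.
  def regionalSum(isBrazil: Boolean, isColombia: Boolean,
                  tiles: RegionalTiles): Double = {
    val bra = if (isBrazil) tiles.prodes.sum else 0.0
    val col = if (isColombia) tiles.fronteraAgricola.sum else 0.0
    bra + col
  }
}
```

A location outside both countries never evaluates either `lazy val`, so its regional tiles are never read.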

Thanks for the review!