cardat / air-health-bushfire-smoke-netcdf

Software to share Australian bushfire smoke data, funded by CAR and ARDC. Supported by CurtinIC and ASDAF.
MIT License

user tool to extract point or polygon #20

Open ivanhanigan opened 10 months ago

ivanhanigan commented 10 months ago

This is a key requirement. It should allow Python and/or R users to extract values by layer and by time period (days).

Polygons

The code:

Points

Here is a snippet that could be useful:

  # NOTE: terra can read several files at once and understands how the time dimension is split across years
  fs <- dir("~/cloudstor/Shared/Bushfire_specific_PM25_Aus_2001_2020_v1_3/data_derived", full.names = TRUE)
  r1 <- terra::rast(fs, subds = "pm25_pred")
  r2 <- terra::rast(fs, subds = "trimmed_smoke_2SD_v1_3")
  # this is Canberra (x, y in the grid's projected coordinates)
  xy <- cbind(1545315, -3954140)
  # cbind() already returns a matrix, so it can be passed to extract() directly
  pm25_pred <- terra::extract(r1, xy)
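
For Python users, the lookup behind a point extraction can be sketched without any libraries. This is only an illustration of the idea (find the cell the point falls in, then read that cell from every daily layer), not the terra implementation. The grid origin, the 5 km cell size, the dates and the values are all made-up toy numbers, and `cell_index` and `extract_point` are hypothetical helper names.

```python
import math


def cell_index(x, y, xmin, ymax, res, nrows, ncols):
    """Return (row, col) of the cell containing (x, y), or None if outside.

    Rows are counted from the top edge (ymax) downwards and columns from the
    left edge (xmin), as in a north-up raster.
    """
    col = math.floor((x - xmin) / res)
    row = math.floor((ymax - y) / res)
    if 0 <= row < nrows and 0 <= col < ncols:
        return row, col
    return None


def extract_point(layers, x, y, xmin, ymax, res):
    """Read one value per day from a dict of {day: 2-D list of cell values}."""
    first = next(iter(layers.values()))
    idx = cell_index(x, y, xmin, ymax, res, len(first), len(first[0]))
    if idx is None:
        return {day: None for day in layers}
    row, col = idx
    return {day: grid[row][col] for day, grid in layers.items()}


# Toy 3 x 3 grid placed around the Canberra coordinate used in the R snippet.
# Origin, cell size and values are assumptions for illustration only.
XMIN, YMAX, RES = 1535000.0, -3945000.0, 5000.0
toy_layers = {
    "2020-01-01": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]],
    "2020-01-02": [[10.0, 20.0, 30.0], [40.0, 50.0, 60.0], [70.0, 80.0, 90.0]],
}

canberra = extract_point(toy_layers, 1545315, -3954140, XMIN, YMAX, RES)
outside = extract_point(toy_layers, 0, 0, XMIN, YMAX, RES)
```

With a real NetCDF file the layers would come from a reader library rather than a dict, but the index arithmetic would stay the same.
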
ivanhanigan commented 10 months ago

@truth-quark These are the best polygons to test extraction with: study_slas_01.shp in https://github.com/swish-climate-impact-assessment/biomass_smoke_events_db/tree/master/static/data_provided

truth-quark commented 10 months ago

NB: swish-climate-impact-assessment/biomass_smoke_events_db/blob/master/databases/storage.sqlite has verified bushfire events for testing against.

truth-quark commented 10 months ago

Idea: convert WGS84 coord(s) into Albers coords for extraction (with hyperslabbing?). Can NetCDF tools correctly extract or hyperslab if given Albers coords? Do the CDO tools have anything with the required functionality?
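
A minimal sketch of the coordinate conversion, in case it helps the discussion. It assumes the grid uses GDA94 / Australian Albers (EPSG:3577: standard parallels -18 and -36, origin at latitude 0 and longitude 132, GRS80 ellipsoid), which is not confirmed in this thread, and it treats WGS84 and GDA94 as coincident (they differ by a metre or two). The formulas are the standard ellipsoidal Albers equal-area forward equations; `lonlat_to_albers` is a hypothetical helper name. In practice a projection library would normally do this step.

```python
import math

# GRS80 ellipsoid, as used by GDA94 (assumed datum of the grid).
SEMI_MAJOR = 6378137.0
FLATTENING = 1 / 298.257222101
E2 = FLATTENING * (2 - FLATTENING)  # eccentricity squared
E = math.sqrt(E2)

# EPSG:3577 projection parameters (assumed, not confirmed in this thread).
LAT_1 = math.radians(-18.0)  # first standard parallel
LAT_2 = math.radians(-36.0)  # second standard parallel
LAT_0 = math.radians(0.0)    # latitude of origin
LON_0 = math.radians(132.0)  # central meridian


def _m(phi):
    return math.cos(phi) / math.sqrt(1 - E2 * math.sin(phi) ** 2)


def _q(phi):
    s = math.sin(phi)
    return (1 - E2) * (
        s / (1 - E2 * s * s) - math.log((1 - E * s) / (1 + E * s)) / (2 * E)
    )


# Projection constants; N is negative for southern-hemisphere parallels.
N = (_m(LAT_1) ** 2 - _m(LAT_2) ** 2) / (_q(LAT_2) - _q(LAT_1))
C = _m(LAT_1) ** 2 + N * _q(LAT_1)
RHO_0 = SEMI_MAJOR * math.sqrt(C - N * _q(LAT_0)) / N


def lonlat_to_albers(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Albers x, y in metres."""
    phi = math.radians(lat_deg)
    theta = N * (math.radians(lon_deg) - LON_0)
    rho = SEMI_MAJOR * math.sqrt(C - N * _q(phi)) / N
    return rho * math.sin(theta), RHO_0 - rho * math.cos(theta)


# Approximate longitude/latitude of Canberra.
x, y = lonlat_to_albers(149.13, -35.28)
```

Under these assumptions the result should land close to the Canberra x, y pair quoted in the R snippet earlier in this issue, which would be one way to check that the grid really is in this projection before hyperslabbing on the projected coordinates.
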

ivanhanigan commented 9 months ago

The R package terra is very good for this. See the study locations prepared in https://github.com/cardat/ResPrj_bushfire_pm25_v1_3_biomass_smoke_events_db_validation/blob/main/R/do_dat_bushfire_smoke_study_locations.R

Values at those locations can then be extracted easily with https://github.com/cardat/ResPrj_bushfire_pm25_v1_3_biomass_smoke_events_db_validation/blob/main/R/do_dat_extract_bushfire_pred_at_db_locns.R

truth-quark commented 9 months ago

Note for reference from https://gdal.org/drivers/raster/netcdf.html: "This driver is intended only for importing remote sensing and geospatial datasets in form of raster images. If you want explore all data contained in NetCDF file you should use another tools." So for a Python extractor, GDAL can only be part of a solution.

truth-quark commented 9 months ago

Reference note for nctoolkit: https://nctoolkit.readthedocs.io/en/latest/supported.html "Most operations in nctoolkit rely on Climate Data Operators (CDO) to perform the heavy lifting. CDO requires that files have at most 4 dimensions, which should be longitude and latitude, and time and depth/height.". Based on this, I'm excluding nctoolkit on these grounds:

nctoolkit raises a runtime error on import if the CDO version is < 2.0.5. Users need either a non-LTS Ubuntu release (e.g. 23.04) or a locally compiled CDO. Both options have drawbacks.

CDO 2.1.1 in Ubuntu 23.04: https://launchpad.net/ubuntu/+source/cdo
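
For anyone who wants to check their own system against that threshold, here is a small sketch of the version comparison. It only parses a version string; the banner text in the example is an assumed illustration of what a CDO version message might look like, and `parse_version` and `cdo_is_new_enough` are hypothetical helper names, not part of nctoolkit or CDO.

```python
import re

# Minimum CDO version noted above for importing nctoolkit.
MIN_CDO = (2, 0, 5)


def parse_version(text):
    """Return the first x.y.z version found in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no x.y.z version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def cdo_is_new_enough(text, minimum=MIN_CDO):
    """Compare versions as integer tuples so that 2.10.0 sorts after 2.9.0."""
    return parse_version(text) >= minimum


# Assumed example banner; real output may be formatted differently.
banner = "Climate Data Operators version 2.1.1"
ok = cdo_is_new_enough(banner)
```

Comparing integer tuples rather than raw strings avoids the usual trap where "2.10.0" sorts before "2.9.0".
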