cardat / air-health-bushfire-smoke-netcdf

Software to share Australian bushfire smoke data, funded by CAR and ARDC. Supported by CurtinIC and ASDAF.
MIT License

user tool to extract point or polygon #20

Open ivanhanigan opened 1 year ago

ivanhanigan commented 1 year ago

This is a key requirement. The tool should allow Python and/or R users to extract values by layer and by time period (days).
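For Python users, the "by layer and time period (days)" part of the requirement could look roughly like the sketch below. It is not the project's API. It assumes the time axis holds one step per day starting on 1 January 2001 (inferred from the 2001 to 2020 range in the dataset name), and the names `day_slice` and `extract_period` are hypothetical. The values are toy numbers, not real data.

```python
from datetime import date

# Assumption: one time step per day, with the first step on 1 January 2001.
TIME_ORIGIN = date(2001, 1, 1)


def day_slice(start, end, origin=TIME_ORIGIN):
    """Return the slice of time indices covering start..end (inclusive)."""
    if end < start:
        raise ValueError("end must not be before start")
    if start < origin:
        raise ValueError("start is before the first time step")
    first = (start - origin).days
    last = (end - origin).days
    return slice(first, last + 1)


def extract_period(layers, layer, start, end, origin=TIME_ORIGIN):
    """Subset one named layer to a period of days.

    `layers` maps a layer name to a per-day sequence of values
    at a single location (hypothetical in-memory stand-in for a file).
    """
    if layer not in layers:
        raise KeyError(f"unknown layer: {layer}")
    return layers[layer][day_slice(start, end, origin)]


# Toy values only, keyed by a layer name that appears in this thread.
toy = {"pm25_pred": [5.0, 6.5, 30.2, 12.1, 7.3]}
first_days = extract_period(toy, "pm25_pred", date(2001, 1, 2), date(2001, 1, 4))
print(first_days)
```

The same index arithmetic would apply to the time dimension of an array read from a NetCDF file, provided the daily-step assumption holds for that file.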

Polygons

The code:

Points

Here is a snippet that could be useful:

  # NOTE: terra can open several files at once and works out how the time dimension is split across years
  fs <- dir("~/cloudstor/Shared/Bushfire_specific_PM25_Aus_2001_2020_v1_3/data_derived", full.names = TRUE)
  r1 <- terra::rast(fs, subds = "pm25_pred")
  r2 <- terra::rast(fs, subds = "trimmed_smoke_2SD_v1_3")
  # point coordinates in the rasters' projected CRS; this is Canberra
  xy <- cbind(1545315, -3954140)
  # xy is already a matrix; extract is called via terra:: because the package is not attached
  pm25_pred <- terra::extract(r1, xy)
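
For a Python extractor, the core of a point lookup is finding the grid cell that contains the point. The sketch below shows that arithmetic only. The grid origin, cell size and dimensions in `demo` are hypothetical placeholders, not the real dataset's values, and the sketch assumes rows are counted from the top (northern) edge, which would need checking against the actual files.

```python
import math
from typing import NamedTuple


class Grid(NamedTuple):
    xmin: float   # x of the left edge of the first column
    ymax: float   # y of the top edge of the first row
    res: float    # cell size in CRS units (metres here)
    ncols: int
    nrows: int


def cell_index(grid, x, y):
    """Return (row, col) of the cell containing the projected point (x, y)."""
    col = math.floor((x - grid.xmin) / grid.res)
    row = math.floor((grid.ymax - y) / grid.res)
    if not (0 <= col < grid.ncols and 0 <= row < grid.nrows):
        raise ValueError("point lies outside the grid")
    return row, col


# Hypothetical grid definition, for illustration only.
demo = Grid(xmin=-2_000_000.0, ymax=-1_000_000.0, res=5_000.0, ncols=800, nrows=800)

# The Canberra point from the R snippet above.
row, col = cell_index(demo, 1545315, -3954140)
print(row, col)
```

With the row and column known, a reader can request just that cell for all time steps instead of loading the whole grid.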

ivanhanigan commented 1 year ago

@truth-quark These are the best polygons for testing extraction: study_slas_01.shp in https://github.com/swish-climate-impact-assessment/biomass_smoke_events_db/tree/master/static/data_provided

truth-quark commented 1 year ago

NB: swish-climate-impact-assessment/biomass_smoke_events_db/blob/master/databases/storage.sqlite has verified bushfire events for testing against.

truth-quark commented 1 year ago

Idea: convert WGS84 coord(s) into Albers coords for extraction (with hyperslabbing?). Can NetCDF tools correctly extract or hyperslab if given Albers coords? Do the CDO tools have anything with the required functionality?
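
The coordinate conversion in this idea can be sketched in plain Python. This is a rough sketch under assumptions, not a tested part of the project: it assumes the grid uses GDA94 / Australian Albers (EPSG:3577; standard parallels at -18 and -36 degrees, origin at latitude 0 and longitude 132, GRS80 ellipsoid), and it treats WGS84 and GDA94 coordinates as interchangeable, which ignores an offset of a metre or two. The function name `wgs84_to_albers` is hypothetical. In practice a projection library such as pyproj would normally do this step.

```python
import math

# Assumed target CRS: GDA94 / Australian Albers (EPSG:3577), GRS80 ellipsoid.
A = 6378137.0                   # semi-major axis (m)
F = 1 / 298.257222101           # flattening
E2 = F * (2 - F)                # eccentricity squared
E = math.sqrt(E2)
LAT1, LAT2 = math.radians(-18.0), math.radians(-36.0)   # standard parallels
LAT0, LON0 = math.radians(0.0), math.radians(132.0)     # projection origin


def _m(lat):
    return math.cos(lat) / math.sqrt(1 - E2 * math.sin(lat) ** 2)


def _q(lat):
    s = math.sin(lat)
    return (1 - E2) * (s / (1 - E2 * s * s)
                       - math.log((1 - E * s) / (1 + E * s)) / (2 * E))


# Cone constants for the ellipsoidal Albers equal-area conic projection.
N = (_m(LAT1) ** 2 - _m(LAT2) ** 2) / (_q(LAT2) - _q(LAT1))
C = _m(LAT1) ** 2 + N * _q(LAT1)
RHO0 = A * math.sqrt(C - N * _q(LAT0)) / N


def wgs84_to_albers(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Albers x, y in metres."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    rho = A * math.sqrt(C - N * _q(lat)) / N
    theta = N * (lon - LON0)
    return rho * math.sin(theta), RHO0 - rho * math.cos(theta)


# Approximate longitude/latitude for Canberra (illustrative values).
x, y = wgs84_to_albers(149.13, -35.28)
print(round(x), round(y))
```

If the CRS assumption holds, the result should fall in the neighbourhood of the Canberra point quoted earlier in this thread. The projected x, y can then be turned into grid indices for a hyperslab request.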

ivanhanigan commented 1 year ago

The R package terra is very good for this. See the study locations set up in https://github.com/cardat/ResPrj_bushfire_pm25_v1_3_biomass_smoke_events_db_validation/blob/main/R/do_dat_bushfire_smoke_study_locations.R

Values at these locations can be easily extracted with https://github.com/cardat/ResPrj_bushfire_pm25_v1_3_biomass_smoke_events_db_validation/blob/main/R/do_dat_extract_bushfire_pred_at_db_locns.R

truth-quark commented 1 year ago

Note for reference from https://gdal.org/drivers/raster/netcdf.html: "This driver is intended only for importing remote sensing and geospatial datasets in form of raster images. If you want explore all data contained in NetCDF file you should use another tools." For a Python extractor, GDAL can therefore only be part of a solution.

truth-quark commented 1 year ago

Reference note for nctoolkit: https://nctoolkit.readthedocs.io/en/latest/supported.html "Most operations in nctoolkit rely on Climate Data Operators (CDO) to perform the heavy lifting. CDO requires that files have at most 4 dimensions, which should be longitude and latitude, and time and depth/height." Because of this CDO dependency, I'm excluding nctoolkit on these grounds:

nctoolkit raises a runtime error on import if the CDO version is < 2.0.5. Users need to either run a non-LTS Ubuntu release (e.g. 23.04) or compile CDO locally. Both options have drawbacks.

CDO 2.1.1 in Ubuntu 23.04: https://launchpad.net/ubuntu/+source/cdo