NASA-IMPACT / veda-docs

Documentation for the VEDA Project
https://nasa-impact.github.io/veda-docs
Apache License 2.0

Fix up downsample-zarr notebook #139

Closed jsignell closed 5 months ago

jsignell commented 7 months ago

github-actions[bot] commented 7 months ago

PR Preview Action v1.4.7: Deployed preview to https://NASA-IMPACT.github.io/veda-docs/pr-preview/pr-139/ on branch gh-pages at 2024-05-01 20:45 UTC

review-notebook-app[bot] commented 7 months ago


wildintellect commented on 2024-04-25T23:27:09Z ----------------------------------------------------------------

"for to" probably "to"?

"within the memory limits of the notebook"? Is this really the notebook's memory or the Jupyter instance's memory? Also, can we clarify that Dask is being used to compute in parallel on a local Dask cluster, which is why the memory limits of the running instance matter?

I find the Downsample and Coarsen terminology a little foreign (seems like other tools might call this something else).

Coarsen aka aggregate?

Downsample aka Sub-select?

Is this still using Datashader? The calls below only use hvplot now (directly)


jsignell commented on 2024-04-26T12:59:47Z ----------------------------------------------------------------

Yeah, I didn't touch the language in this PR, but these are good thoughts. I can try to make it clearer. It is still using Datashader; it's just internal now (via the rasterize kwarg).

jsignell commented on 2024-05-01T19:12:38Z ----------------------------------------------------------------

I think "aggregate" implies that the whole dimension will be collapsed, so I left "coarsening".
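To illustrate the distinction drawn in this comment, here is a minimal sketch in xarray. The array, its dimension names, and the 2x2 window are hypothetical and are not taken from the notebook under review; the point is only that coarsening keeps the dimension at a lower resolution, while a full aggregation removes it.

```python
import numpy as np
import xarray as xr

# Hypothetical 4x6 grid; names and values are illustrative only.
da = xr.DataArray(
    np.arange(24, dtype=float).reshape(4, 6),
    dims=("y", "x"),
    coords={"y": np.arange(4), "x": np.arange(6)},
    name="demo",
)

# Coarsening: average non-overlapping 2x2 blocks.
# Both dimensions survive, each at half the original resolution.
coarse = da.coarsen(y=2, x=2).mean()

# Full aggregation: the x dimension is collapsed entirely.
collapsed = da.mean(dim="x")

print(coarse.dims, coarse.shape)        # ('y', 'x') (2, 3)
print(collapsed.dims, collapsed.shape)  # ('y',) (4,)
```

In this sketch the window sizes divide the dimension lengths exactly, so the default boundary handling of `coarsen` does not need to be changed.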

wildintellect commented on 2024-05-01T20:13:42Z ----------------------------------------------------------------

I think this is just a terminology difference with GIS users.

https://pro.arcgis.com/en/pro-app/3.1/tool-reference/data-management/resample.htm <- aggregate

https://rspatial.github.io/terra/reference/aggregate.html

GDAL doesn't even differentiate and just calls it resampling (https://gis.stackexchange.com/a/262318), though users clearly call it downsampling.

I see why it gets confusing: in GRASS it's r.resample, like GDAL, but if it's time, it's t.rast.aggregate.

jsignell commented on 2024-05-01T20:41:56Z ----------------------------------------------------------------

I also decided "downsample" was correct for the time dimension. I think the name is based on the resample method in xarray (and pandas): there is "upsample" for when you end up with more values than you started with, and "downsample" for when you end up with fewer. To me, "subselect" should be more like taking an AOI or a particular month, that is, taking a piece of the data at the existing resolution rather than changing the resolution. Obviously you can use select to change the step, but that's not what first comes to my mind.
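The terminology in this comment can be sketched with xarray's time-based methods. The daily series below is hypothetical (60 days starting 2024-01-01), and the frequencies are arbitrary choices for illustration, not the ones used in the notebook.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical daily series covering January and February 2024.
time = pd.date_range("2024-01-01", periods=60, freq="D")
da = xr.DataArray(
    np.arange(60, dtype=float), dims="time", coords={"time": time}
)

# Downsample: fewer values than we started with (one mean per month).
downsampled = da.resample(time="MS").mean()

# Upsample: more values than we started with (12-hourly, forward-filled).
upsampled = da.resample(time="12h").ffill()

# Subselect: a piece of the data at its existing daily resolution.
january = da.sel(time="2024-01")

# Selecting with a step also reduces the number of values,
# but it skips data rather than summarizing it.
strided = da.isel(time=slice(None, None, 7))

print(downsampled.sizes["time"], january.sizes["time"], strided.sizes["time"])
```

Note that only the `resample(...).mean()` path uses every original value; the strided selection simply drops six out of every seven days.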