isciences / exactextractr

R package for fast and accurate raster zonal statistics
https://isciences.gitlab.io/exactextractr/

Feature request: Add polygon ID variable to output #35

Closed jake-wittman closed 3 years ago

jake-wittman commented 4 years ago

Not sure if this is the appropriate place for this, but I'd like to suggest a feature that adds an ID variable to the output dataframe when you're using exact_extract() with a set of polygons over many raster layers. That would really facilitate joins with other data frames.

It's entirely possible I missed this functionality while reading the documentation, so if it already exists please let me know!

dbaston commented 4 years ago

You didn't miss anything. I started adding this in a few months ago, and I can't remember why I abandoned it. Maybe I didn't convince myself that

exact_extract(prec, brazil, c('mean', 'stdev'), include_cols='GID_2')

was that much better than

cbind(GID_2=brazil$GID_2,
      exact_extract(prec, brazil, c('mean', 'stdev')))

But given that we already have include_xy and include_cell, I guess it's not a bad idea.
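
For the join use case raised in the original request, the cbind() workaround can be sketched in base R as follows. The data frames below are made-up stand-ins (the `stats`, `polygons`, and `population` tables and their values are hypothetical, not output from the package), and the sketch assumes the summary rows come back in the same order as the input polygons, which is what the cbind() approach relies on.

```r
# Stand-in for the summary data frame returned for c('mean', 'stdev');
# the numbers are invented for illustration.
stats <- data.frame(mean  = c(101.2, 98.7, 110.4),
                    stdev = c(5.1, 4.3, 6.8))

# Stand-in for the polygon attribute table (brazil in the thread).
polygons <- data.frame(GID_2 = c("BRA.1.1_1", "BRA.1.2_1", "BRA.1.3_1"),
                       stringsAsFactors = FALSE)

# The workaround: attach the ID column by row position.
with_id <- cbind(GID_2 = polygons$GID_2, stats)

# A hypothetical second table keyed on the same ID, in a different row order.
population <- data.frame(GID_2 = c("BRA.1.3_1", "BRA.1.1_1", "BRA.1.2_1"),
                         pop   = c(3000, 1000, 2000),
                         stringsAsFactors = FALSE)

# With the ID present, the join no longer depends on row order.
joined <- merge(with_id, population, by = "GID_2")
print(joined)
```

Once the ID is a column, merge() matches rows by key, so the second table can be in any order.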

ernstste commented 4 years ago

I like wittja01's idea. This could even be extended to create an output similar to the raster::extract(df = TRUE) behaviour, where a single data frame is returned rather than a list of data frames. However, having an extra column with the ID would also do the job, as a simple do.call(rbind, ...) on the returned list would create the desired single data frame.
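
A minimal sketch of that idea in base R, using a hand-built list in place of real output: the `per_polygon` list, its IDs, and its values are hypothetical, and the `value` and `coverage_fraction` column names are assumed to match the per-polygon data frames the package returns when no summary function is given.

```r
# Stand-in for a list of per-polygon data frames, each already carrying
# an ID column; all values are invented for illustration.
per_polygon <- list(
  data.frame(GID_2 = "A", value = c(10, 12),
             coverage_fraction = c(1, 0.5), stringsAsFactors = FALSE),
  data.frame(GID_2 = "B", value = c(7, 9, 11),
             coverage_fraction = c(0.25, 1, 0.75), stringsAsFactors = FALSE)
)

# Stack the list into one data frame; the ID column keeps the
# polygon membership of every pixel row.
combined <- do.call(rbind, per_polygon)

# Example follow-up: coverage-weighted mean per polygon ID.
weighted_means <- sapply(
  split(combined, combined$GID_2),
  function(d) weighted.mean(d$value, d$coverage_fraction)
)
print(combined)
print(weighted_means)
```

Without the ID column, the row-bound result would lose track of which polygon each pixel belongs to, which is why the extra column is enough to make the single data frame usable.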

dbaston commented 3 years ago

I have a test branch that (I think) tackles both of these cases:

devtools::install_git('https://gitlab.com/isciences/exactextractr', ref='append-cols')

Using the datasets in the README:

To include columns from the input in summarized results, use append_cols:

exact_extract(prec, brazil[1:10, ], 'sum', append_cols='GID_2')

To have columns from the input included in a data frame with each pixel value, use include_cols (analogous to include_xy and include_cell):

do.call(rbind, exact_extract(prec, brazil[1:10, ], include_cols='GID_2'))

dbaston commented 3 years ago

Resolved by 96daf16ecbd30a29c6967ee425ad967e32a4d01f