isciences / exactextract

Fast and accurate raster zonal statistics
Apache License 2.0

enhancement, output multiple columns of the polygon dataset #22

Closed zhanlilz closed 9 months ago

zhanlilz commented 3 years ago

Hi, thanks for the great package! I'm wondering if an enhancement of the command line program is possible. In the R package, we can output multiple columns of the input polygon dataset. With the command line program, however, we can only specify one -f column_name to be exported to the output CSV file. Would it be possible to have multiple columns of the polygon dataset written to the output, like the append_cols argument of the R function exact_extract?

I'm willing to try to help and contribute. But I'd like to hear about the feasibility of adding this function.

Cheers!

dbaston commented 3 years ago

I think the assumption of "each feature has a single ID field" is pretty baked in. Of course that can be changed, but I'm not sure it's worth the complexity. Clearly you're familiar with the R package; can I ask your motivation for using the command line instead?

zhanlilz commented 3 years ago

Sometimes I'd like to use the command line in a shell script to automate processing steps. Since you don't think the benefit is worth the complexity of changing the C++ code, I see two options for this kind of automation in shell scripts:

  1. Choose a single ID field to be written by the exactextract command, then use another command to join the output CSV with the original attribute table of the input polygon dataset. This is not always straightforward, because common tools like awk and join are not robust when dealing with CSV files (e.g., when values in a column are strings that themselves contain commas), as I have experienced myself. But it can be done with awk or join.
  2. Build a command line tool by wrapping the R function exact_extract in an R script, then call this R script from shell scripts. This may be the easier way to go.
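
For option 1, the comma-in-string problem can be avoided by doing the join with a CSV-aware parser instead of awk or join. Below is a minimal sketch in Python, assuming the attribute table of the polygon dataset has already been exported to CSV. The column names (fid, name, region, mean), the sample rows, and the helper join_on_id are all hypothetical and only illustrate the idea; they are not part of exactextract.

```python
import csv
import io


def join_on_id(stats_rows, attr_rows, id_field):
    """Left-join zonal-statistics rows with attribute rows on id_field."""
    # Index the attribute table by ID for constant-time lookup.
    attrs_by_id = {row[id_field]: row for row in attr_rows}
    joined = []
    for row in stats_rows:
        # Start from the attributes (empty if the ID is missing),
        # then add the statistics columns on top.
        merged = dict(attrs_by_id.get(row[id_field], {}))
        merged.update(row)
        joined.append(merged)
    return joined


# Hypothetical stand-ins for the CSV written by the command line program
# and for the exported attribute table. Note the quoted value with a comma.
stats_csv = 'fid,mean\n1,12.5\n2,7.25\n'
attrs_csv = 'fid,name,region\n1,"Springfield, East",A\n2,Shelbyville,B\n'

stats = list(csv.DictReader(io.StringIO(stats_csv)))
attrs = list(csv.DictReader(io.StringIO(attrs_csv)))
result = join_on_id(stats, attrs, "fid")

# Write the joined table; the csv module re-quotes fields where needed.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["fid", "name", "region", "mean"])
writer.writeheader()
writer.writerows(result)
print(out.getvalue())
```

In a shell script, the same logic could read from real files with open(...) in place of io.StringIO. The csv module handles quoted fields, so a value like "Springfield, East" stays in a single column.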

These two options are probably sufficient, and the better choice if you think the complexity outweighs the benefits. I think it's okay to close this issue now.

dbaston commented 9 months ago

Implemented in c7d03c92e59bd83750663218cd4b6b13305080ab