pedrocamargo / rasterstats


Add N to reported stats #2

Open aloboa opened 6 years ago

aloboa commented 6 years ago

Could you add the number of pixels (N) to the reported stats? Also, if possible, since you are already reporting the median, could you add the MAD (median absolute deviation, https://en.wikipedia.org/wiki/Median_absolute_deviation), which is a robust estimator of the standard deviation? (Median and MAD are the robust counterparts of mean and standard deviation.)
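For reference, the statistic requested here can be sketched in a few lines of Python. This is only an illustration of the definition, MAD = median(|x_i - median(x)|), and not the plugin's code; the function name and the optional `scale` parameter are assumptions made for this example.

```python
from statistics import median


def median_absolute_deviation(values, scale=1.0):
    """Return scale * median(|x_i - median(x)|) for a sequence of numbers.

    With scale=1.0 this is the raw MAD. A scale of about 1.4826 makes it
    comparable to the standard deviation for normally distributed data.
    """
    data = [float(v) for v in values]  # floats avoid any integer truncation
    if not data:
        raise ValueError("MAD is undefined for an empty sequence")
    center = median(data)
    return scale * median(abs(x - center) for x in data)


pixel_values = [1, 1, 2, 2, 4, 6, 9]
print(len(pixel_values))                        # N, the number of pixels
print(median(pixel_values))                     # median
print(median_absolute_deviation(pixel_values))  # raw MAD
```

For the sample values above the median is 2, the absolute deviations are 1, 1, 0, 0, 2, 4, 7, and their median (the MAD) is 1.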

pedrocamargo commented 6 years ago

I will look into adding these statistics. The update will probably come for QGIS 3 only.

pedrocamargo commented 6 years ago

@aloboa The features have been included, and I have uploaded the new version to the QGIS repository. We just have to wait for them to approve it now. A version for QGIS 3 will come at some point.

aloboa commented 6 years ago

Successfully tested on 2.18. Thanks! Agus


aloboa commented 6 years ago

I think that the results are wrong. I became suspicious because the median and MAD values I was getting were always integers, which is unlikely even with integer input data. I then ran a small test (see R code for generating the test raster at the end):

test raster: https://www.dropbox.com/s/nilutodduzd0gzn/test.tif?dl=0

test polygons (drawn in QGIS): https://www.dropbox.com/s/lpgswivazh22xsf/test.zip?dl=0

rasterstats output CSV: https://www.dropbox.com/s/kskupclvvqjv6q2/test.csv?dl=0

The results do not correspond to what I get in R, and you can check that even the number of pixels is wrong by counting them directly on the image: rasterStats reports 10, 18 and 14 (one value per polygon), while you can count 8, 12 and 12. Also, "average" and "mean" are redundant.

You can check the R method and results in this PDF: https://www.dropbox.com/s/jywn87kp5h3pk59/testRasterStats_log.pdf?dl=0

Agus

pedrocamargo commented 6 years ago

Hey Agus,

Mean and average were indeed redundant. The count, however, is right. If you zoom in enough, you will see that each polygon touches the number of pixels reported, and that is how the statistics are computed. I could look into computing the statistics so that, for each polygon, only pixels whose centroids fall inside that polygon are considered, but that is not the approach taken so far.
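The difference between the two selection rules discussed here can be sketched as follows. This is a simplified illustration, not the plugin's implementation: it assumes an axis-aligned rectangular zone instead of an arbitrary polygon, a grid whose origin is at (0, 0), and a y coordinate that increases with the row index. All function names are hypothetical.

```python
def touched_pixels(zone, n_rows, n_cols, size=1.0):
    """Pixels whose area overlaps the zone at all (the 'any touch' rule)."""
    xmin, ymin, xmax, ymax = zone
    hits = []
    for row in range(n_rows):
        for col in range(n_cols):
            px0, py0 = col * size, row * size
            px1, py1 = px0 + size, py0 + size
            # Two rectangles overlap when they overlap on both axes.
            if px0 < xmax and px1 > xmin and py0 < ymax and py1 > ymin:
                hits.append((row, col))
    return hits


def centroid_pixels(zone, n_rows, n_cols, size=1.0):
    """Pixels whose centre point lies inside the zone (the 'centroid' rule)."""
    xmin, ymin, xmax, ymax = zone
    hits = []
    for row in range(n_rows):
        for col in range(n_cols):
            cx, cy = (col + 0.5) * size, (row + 0.5) * size
            if xmin <= cx <= xmax and ymin <= cy <= ymax:
                hits.append((row, col))
    return hits


zone = (0.8, 0.8, 3.1, 2.6)  # hypothetical zone: xmin, ymin, xmax, ymax
print(len(touched_pixels(zone, 4, 4)))   # pixel count under the 'any touch' rule
print(len(centroid_pixels(zone, 4, 4)))  # pixel count under the 'centroid' rule
```

On this 4 x 4 grid the zone touches 12 pixels but contains the centres of only 4, which shows how far the two counts can diverge when zones are only a few pixels wide.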

I don't have a view either way: I built this plugin to support my brother's work, and my own work with rasters involves polygons that are much bigger (about 2 orders of magnitude) than the raster pixels, which dwarfs the differences you are pointing to. What is your view on this? Is there any literature I should look into?

On the MAD computation, I did make a mistake by not converting the numbers to floats, so the next version will include that fix (and will also remove the avg/mean redundancy).
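The thread does not show the plugin's code, so the following is only one plausible way that skipping the float conversion could produce always-integer medians (and therefore integer MADs). QGIS 2.x plugins run on Python 2, where `/` between two integers truncates; the sketch emulates that with `//`. Both function names are hypothetical.

```python
def median_truncating(values):
    """Median using integer arithmetic, as Python 2's '/' would on ints."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2:
        return ordered[mid]
    # Floor division drops the fractional part of the midpoint.
    return (ordered[mid - 1] + ordered[mid]) // 2


def median_float(values):
    """Median after converting the inputs to floats first."""
    ordered = sorted(float(v) for v in values)
    n = len(ordered)
    mid = n // 2
    if n % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2.0


pixels = [3, 4, 7, 9]
print(median_truncating(pixels))  # integer result, fractional part lost
print(median_float(pixels))       # float result
```

For these four values the true median is 5.5, but the integer version returns 5.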

aloboa commented 6 years ago

I see, this is why your plugin is so fast. Please note in the documentation that you count any intersected pixel as included, even if the intersected area is very small. The correct way would be to weight the value of each pixel in proportion to the intersected area.

I've also just noticed that this plugin processes a single band only. Note that there is a Zonal statistics plugin which lets you select the band to process. Unfortunately, only one band can be selected at a time. I thought yours would process all bands.
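A minimal sketch of the area-weighted approach suggested here is shown below. As a simplifying assumption, the zone is again an axis-aligned rectangle, so the intersected area of each pixel is just the product of two 1-D overlaps; a real polygon would need a proper geometry intersection. The function names and the sample raster are hypothetical.

```python
def overlap_1d(a0, a1, b0, b1):
    """Length of the overlap between intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))


def area_weighted_mean(raster, zone, size=1.0):
    """Mean of pixel values weighted by the area each pixel shares with the zone.

    raster is a list of rows; pixel (row, col) covers
    [col*size, (col+1)*size] x [row*size, (row+1)*size].
    """
    xmin, ymin, xmax, ymax = zone
    weighted_sum = 0.0
    total_area = 0.0
    for row, line in enumerate(raster):
        for col, value in enumerate(line):
            area = (overlap_1d(col * size, (col + 1) * size, xmin, xmax)
                    * overlap_1d(row * size, (row + 1) * size, ymin, ymax))
            weighted_sum += area * value
            total_area += area
    if total_area == 0.0:
        raise ValueError("zone does not intersect the raster")
    return weighted_sum / total_area


raster = [[10, 20],
          [30, 40]]
zone = (0.5, 0.0, 2.0, 1.0)  # covers half of pixel (0, 0) and all of pixel (0, 1)
print(area_weighted_mean(raster, zone))
```

Here the two intersected pixels have weights 0.5 and 1.0, so the weighted mean is (0.5 * 10 + 1.0 * 20) / 1.5, roughly 16.67, whereas treating both touched pixels equally would give 15.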


pedrocamargo commented 6 years ago

Hi Agustin,

I'll make the appropriate notes on the documentation.

I'll also include a band selection feature in the next version.

I might look into the proportional computation of pixels by area, but that will depend on GDAL support.

Cheers, Pedro
