noaa-onms / cinms

Channel Islands National Marine Sanctuary
https://noaa-onms.github.io/cinms
MIT License

do we want a different measure of variation? #52

Closed superjai closed 4 years ago

superjai commented 4 years ago

I noticed something that we might want to correct in the interactive graphs. Take a look at the chlorophyll graph below: [image: Screen Shot 2020-10-13 at 9 42 25 PM] The measure of variation used here is the standard deviation. Minor problem! A band of plus or minus one standard deviation is symmetric about the mean, which is why the lower curve sometimes goes below zero (which obviously makes no physical sense). Do we want to use an asymmetric measure of variation instead, so that we don't cross zero?
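To make the problem concrete, here is a small sketch using made-up, right-skewed, non-negative values (not the sanctuary data). It compares a lower bound of mean minus one standard deviation with a 5th-percentile lower bound; the variable names are illustrative only.

```r
# Hypothetical, right-skewed, non-negative values standing in for chlorophyll.
chl <- c(0.1, 0.2, 0.2, 0.3, 0.4, 5.0)

# Symmetric band: mean +/- 1 standard deviation.
sd_lower <- mean(chl) - sd(chl)
sd_upper <- mean(chl) + sd(chl)

# Asymmetric band: 5th and 95th percentiles of the same values.
q <- quantile(chl, probs = c(0.05, 0.95), names = FALSE)
q_lower <- q[1]
q_upper <- q[2]

# The symmetric lower bound dips below zero; the percentile bound cannot
# fall below the smallest observed value.
print(c(sd_lower = sd_lower, q_lower = q_lower))
```

Because percentiles always lie within the range of the observed values, a percentile band cannot cross zero when the data are non-negative.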

bbest commented 4 years ago

Good point @superjai! Let's use quantile() and update the function to allow any of these stats. The trick with quantile() is that it needs another argument (the probability). I'm thinking that a value of the statistic argument such as "quantile05" or "quantile95" could be parsed to get both the function and its argument.
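One way the parsing idea might look is sketched below. The helper name parse_statistic and the two-digit naming rule are assumptions made for illustration; this is not the code that ended up in nms4r.

```r
# Hypothetical helper: turn a statistic name into a function of a numeric vector.
# Names like "quantile05" are split into the function (quantile) and the
# probability (0.05); any other name is looked up as an ordinary function.
parse_statistic <- function(statistic) {
  if (grepl("^quantile[0-9]{2}$", statistic)) {
    prob <- as.numeric(sub("^quantile", "", statistic)) / 100
    function(x, na.rm = TRUE) {
      quantile(x, probs = prob, na.rm = na.rm, names = FALSE)
    }
  } else {
    stat_fun <- match.fun(statistic)
    function(x, na.rm = TRUE) stat_fun(x, na.rm = na.rm)
  }
}

# Usage on made-up values with one missing entry.
x <- c(0:100, NA)
lower  <- parse_statistic("quantile05")(x)
upper  <- parse_statistic("quantile95")(x)
middle <- parse_statistic("mean")(x)
```

Returning a closure that always accepts na.rm keeps the calling convention the same for every statistic, whether it is a quantile or something like mean or sd.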

superjai commented 4 years ago

That seems like a good way to go, @bbest. Do we want to go with the 5th and 95th percentiles for the lower and upper bounds?

superjai commented 4 years ago

@bbest I am having trouble getting the quantile function to play nicely with raster::extract, which ply2erddap uses to pull the required stats from the raster data. If I use the default probabilities for quantile, everything works fine. However, if I attempt to use custom probabilities, raster::extract breaks. I am of the opinion that the defaults should be just fine for this purpose. With the defaults, we'll get the 25th and 75th percentiles (along with the minimum, median, and maximum), which is pretty close in spread to our previous graphing unit (1 standard deviation).

Anyway, on with the problem, in case you want to go with custom quantiles. Let's first show the version that actually works with the default quantiles.

> nms4r::ply2erddap("cinms", "jplMURSST41mday", "sst", 2020, 7, "quantile")
     quantile
[1,] 15.79532
[2,] 16.67290
[3,] 17.93039
[4,] 19.06013
[5,] 20.24680
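The five rows above match what quantile() returns when no probs are given. A minimal base R sketch on made-up numbers shows the default probabilities, and uses qnorm() to indicate how wide the 25th to 75th percentile band is relative to one standard deviation under a normal assumption (which may not hold for these data).

```r
# Made-up values, only to show what the defaults return.
x <- 1:9

# With no probs argument, quantile() returns the 0%, 25%, 50%, 75% and 100%
# quantiles: minimum, lower quartile, median, upper quartile, maximum.
default_q <- quantile(x)
print(default_q)

# The interquartile band on its own.
iqr_band <- quantile(x, probs = c(0.25, 0.75))

# For a normal distribution the 75th percentile sits about 0.674 standard
# deviations above the mean, so the quartile band is somewhat narrower
# than mean +/- 1 standard deviation.
z75 <- qnorm(0.75)
```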

OK, now let's try custom quantiles. Let's start by defining those quantiles:

custom_quants <- function(input_data){
  return(quantile(input_data,probs=c(0.05,0.5,0.95)))
}

When running this through ply2erddap, here's the result:

> nms4r::ply2erddap("cinms", "jplMURSST41mday", "sst", 2020, 7, "custom_quants")
Error in fun(res[[i]], na.rm = na.rm) : unused argument (na.rm = na.rm)

The error persists even if na.rm = TRUE is specified within the body of custom_quants.
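The error message suggests the summary function is being called as fun(values, na.rm = na.rm), so the function itself would need an na.rm parameter; setting na.rm inside the body does not change its signature. The sketch below reproduces that in base R with a hypothetical stand-in caller (call_with_na_rm is not part of the raster package). Whether raster::extract then accepts a function returning three values is not tested here.

```r
# Hypothetical stand-in that invokes the summary function the way the
# error message suggests. This is not code from the raster package.
call_with_na_rm <- function(fun, values, na.rm = TRUE) {
  fun(values, na.rm = na.rm)
}

# Definition from the thread: no na.rm parameter, so the call fails
# with "unused argument".
custom_quants <- function(input_data) {
  quantile(input_data, probs = c(0.05, 0.5, 0.95))
}

# Variant that accepts na.rm and forwards it to quantile().
custom_quants_na <- function(input_data, na.rm = FALSE) {
  quantile(input_data, probs = c(0.05, 0.5, 0.95), na.rm = na.rm)
}

# Made-up values with one missing entry.
values <- c(15.8, 16.7, NA, 17.9, 19.1, 20.2)

# The first call errors; the second returns three quantiles.
failed <- inherits(try(call_with_na_rm(custom_quants, values), silent = TRUE),
                   "try-error")
result <- call_with_na_rm(custom_quants_na, values)
```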

bbest commented 4 years ago

Hi @superjai,

I would simply extract all pixel values in the polygon from the raster into a vector first, then apply the quantile function on that.
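A rough sketch of that approach, using a made-up vector in place of the extracted pixel values. The raster call mentioned in the comment is an assumption about the setup and is not run here.

```r
# Hypothetical stand-in for the pixel values inside one polygon. With the
# raster package this vector might come from something like
#   raster::extract(r, ply)[[1]]
# (no `fun` argument), which is an assumption and is not tested here.
pixel_values <- c(15.2, 15.9, 16.4, NA, 17.1, 17.8, 18.6, 19.3, 20.1)

# Apply quantile() directly to the vector, with whatever probabilities we like.
band <- quantile(pixel_values, probs = c(0.05, 0.95), na.rm = TRUE, names = FALSE)

# Collect the lower bound, a central value, and the upper bound.
stats <- c(lower = band[1],
           mean  = mean(pixel_values, na.rm = TRUE),
           upper = band[2])
print(stats)
```

Working on a plain vector sidesteps the calling convention of the summary function altogether, since quantile() is called directly with its own arguments.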

superjai commented 4 years ago

Thanks @bbest. That did it.

bbest commented 4 years ago

Great, feel free to check in the working code and include "closes #52" in the git commit message to close this issue :)

superjai commented 4 years ago

This issue is now wrapped up. I went with the mean for the line being plotted (as opposed to the median), because the median is a measure that is less familiar to people. The mean and median numbers were very similar for both sea surface temperature and chlorophyll data, which is why I thought it didn't matter much.

bbest commented 4 years ago

Now looks like:

image