ropensci / rerddap

R client for working with ERDDAP servers
https://docs.ropensci.org/rerddap

Change in ERDDAP breaks rerddap in certain situations #68

Closed rmendels closed 6 years ago

rmendels commented 6 years ago

Hi Scott:

There is a new version of ERDDAP (version 1.8.2), and there is a change in it that breaks rerddap in particular situations. This is for any dataset where latitude runs north to south, not south to north, such as:

http://upwell.pfeg.noaa.gov/erddap/griddap/nesdisVHNSQchlaMonthly.html

Previously, for those datasets the values of "actual_range" for latitude were reversed as well, but now they give the actual minimum and maximum values, regardless of the order.
The following will produce the problem directly in rerddap:

xpos <- c(-160, -150)
ypos <- c(-21, -16)
tpos <- c("2015-01-15",  "2015-02-15")
dataInfo <- rerddap::info('nesdisVHNSQchlaMonthly')
parameter <- 'chlor_a'
zpos <- c(0, 0)
extract <- griddap(dataInfo, time = tpos,  latitude = ypos,  longitude = xpos,  altitude = zpos,  fields = "chlor_a" )

What breaks is in the grid.R file, the function fix_dims() at these lines:

    val <- .info$alldata[[nm]][ .info$alldata[[nm]]$attribute_name == "actual_range", "value"]
    val2 <- as.numeric(strtrim(strsplit(val, ",")[[1]]))
    if (length(tmp) != 0) {
      if (which.min(val2) != which.min(tmp)) {
        dimargs[[i]] <- rev(dimargs[[i]])
      }
    }

as this will no longer flip the dimensions, so the request fails. What can tell you whether you need to flip is latitudeSpacing. The following somewhat kludgy code takes what is already in the return from rerddap::info() and checks whether the dimensions need to be flipped. Suppose I have done:

dataInfo <- rerddap::info("nesdisVHNSQchlaMonthly")

then:

  spacing_string <- unlist(strsplit(dataInfo$alldata$latitude$value[1], ","))
  spacing <- unlist(strsplit(spacing_string[3], "="))
  spacing <- as.numeric(spacing[2])
  if (spacing < 0) {
    latSouth <- FALSE
##  this is where you would flip
  }

I imagine there is a more elegant way, but this is just to give you an idea. If I then do the following in my code in rerddapXtracto:

    latVal <- dataInfo$alldata$latitude[dataInfo$alldata$latitude$attribute_name == "actual_range", "value"]
    latVal2 <- as.numeric(strtrim1(strsplit(latVal, ",")[[1]]))
    tempLat <- paste0(latVal2[2], ',', latVal2[1])
    dataInfo$alldata$latitude[dataInfo$alldata$latitude$attribute_name == "actual_range", "value"] <- tempLat

so that I make the change in dataInfo and pass that to rerddap::griddap(), everything works. Again, I do not claim the above is very elegant, but it shows you where in the code the problem occurs, and where it can be fixed. As I said, this is a change in what ERDDAP returns.

I have checked an older version of ERDDAP, and latitudeSpacing appears to be consistent across versions, so it is a more robust test.
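As a minimal sketch of that spacing-based test, the check can look up the spacing by key name instead of by position. The helper names `is_descending()` and `order_request()` are hypothetical and not part of rerddap, and the summary string format (including the `averageSpacing` key and the numbers in it) is an assumption about what the first row of `dataInfo$alldata$latitude$value` contains:

```r
# Hypothetical helpers, not part of rerddap.
# Assumed input: the dimension summary string from info(), e.g.
# "nValues=4788, evenlySpaced=true, averageSpacing=-0.0375" (illustrative values).
is_descending <- function(summary_string) {
  # Split "key=value" pairs and strip surrounding whitespace
  parts <- trimws(strsplit(summary_string, ",")[[1]])
  kv <- strsplit(parts, "=")
  keys <- vapply(kv, `[`, character(1), 1)
  vals <- vapply(kv, `[`, character(1), 2)
  # Find the spacing by name rather than assuming it is the third field
  spacing <- as.numeric(vals[keys == "averageSpacing"])
  length(spacing) == 1 && !is.na(spacing) && spacing < 0
}

# Put the requested bounds in the same order as the stored dimension
order_request <- function(requested, descending) {
  sort(requested, decreasing = descending)
}

desc <- is_descending("nValues=4788, evenlySpaced=true, averageSpacing=-0.0375")
ypos <- order_request(c(-21, -16), descending = desc)
```

Because this depends only on the sign of the spacing, it does not care whether "actual_range" is reported as (max, min) or (min, max).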

BTW - the reason for the change is it is consistent with CF 1.7. A user reported problems with rerddapXtracto, and I isolated it to this.

-Roy

sckott commented 6 years ago

thanks for the issue @rmendels

Will get this fixed asap

rmendels commented 6 years ago

Hi Scott:

Any progress on this? We have gotten several inquiries about this in the last week or so.

Thanks

sckott commented 6 years ago

Sorry, not yet. Very busy. I will add it to the to do list for tomorrow. Will try to get to it

sckott commented 6 years ago

@rmendels looking at this.

Looks like queries aren't working, e.g., http://upwell.pfeg.noaa.gov/erddap/griddap/nesdisVHNSQchlaMonthly.htmlTable?chlor_a[(2018-05-01T12:00:00Z):1:(2018-05-01T12:00:00Z)][(0.0):1:(0.0)][(89.75625):1:(-89.75626)][(-179.9812):1:(179.9813)]

rmendels commented 6 years ago

Hi Scott:

For better or worse, we don't stop people from making queries that are unlikely to complete because of the size of the request. In most cases ERDDAP is still processing the request; it is the connection that times out. That is a 3km dataset, so your request is quite large (it is a global request). It worked for me when I requested a netcdf file (I am at home). The netcdf file is almost 200MB, so I imagine that the htmlTable request was quite large (I find binary to ascii increases the return about 8-10 fold). The request more than likely timed out.
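A rough back-of-the-envelope check of that size, using the bounds from the query above. The 0.0375-degree grid spacing is an assumption inferred from the coordinate values, and 4-byte values for chlor_a are also assumed; neither is stated in this thread:

```r
# Assumptions (not stated in the thread): regular 0.0375-degree grid,
# one time step, one altitude level, 4-byte floats for chlor_a.
spacing <- 0.0375

# Number of grid points along each axis for the requested bounds
n_lat <- round((89.75625 - (-89.75626)) / spacing) + 1
n_lon <- round((179.9813 - (-179.9812)) / spacing) + 1

n_cells <- n_lat * n_lon
binary_mb <- n_cells * 4 / 1e6

# Apply the 8-10 fold binary-to-ASCII factor mentioned above
ascii_mb <- binary_mb * c(8, 10)
```

Under these assumptions the binary payload comes to roughly 184 MB, in line with the netcdf size reported above, and the text form to well over 1 GB.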

sckott commented 6 years ago

thanks, good to know.

sckott commented 6 years ago

should be fixed now