rmendels / rerddapXtracto

xtractomatic using rerddap

rextracto errors since 0.4.8: min_dimargs < min_coord. Since 1.0.0: req_time_index[i] #24

Closed SimonDedman closed 4 years ago

SimonDedman commented 4 years ago

Hi Roy, hope you're well and enjoying the apocalyptic ash storm,

At some point I updated to v0.4.8, and on re-running the same script on (I'm 99% sure) the same format data, my rxtracto() call borks with:

Error in if ((min_dimargs < min_coord) | (max_dimargs > max_coord)) { : missing value where TRUE/FALSE needed

I couldn't find that line or indeed those specific objects in the code. I updated to v1.0.0 and the same line now returns:

Error in req_time_index[i] <- which.min(abs(udtTime - temp_time)) : replacement has length zero

Line 164 in the code. I'm trying to debug this now but am currently stuck at L106:

could not find function checkInput

This is inevitably because I've broken the code open & am running it line by line and probably don't have the dependencies installed locally. I've googled that function and can't see it listed in any of rerddapXtracto's Imports/Depends but, full disclosure, I've not been looking long.

Thought I'd open this just in case any of this rings a bell, in case you remember making any changes in the last few commits which might have affected this.

Reprex in case it helps:

tmp.csv

library(rerddapXtracto)
# Assumes the attached tmp.csv (columns lon, lat, Date) is in the working directory.
tmp <- read.csv("tmp.csv")
dataset <- 'erdSW2018chla8day'
urlbase <- "http://coastwatch.pfeg.noaa.gov/erddap/"
dataInfo <- rerddap::info(dataset, url = urlbase)
rerddap::cache_delete_all(force = TRUE)
chl_pre <- rxtracto(dataInfo,
                    parameter = 'chlorophyll',
                    xcoord = tmp$lon,
                    ycoord = tmp$lat,
                    tcoord = tmp$Date,
                    progress_bar = TRUE)

Cheers in advance! Simon

rmendels commented 4 years ago

Hi Simon:

Please go to:

https://coastwatch.pfeg.noaa.gov/erddap/griddap/erdSW2018chla8day.html

and look at the last date for that dataset. Then look at the dates in your file after line 1404. The program purposely does not allow out-of-bounds input. This is because the user may unwittingly be using the wrong dataset for their purposes. The user must either consciously truncate their input or choose a more appropriate ERDDAP dataset for their data.
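A quick way to spot such rows before calling rxtracto() might look like the sketch below. The helper name and the coverage bounds are hypothetical placeholders, not part of rerddapXtracto; the real first and last dates should be read from the dataset's ERDDAP page.

```r
# Hypothetical helper (not part of rerddapXtracto): return the positions of
# requested dates that are missing or fall outside a dataset's time coverage.
dates_out_of_range <- function(dates, first_date, last_date) {
  dates <- as.Date(dates)
  which(is.na(dates) | dates < as.Date(first_date) | dates > as.Date(last_date))
}

# Placeholder coverage bounds (assumed values, NOT the real dataset limits).
first_date <- "2000-01-01"
last_date  <- "2010-12-31"

# Made-up request dates; the third one lies after the placeholder end date.
requested <- c("2003-01-04", "2010-12-31", "2015-06-01")
bad <- dates_out_of_range(requested, first_date, last_date)
bad
```

Rows listed in `bad` can then be dropped deliberately, or taken as a sign that a different dataset is needed.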

SimonDedman commented 4 years ago

Argh, sorry Roy, the problem remains. I accidentally created the tmp file from the second subset (post) but shared the code for the first subset (pre). The tmp file is now updated with the pre-subset data, here. Cheers.

rmendels commented 4 years ago

@SimonDedman

Sorry can't be much help here (see below). I don't know where you got the version you are using, or what you are doing, but you might try re-installing from CRAN. I was pleased that for almost 1000 points it did appear to be faster, which is the idea of the new version.

> chl_pre <- rxtracto(dataInfo,
+                     parameter = 'chlorophyll',
+                     xcoord = tmp$lon,
+                     ycoord = tmp$lat,
+                     tcoord = tmp$Date,
+                     progress_bar = TRUE)
  |======================================================| 100%
> 
> str(chl_pre)
List of 13
 $ mean chlorophyll  : num [1:995] 0.1667 0.177 0.1122 0.0891 0.0912 ...
 $ stdev chlorophyll : num [1:995] NA NA NA NA NA NA NA NA NA NA ...
 $ n                 : int [1:995] 1 1 1 1 1 1 1 1 1 1 ...
 $ satellite date    : chr [1:995] "2000-09-17T00:00:00Z" "2000-09-17T00:00:00Z" "2000-09-17T00:00:00Z" "2000-09-17T00:00:00Z" ...
 $ requested lon min : num [1:995] 9.73 10 10.14 10.24 10.23 ...
 $ requested lon max : num [1:995] 9.73 10 10.14 10.24 10.23 ...
 $ requested lat min : num [1:995] 41.6 41.3 41.1 40.7 40.2 ...
 $ requested lat max : num [1:995] 41.6 41.3 41.1 40.7 40.2 ...
 $ requested z min   : logi [1:995] NA NA NA NA NA NA ...
 $ requested z max   : logi [1:995] NA NA NA NA NA NA ...
 $ requested date    : chr [1:995] "2003-01-04" "2003-01-04" "2003-01-04" "2003-01-04" ...
 $ median chlorophyll: num [1:995] 0.1667 0.177 0.1122 0.0891 0.0912 ...
 $ mad chlorophyll   : num [1:995] 0 0 0 0 0 0 0 0 0 0 ...
 - attr(*, "row.names")= chr [1:995] "1" "2" "3" "4" ...
 - attr(*, "class")= chr [1:2] "list" "rxtractoTrack"

SimonDedman commented 4 years ago

Ok so my actual code calling the function was:

xcoord = df_i[datespre, "lon"],
ycoord = df_i[datespre, "lat"],
tcoord = df_i[datespre, "Date"],

The class of those objects was "data.table data.frame" instead of character vectors. Subsetting fail by me. This caused working_coords$tcoord1 to be a data.table data.frame with a "Date" column of dates, rather than a vector, so tcoord1 was NA.
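The difference can be sketched in base R, without data.table installed. Here `drop = FALSE` stands in for data.table's behaviour of returning a one-column table from `df_i[rows, "Date"]`, and the coordinate values and cut-off date are made up for illustration. The last lines show one plausible route from an NA time to the "replacement has length zero" error; that is an assumption about the mechanism, not something traced through the package source.

```r
# Made-up stand-in for df_i; the real one was a data.table.
df_i <- data.frame(
  lon  = c(9.73, 10.00, 10.14),
  lat  = c(41.6, 41.3, 41.1),
  Date = as.Date(c("2003-01-04", "2003-01-05", "2011-02-01"))
)
datespre <- df_i$Date < as.Date("2010-01-01")  # hypothetical "pre" subset

# drop = FALSE mimics data.table: the result is still a one-column table.
as_table <- df_i[datespre, "Date", drop = FALSE]

# Extracting the column first, then subsetting, gives a plain vector
# for both data.frame and data.table inputs.
as_vec <- df_i[["Date"]][datespre]

length(dim(as_table)) > 0  # TRUE: has dimensions, so not a vector
is.null(dim(as_vec))       # TRUE: a plain vector

# If the requested time ends up NA, every difference is NA and which.min()
# returns integer(0), which cannot be assigned into a single slot.
idx <- which.min(abs(c(10, 20, 30) - NA_real_))
length(idx)  # 0
```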

Any option to add the following, in your initial tests section, before "urlbase <- " ?


  if (!is.null(xcoord)) if (length(dim(xcoord)) > 0) stop("xcoord is not a vector")
  if (!is.null(ycoord)) if (length(dim(ycoord)) > 0) stop("ycoord is not a vector")
  if (!is.null(zcoord)) if (length(dim(zcoord)) > 0) stop("zcoord is not a vector")
  if (!is.null(tcoord)) if (length(dim(tcoord)) > 0) stop("tcoord is not a vector")

To prevent dumdums like me from making the same mistake in future! Thanks.
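The same four checks could be folded into one small helper. This is only a sketch: `check_coords_are_vectors` is a hypothetical name, not an existing rerddapXtracto function. Because `dim(NULL)` is `NULL`, a coordinate left as NULL passes without a separate `is.null()` test.

```r
# Hypothetical helper, not part of rerddapXtracto: stop if any named
# coordinate argument has dimensions (data.frame, data.table, matrix, ...).
check_coords_are_vectors <- function(...) {
  coords <- list(...)
  for (name in names(coords)) {
    if (length(dim(coords[[name]])) > 0) {
      stop(name, " is not a vector", call. = FALSE)
    }
  }
  invisible(TRUE)
}

# Plain vectors and NULL both pass.
ok <- check_coords_are_vectors(xcoord = c(9.73, 10), tcoord = NULL)

# A one-column data frame is rejected with a message naming the argument.
msg <- tryCatch(
  check_coords_are_vectors(xcoord = data.frame(lon = c(9.73, 10))),
  error = function(e) conditionMessage(e)
)
msg
```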

p.s. the script is now MUCH faster - massive hats off to you Roy!!

rmendels commented 4 years ago

Thanks for the suggestion. I have a lot of checks as it is. I will try to add something along those lines when I have reason to make a new release. My biggest concerns have been to make 'rxtracto()' faster and more robust, in the sense of returning what has been downloaded so far if there is a failure. I am glad you noticed the speed difference. I still think I can do it better; I just need some time away from the code.

rmendels commented 4 years ago

@SimonDedman Forgot to @ you. I am going to close this if that is okay.