Closed rmendels closed 2 years ago
Thanks @rmendels . A few questions.
Does info() give this already?
library(rerddap)
x <- info('erdATastnhday')
x$alldata$time
#> row_type variable_name attribute_name data_type value
#> 1 dimension time double nValues=31108, evenlySpaced=false, averageSpacing=4h 10m 52s
#> 2 attribute time _CoordinateAxisType String Time
#> 3 attribute time actual_range double 1.13609088E9, 1.60432788E9
#> 4 attribute time axis String T
#> 5 attribute time fraction_digits int 0
#> 6 attribute time ioos_category String Time
#> 7 attribute time long_name String Centered Time
#> 8 attribute time standard_name String time
#> 9 attribute time time_origin String 01-JAN-1970 00:00:00
#> 10 attribute time units String seconds since 1970-01-01T00:00:00Z
What I meant by "parameters" is what you are already calling "fields" in rerddap. So, for example, time, latitude, and longitude are both dimension sizes and arrays with values. The idea is to be able to request some combination of those without requesting a field. It may make more sense to make this a separate function, to keep the syntax for downloading fields cleaner.
Okay, now I understand that parameters=fields.
I'm still not clear on what you want returned to the user. Are you seeking the data.frame returned by griddap(), but without the parameters/fields? So separate rows for each time/lat/lon/etc?
Unfortunately, the current reader of netcdf files https://github.com/ropensci/rerddap/blob/master/R/ncdf_helpers.R parses the netcdf to a data.frame, and it doesn't support having zero parameters/fields.
In this case the "field name" would be, say, "latitude" or "time", and that type of request works just fine in ERDDAP. ERDDAP can return netcdf files that just have dimension names, sizes and values. I am in the midst of something; later on I will attach a file with the ERDDAP call that created it. But this is why I think a separate function would be better.
Okay, the following two URLs each produce a netcdf file. My guess is your reader can read them, perhaps with some modifications. I may even try that later on.
https://coastwatch.pfeg.noaa.gov/erddap/griddap/erdMBsstd1day.nc?time[(2020-11-01T12:00:00Z):1:(2020-11-01T12:00:00Z)]
https://coastwatch.pfeg.noaa.gov/erddap/griddap/erdMBsstd1day.nc?time[(2020-11-01T12:00:00Z):1:(2020-11-01T12:00:00Z)],latitude[(-45.0):1:(65.0)],longitude[(120.0):1:(320.0)]
Well, I just found out I can't attach the netcdf files here. They are small, so I will email them to you.
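For readers following along, here is a minimal sketch of how a dimension-only request like the second URL above could be assembled in base R. The helper name dim_only_url() and its arguments are hypothetical and are not part of rerddap; only the URL pattern comes from the comment above.

```r
# Hypothetical helper (not part of rerddap): build a griddap URL that asks
# only for coordinate dimensions, mirroring the URLs quoted in the thread.
dim_only_url <- function(base, dataset, dims, fmt = "nc") {
  # dims: named list; each element is c(start, stop), already formatted as strings
  parts <- vapply(names(dims), function(nm) {
    rng <- dims[[nm]]
    sprintf("%s[(%s):1:(%s)]", nm, rng[1], rng[2])
  }, character(1))
  paste0(base, "/griddap/", dataset, ".", fmt, "?", paste(parts, collapse = ","))
}

url <- dim_only_url(
  base = "https://coastwatch.pfeg.noaa.gov/erddap",
  dataset = "erdMBsstd1day",
  dims = list(
    time = c("2020-11-01T12:00:00Z", "2020-11-01T12:00:00Z"),
    latitude = c("-45.0", "65.0"),
    longitude = c("120.0", "320.0")
  )
)
print(url)
```

The ranges are passed as strings so the text of each bound ends up in the URL exactly as typed.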
Thanks. On a branch, I've changed to using the tidync package. Install it with remotes::install_github("ropensci/rerddap@tidync"). Then try, for example, the internal function (not exported) that reads a nc file (this is one from the links above):
library(rerddap)
x <- 'erdMBsstd1day_fbf7_057a_6582.nc'
rerddap:::tidync_read(x)
#> $data
#> # A tibble: 1 x 1
#> time
#> <dbl>
#> 1 1604232000
#>
#> $dims
#> # A tibble: 1 x 7
#> name length start count id unlim coord_dim
#> <chr> <dbl> <int> <int> <int> <lgl> <lgl>
#> 1 time 1 1 1 0 FALSE TRUE
#>
#> $vars
#> # A tibble: 1 x 6
#> id name type ndims natts dim_coord
#> <int> <chr> <chr> <int> <int> <lgl>
#> 1 0 time NC_DOUBLE 1 9 TRUE
And with the griddap() function, e.g.,
griddap('erdVHNchlamday',
time = c('2015-04-01','2015-04-10'),
latitude = c(18, 21),
longitude = c(-120, -119), fields="none"
)
#> <ERDDAP griddap> erdVHNchlamday
#> Path: [/Users/sckott/Library/Caches/R/rerddap/e179b1f3bcdf26e6da3ee5cd703c524e.nc]
#> Last updated: [2020-11-05 08:49:27]
#> File size: [0.01 mb]
#> Dimensions (dims/vars): [1 X 1]
#> Dim names: time
#> Variable names: time
#> data.frame (rows/columns): [1 X 1]
#> # A tibble: 1 x 1
#> time
#> <dbl>
#> 1 1429142400
is that what you're looking for?
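The time values in both outputs above are raw numbers. Going by the units shown earlier in the thread ("seconds since 1970-01-01T00:00:00Z"), a user could turn them into readable timestamps with base R. This is only a sketch of one way to do it, not something rerddap is confirmed to do on this branch.

```r
# Time values copied from the two outputs above (seconds since 1970-01-01 UTC).
secs <- c(1604232000, 1429142400)

# Convert to POSIXct in UTC, then format as ISO 8601 strings.
times <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
iso <- format(times, "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
print(iso)
```

The first value comes out as 2020-11-01T12:00:00Z, which matches the time requested in the erdMBsstd1day URL.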
Yes, that is what I am looking for. I will look at it some more today or tomorrow when I get a chance and am hopefully a little less stressed. Is tidync going to be used everywhere? I don't know the insides of tidync; will it still allow recovery of the netcdf file if the user wants that? My packages use that, and in the classes we have had a fair number of people ask how to keep the downloaded netcdf file, not just have it in R.
@sckott Have done some testing. So far so good. So far works with my packages. Thanks
Great, glad it works!
I don't know how you access the path to the netcdf file right now, but even with the change to tidync the path to the file is still available at attr(x, "path") if x is the output of a griddap() call. Does that work for you?
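As an illustration of the "keep the downloaded netcdf file" question, the sketch below copies the cached file out of the cache using the "path" attribute mentioned above. The object x here is a stand-in built by hand, and keep_nc() is a hypothetical helper, not a rerddap function.

```r
# Stand-in for a griddap() result: any object carrying a "path" attribute.
cached <- tempfile(fileext = ".nc")
writeBin(as.raw(1:10), cached)   # placeholder bytes, not a real netCDF file
x <- structure(list(), path = cached)

# Hypothetical helper: copy the cached file somewhere the user controls.
keep_nc <- function(x, dest) {
  src <- attr(x, "path")
  if (is.null(src) || !file.exists(src)) {
    stop("no cached file found for this object")
  }
  ok <- file.copy(src, dest, overwrite = TRUE)
  if (!ok) stop("copy failed")
  invisible(dest)
}

dest <- file.path(tempdir(), "my_download.nc")
keep_nc(x, dest)
print(file.exists(dest))
```

With a real griddap() result, the same call would copy the actual cached .nc file, assuming the attribute is present as described.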
I am doing something like that (info about the path is embedded in the return), don't remember exactly how I find the path, but it has all been working, so it changed nothing in terms of my packages, at least so far.
okay, sounds good
@sckott Whoops. I forgot to test plotdap. The new structure is significantly different from the old structure; the output below shows the difference. plotdap makes significant use of what is in the summary part of the structure. I would need to go through and see whether the same information is contained in the new structure, and if so figure out what changes are needed. But as it now stands, this will break the plotdap package. So it would be important, when and if this is submitted, that new versions of both packages are submitted at the same time, and that CRAN is warned that things will break until both packages are updated.
murSST_west <- griddap(
'jplMURSST41',
latitude = c(22, 51),
longitude = c(-140, -105),
time = c('last', 'last'),
fields = 'analysed_sst'
)
# old return
str(murSST_west)
List of 2
$ summary:List of 14
..$ filename : chr "/Users/rmendels/Library/Caches/R/rerddap/11c0739289d10232bb18e20dba019f9a.nc"
..$ writable : logi FALSE
..$ id : int 65536
..$ safemode : logi FALSE
..$ format : chr "NC_FORMAT_CLASSIC"
..$ is_GMT : logi FALSE
..$ groups :List of 1
.. ..$ :List of 7
.. .. ..$ id : int 65536
.. .. ..$ name : chr ""
.. .. ..$ ndims: int 3
.. .. ..$ nvars: int 4
.. .. ..$ natts: int 50
.. .. ..$ dimid: int [1:3(1d)] 0 1 2
.. .. ..$ fqgn : chr ""
.. .. ..- attr(*, "class")= chr "ncgroup4"
..$ fqgn2Rindex:List of 1
.. ..$ : int 1
..$ ndims : num 3
..$ natts : num 50
..$ dim :List of 3
.. ..$ time :List of 10
.. .. ..$ name : chr "time"
.. .. ..$ len : int 1
.. .. ..$ unlim : logi FALSE
.. .. ..$ group_index : int 1
.. .. ..$ group_id : int 65536
.. .. ..$ id : int 0
.. .. ..$ dimvarid :List of 5
.. .. .. ..$ id : int 0
.. .. .. ..$ group_index: int 1
.. .. .. ..$ group_id : int 65536
.. .. .. ..$ list_index : num -1
.. .. .. ..$ isdimvar : logi TRUE
.. .. .. ..- attr(*, "class")= chr "ncid4"
.. .. ..$ units : chr "seconds since 1970-01-01T00:00:00Z"
.. .. ..$ vals : num [1(1d)] 1.6e+09
.. .. ..$ create_dimvar: logi TRUE
.. .. ..- attr(*, "class")= chr "ncdim4"
.. ..$ latitude :List of 10
.. .. ..$ name : chr "latitude"
.. .. ..$ len : int 2901
.. .. ..$ unlim : logi FALSE
.. .. ..$ group_index : int 1
.. .. ..$ group_id : int 65536
.. .. ..$ id : int 1
.. .. ..$ dimvarid :List of 5
.. .. .. ..$ id : int 1
.. .. .. ..$ group_index: int 1
.. .. .. ..$ group_id : int 65536
.. .. .. ..$ list_index : num -1
.. .. .. ..$ isdimvar : logi TRUE
.. .. .. ..- attr(*, "class")= chr "ncid4"
.. .. ..$ units : chr "degrees_north"
.. .. ..$ vals : num [1:2901(1d)] 22 22 22 22 22 ...
.. .. ..$ create_dimvar: logi TRUE
.. .. ..- attr(*, "class")= chr "ncdim4"
.. ..$ longitude:List of 10
.. .. ..$ name : chr "longitude"
.. .. ..$ len : int 3501
.. .. ..$ unlim : logi FALSE
.. .. ..$ group_index : int 1
.. .. ..$ group_id : int 65536
.. .. ..$ id : int 2
.. .. ..$ dimvarid :List of 5
.. .. .. ..$ id : int 2
.. .. .. ..$ group_index: int 1
.. .. .. ..$ group_id : int 65536
.. .. .. ..$ list_index : num -1
.. .. .. ..$ isdimvar : logi TRUE
.. .. .. ..- attr(*, "class")= chr "ncid4"
.. .. ..$ units : chr "degrees_east"
.. .. ..$ vals : num [1:3501(1d)] -140 -140 -140 -140 -140 ...
.. .. ..$ create_dimvar: logi TRUE
.. .. ..- attr(*, "class")= chr "ncdim4"
..$ unlimdimid : num -1
..$ nvars : num 1
..$ var :List of 1
.. ..$ analysed_sst:List of 22
.. .. ..$ id :List of 5
.. .. .. ..$ id : num 3
.. .. .. ..$ group_index: num -1
.. .. .. ..$ group_id : int 65536
.. .. .. ..$ list_index : num 1
.. .. .. ..$ isdimvar : logi FALSE
.. .. .. ..- attr(*, "class")= chr "ncid4"
.. .. ..$ name : chr "analysed_sst"
.. .. ..$ ndims : int 3
.. .. ..$ natts : int 11
.. .. ..$ size : int [1:3] 3501 2901 1
.. .. ..$ dimids : int [1:3] 2 1 0
.. .. ..$ prec : chr "double"
.. .. ..$ units : chr "degree_C"
.. .. ..$ longname : chr "Analysed Sea Surface Temperature"
.. .. ..$ group_index : int 1
.. .. ..$ chunksizes : logi NA
.. .. ..$ storage : num 1
.. .. ..$ shuffle : logi FALSE
.. .. ..$ compression : logi NA
.. .. ..$ dims : list()
.. .. ..$ dim :List of 3
.. .. .. ..$ :List of 10
.. .. .. .. ..$ name : chr "longitude"
.. .. .. .. ..$ len : int 3501
.. .. .. .. ..$ unlim : logi FALSE
.. .. .. .. ..$ group_index : int 1
.. .. .. .. ..$ group_id : int 65536
.. .. .. .. ..$ id : int 2
.. .. .. .. ..$ dimvarid :List of 5
.. .. .. .. .. ..$ id : int 2
.. .. .. .. .. ..$ group_index: int 1
.. .. .. .. .. ..$ group_id : int 65536
.. .. .. .. .. ..$ list_index : num -1
.. .. .. .. .. ..$ isdimvar : logi TRUE
.. .. .. .. .. ..- attr(*, "class")= chr "ncid4"
.. .. .. .. ..$ units : chr "degrees_east"
.. .. .. .. ..$ vals : num [1:3501(1d)] -140 -140 -140 -140 -140 ...
.. .. .. .. ..$ create_dimvar: logi TRUE
.. .. .. .. ..- attr(*, "class")= chr "ncdim4"
.. .. .. ..$ :List of 10
.. .. .. .. ..$ name : chr "latitude"
.. .. .. .. ..$ len : int 2901
.. .. .. .. ..$ unlim : logi FALSE
.. .. .. .. ..$ group_index : int 1
.. .. .. .. ..$ group_id : int 65536
.. .. .. .. ..$ id : int 1
.. .. .. .. ..$ dimvarid :List of 5
.. .. .. .. .. ..$ id : int 1
.. .. .. .. .. ..$ group_index: int 1
.. .. .. .. .. ..$ group_id : int 65536
.. .. .. .. .. ..$ list_index : num -1
.. .. .. .. .. ..$ isdimvar : logi TRUE
.. .. .. .. .. ..- attr(*, "class")= chr "ncid4"
.. .. .. .. ..$ units : chr "degrees_north"
.. .. .. .. ..$ vals : num [1:2901(1d)] 22 22 22 22 22 ...
.. .. .. .. ..$ create_dimvar: logi TRUE
.. .. .. .. ..- attr(*, "class")= chr "ncdim4"
.. .. .. ..$ :List of 10
.. .. .. .. ..$ name : chr "time"
.. .. .. .. ..$ len : int 1
.. .. .. .. ..$ unlim : logi FALSE
.. .. .. .. ..$ group_index : int 1
.. .. .. .. ..$ group_id : int 65536
.. .. .. .. ..$ id : int 0
.. .. .. .. ..$ dimvarid :List of 5
.. .. .. .. .. ..$ id : int 0
.. .. .. .. .. ..$ group_index: int 1
.. .. .. .. .. ..$ group_id : int 65536
.. .. .. .. .. ..$ list_index : num -1
.. .. .. .. .. ..$ isdimvar : logi TRUE
.. .. .. .. .. ..- attr(*, "class")= chr "ncid4"
.. .. .. .. ..$ units : chr "seconds since 1970-01-01T00:00:00Z"
.. .. .. .. ..$ vals : num [1(1d)] 1.6e+09
.. .. .. .. ..$ create_dimvar: logi TRUE
.. .. .. .. ..- attr(*, "class")= chr "ncdim4"
.. .. ..$ varsize : int [1:3] 3501 2901 1
.. .. ..$ unlim : logi FALSE
.. .. ..$ make_missing_value: logi TRUE
.. .. ..$ missval : num -7.77
.. .. ..$ hasAddOffset : logi FALSE
.. .. ..$ hasScaleFact : logi FALSE
.. .. ..- attr(*, "class")= chr "ncvar4"
$ data :'data.frame': 10156401 obs. of 4 variables:
..$ time : chr [1:10156401] "2020-11-08T09:00:00Z" "2020-11-08T09:00:00Z" "2020-11-08T09:00:00Z" "2020-11-08T09:00:00Z" ...
..$ lat : num [1:10156401] 22 22 22 22 22 22 22 22 22 22 ...
..$ lon : num [1:10156401] -140 -140 -140 -140 -140 ...
..$ analysed_sst: num [1:10156401] 24.2 24.2 24.2 24.3 24.3 ...
- attr(*, "class")= chr [1:3] "griddap_nc" "nc" "list"
- attr(*, "datasetid")= chr "jplMURSST41"
- attr(*, "path")= chr "/Users/rmendels/Library/Caches/R/rerddap/11c0739289d10232bb18e20dba019f9a.nc"
- attr(*, "url")= chr "https://upwell.pfeg.noaa.gov/erddap/griddap/jplMURSST41.nc?analysed_sst[(last):1:(last)][(22):1:(51)][(-140):1:(-105)]"
# new return
str(murSST_west)
tibble [0 × 3] (S3: griddap_nc/tbl_df/tbl/data.frame)
$ data: tibble [6,140,613 × 4] (S3: tbl_df/tbl/data.frame)
..$ analysed_sst: num [1:6140613] 24.2 24.2 24.2 24.3 24.3 ...
..$ longitude : num [1:6140613] -140 -140 -140 -140 -140 ...
..$ latitude : num [1:6140613] 22 22 22 22 22 22 22 22 22 22 ...
..$ time : num [1:6140613] 1.6e+09 1.6e+09 1.6e+09 1.6e+09 1.6e+09 ...
$ dims: tibble [3 × 7] (S3: tbl_df/tbl/data.frame)
..$ name : chr [1:3] "longitude" "latitude" "time"
..$ length : num [1:3] 3501 2901 1
..$ start : Named int [1:3] 1 1 1
.. ..- attr(*, "names")= chr [1:3] "longitude" "latitude" "time"
..$ count : Named int [1:3] 3501 2901 1
.. ..- attr(*, "names")= chr [1:3] "longitude" "latitude" "time"
..$ id : int [1:3] 2 1 0
..$ unlim : logi [1:3] FALSE FALSE FALSE
..$ coord_dim: logi [1:3] TRUE TRUE TRUE
$ vars: tibble [1 × 6] (S3: tbl_df/tbl/data.frame)
..$ id : int 3
..$ name : chr "analysed_sst"
..$ type : chr "NC_DOUBLE"
..$ ndims : int 3
..$ natts : int 11
..$ dim_coord: logi FALSE
- attr(*, "datasetid")= chr "jplMURSST41"
- attr(*, "path")= chr "/Users/rmendels/Library/Caches/R/rerddap/11c0739289d10232bb18e20dba019f9a.nc"
- attr(*, "url")= chr "https://upwell.pfeg.noaa.gov/erddap/griddap/jplMURSST41.nc?analysed_sst[(last):1:(last)][(22):1:(51)][(-140):1:(-105)]"
@sckott If nothing else, the class attributes are missing. plotdap tests that the returned object is of class "griddap_nc" to know that it is a proper file to plot. From a quick scan, it looks like the main use was to pull the actual coordinate info from the summary, and the rerddap summary is, I believe, what ncdf4 returns as a summary. The info is there; I would just need to go through and make certain that I have found all the places.
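To make the class test concrete, here is a sketch of the kind of guard a downstream package could use. The objects are mocks, the class vector is copied from the old str() output above, and is_griddap_nc() is a hypothetical name; this is not plotdap's actual code.

```r
# Mock objects; the class vector is copied from the old str() output in this thread.
old_style <- structure(
  list(summary = list(), data = data.frame()),
  class = c("griddap_nc", "nc", "list")
)
no_class <- list(data = data.frame())

# Hypothetical guard, similar in spirit to a "is this plottable?" check.
is_griddap_nc <- function(x) inherits(x, "griddap_nc")

print(is_griddap_nc(old_style))
print(is_griddap_nc(no_class))
```

Using inherits() rather than comparing class(x) directly keeps the check working if extra classes are added before or after "griddap_nc".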
In rerddapXtracto I hadn't tested the plotting functions either. The new versions will also break those, because I make the structure that rerddapXtracto::rxtracto_3D() returns look like it came from rerddap.
So after further testing, the new return structure breaks the present versions of both plotdap and rerddapXtracto. I will see if I can get new versions that work, but it may take a while; I have some other things going on and am also pretty unfocused right now.
@sckott In trying to get plotdap to work, I have been in debug mode looking at the values being passed, and noticed, which I should have noticed before, that the melted dimensions differ in the two cases, as they do in the two responses I posted above. The melted dimension of 10156401 in the present code is correct. The melted dimension of 6140613 with tidync is incorrect. For the wind example in the plotdap vignette I get similar differences: by looking at the dimensions in the file itself, I can say the present code has the correct size for the melted data and the tidync-based code does not (the melted dimension should be the product of the sizes of the coordinate dimensions). While there were other problems in the present version of plotdap that I believe I have fixed for this new structure, the fact that the overall data length doesn't match the product of the dimensions blows everything up in various calls to raster.
HTH
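The arithmetic behind that check can be reproduced with the dimension lengths printed in the thread. The sketch below uses only numbers from the str() outputs (3501 x 2901 x 1 for the MUR SST request, 121 x 81 x 1 x 3 for the wind file); the variable names are just for illustration.

```r
# Dimension lengths copied from the str() outputs in this thread.
mur_dims  <- c(longitude = 3501, latitude = 2901, time = 1)
wind_dims <- c(longitude = 121, latitude = 81, altitude = 1, time = 3)

# The melted (long) table should have one row per grid cell.
expected_mur  <- prod(mur_dims)
expected_wind <- prod(wind_dims)

# Rows reported by the tidync-based branch for the MUR SST request.
tidync_rows  <- 6140613
missing_rows <- expected_mur - tidync_rows

print(c(expected_mur = expected_mur,
        expected_wind = expected_wind,
        missing_rows = missing_rows))
```

The expected MUR SST count agrees with the 10156401 rows from the present code, and the wind count agrees with the 29403 rows seen with na.rm = FALSE.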
@sckott I just looked at tidync and noticed that all the hyper_xx() functions have na.rm = TRUE as the default. I suspect that is what is causing the difference, and it ruins the raster aspect of the data. To be consistent with the structures rerddap returns at present, and for the melted length to equal the product of the dimensions, try setting na.rm = FALSE in whatever you are doing and see if that produces a consistent result.
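A toy example in base R shows why dropping missing cells matters for anything that treats the long table as a raster. It does not call tidync; the grid and values are made up purely for illustration.

```r
# A made-up 3 x 2 grid in long form, with longitude varying fastest.
grid <- expand.grid(longitude = c(210, 210.25, 210.5),
                    latitude  = c(30, 30.25))
grid$sst <- c(24.2, NA, 24.3, NA, 24.1, 24.0)

# Keeping NA rows: one row per cell, so the values reshape cleanly.
full_rows <- nrow(grid)
as_matrix <- matrix(grid$sst, nrow = 3, ncol = 2)

# Dropping NA rows (the effect of an na.rm = TRUE style default):
# the row count no longer equals the product of the dimension lengths.
dropped      <- grid[!is.na(grid$sst), ]
dropped_rows <- nrow(dropped)

print(c(full_rows = full_rows, dropped_rows = dropped_rows))
print(dim(as_matrix))
```

With the NA rows kept there are 6 rows, matching 3 x 2. Once they are dropped only 4 remain, and the values can no longer be laid back onto the grid by position alone.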
@sckott - also, I can't follow the example you give in the other issue, where you are using purrr and some other things, or what that accomplishes. I do know that for the wind example in the plotdap vignette, if I copy over the netcdf file and do:
library(tidync)
wind_file <- tidync('~rmendels/Desktop/wind.nc')
wind_tibble <- hyper_tibble(wind_file, na.rm = FALSE)
str(wind_tibble)
tibble [29,403 × 5] (S3: tbl_df/tbl/data.frame)
$ y_wind : num [1:29403] 1.087 0.892 0.686 0.666 0.976 ...
$ longitude: num [1:29403] 210 210 210 211 211 ...
$ latitude : num [1:29403] 30 30 30 30 30 30 30 30 30 30 ...
$ altitude : num [1:29403] 10 10 10 10 10 10 10 10 10 10 ...
$ time : num [1:29403] 1.46e+09 1.46e+09 1.46e+09 1.46e+09 1.46e+09 ...
that is correct. I think returning the structure like you had above is a good idea, so I tried your code with the one change:
temp <- purrr::map(wind_file$grid$grid, ~activate(wind_file, .x) %>% hyper_tibble(na.rm = FALSE))
str(temp)
List of 5
$ : tibble [29,403 × 5] (S3: tbl_df/tbl/data.frame)
..$ y_wind : num [1:29403] 1.087 0.892 0.686 0.666 0.976 ...
..$ longitude: num [1:29403] 210 210 210 211 211 ...
..$ latitude : num [1:29403] 30 30 30 30 30 30 30 30 30 30 ...
..$ altitude : num [1:29403] 10 10 10 10 10 10 10 10 10 10 ...
..$ time : num [1:29403] 1.46e+09 1.46e+09 1.46e+09 1.46e+09 1.46e+09 ...
$ : tibble [3 × 1] (S3: tbl_df/tbl/data.frame)
..$ time: num [1:3] 1.46e+09 1.46e+09 1.47e+09
$ : tibble [1 × 1] (S3: tbl_df/tbl/data.frame)
..$ altitude: num 10
$ : tibble [81 × 1] (S3: tbl_df/tbl/data.frame)
..$ latitude: num [1:81] 30 30.2 30.5 30.8 31 ...
$ : tibble [121 × 1] (S3: tbl_df/tbl/data.frame)
..$ longitude: num [1:121] 210 210 210 211 211 ...
My one problem with this is it usually isn't preferable to use pipes in a package, but if it works and makes your life easier .....
HTH
Thanks much @rmendels and sorry for the delay. That use of purrr was not my idea, was copied from code given by someone else, I also don't like use of pipes in packages, so no worries there.
I'll try to get back to rerddap soon
Allow the user to extract the values of one or more of the coordinate dimensions without having to get the parameter data. This for example would allow a user to extract the times and check what is there.