Open matteodefelice opened 5 years ago
I agree @matteodefelice, I am having the same issue.
This is highlighted by the following example in the documentation:
tidync(filename) %>% activate("JULD") %>%
hyper_filter(N_PROF = N_PROF == 1) %>%
hyper_tibble()
#> Class: tidync_data (list of tidync data arrays)
#> Variables (1): 'SCIENTIFIC_CALIB_DATE'
#> Dimension (4): DATE_TIME,N_PARAM,N_CALIB,N_PROF (14, 7, 1, 1)
#> Source: C:/Users/wraseman/Documents/R/win-library/3.6/tidync/extdata/argo/MD5903593_001.nc
The line Variables (1): 'SCIENTIFIC_CALIB_DATE' should read Variables (1): 'JULD'.
This is the result that one gets when using activate(JULD) instead of activate("JULD"):
tidync(filename) %>% activate(JULD) %>%
hyper_filter(N_PROF = N_PROF == 1) %>%
hyper_tibble()
#> Class: tidync_data (list of tidync data arrays)
#> Variables (1): 'JULD'
#> Dimension (1): N_PROF (1)
#> Source: C:/Users/wraseman/Documents/R/win-library/3.6/tidync/extdata/argo/MD5903593_001.nc
When using the string input, it seems to default to grid [1].
So this works:
filename <- system.file("extdata/argo/MD5903593_001.nc", mustWork = TRUE, package = "tidync")
tidync(filename) %>% activate(JULD)
But this does not:
filename <- system.file("extdata/argo/MD5903593_001.nc", mustWork = TRUE, package = "tidync")
tidync(filename) %>% activate("JULD")
A question is whether the second form should also apply variable selection (as per select_var), because the variable could be multiple var names. I'm inclined not to: https://github.com/ropensci/tidync/pull/100
If multiple variable names are given only the first is used.
Actually, I think I might not do this - another option could be
tidync(filename) %>% hyper_tibble(select_var = varname)
I think that would be better, though it doesn't work atm. I need to have a think and another look. Appreciate thoughts!
@mdsumner, thanks for your response on this. I think it would be more intuitive to use an activate() function in both cases (rather than doing it through hyper_tibble(select_var = varname)). I think this could be achieved by creating a new function called activate_string(), which handles variables passed as strings. This is consistent with passing aesthetic properties in ggplot:
source: https://ggplot2.tidyverse.org/reference/aes_.html
aes(mpg, wt, col = cyl)
#> Aesthetic mapping:
#> * `x` -> `mpg`
#> * `y` -> `wt`
#> * `colour` -> `cyl`
aes_string("mpg", "wt", col = "cyl")
#> Aesthetic mapping:
#> * `colour` -> `cyl`
#> * `x` -> `mpg`
#> * `y` -> `wt`
This change would mean the following code would give identical results:
# Passing a variable
filename <- system.file("extdata/argo/MD5903593_001.nc", mustWork = TRUE, package = "tidync")
tidync(filename) %>% activate(JULD)
# Passing the variable as a string
filename <- system.file("extdata/argo/MD5903593_001.nc", mustWork = TRUE, package = "tidync")
tidync(filename) %>% activate_string("JULD")
But what about the question: should activating via a variable string also pass that string to select_var? It's no problem to code it, but activate and select are doing different things, and there are other implications I want to think about.
I was having a hard time understanding what you meant by that but I think I get it now.
If I understand it correctly, hyper_tibble() implicitly creates a tibble for whatever variable is passed to activate(). For instance, the first example below should give the same tibble as the second:
tidync(filename) %>% activate(SCIENTIFIC_CALIB_COEFFICIENT) %>% hyper_tibble()
tidync(filename) %>% activate(SCIENTIFIC_CALIB_COEFFICIENT) %>% hyper_tibble(select_var = SCIENTIFIC_CALIB_COEFFICIENT)
However, since there are multiple variables in the active grid (SCIENTIFIC_CALIB_EQUATION, SCIENTIFIC_CALIB_COEFFICIENT, and SCIENTIFIC_CALIB_COMMENT), the user could specify the active grid with any of these variables but choose the variable using select_var, like this:
# if the user wants to view data for "SCIENTIFIC_CALIB_EQUATION"
tidync(filename) %>% activate(SCIENTIFIC_CALIB_COEFFICIENT) %>% hyper_tibble(select_var = SCIENTIFIC_CALIB_EQUATION)
If that is the case, then yes, I think that activating the grid using a variable string should also pass that information to hyper_tibble(). In that case, the user could do the following and get the same results:
tidync(filename) %>% activate("SCIENTIFIC_CALIB_COEFFICIENT") %>% hyper_tibble()
tidync(filename) %>% activate("SCIENTIFIC_CALIB_COEFFICIENT") %>% hyper_tibble(select_var = "SCIENTIFIC_CALIB_COEFFICIENT")
It does seem a bit odd to accept either a string or a variable for the same function, which is why I thought about creating a separate function like activate_string(), but then I see how that would lead to difficulties down the road.
I'm not sure if I answered your question, I'm still new to tidync, so my apologies if not!
@matteodefelice, do you have any thoughts?
Activate is for grids not variables. I just thought it was handy to pick out a grid via a nominal variable name, but I've always been uncomfortable about it.
I think I should write a bit about this in more detail
I agree, I was a bit confused about using a variable to activate the grid when I learned about the activate() function. Let me know if you need any more thoughts!
I am currently using (and developing) R code to analyse an extensive set of power system simulations whose output has been saved in NetCDF. Each simulation has 27 different grids and, in total, about 80 different variables. I have developed some functions for post-processing those outputs and have tried to generalise as much as possible, which is why I needed to be able to pass the variable I want to "extract" as a function argument. Currently I can generalise within a single grid: if the field I need is stored in field_name, I activate the grid and then use dplyr::select with field_name, and rlang::sym when needed. I don't like this, because I need to hard-code the grid name in my functions, so my code stops working if the grid name changes. Maybe there is a better way to do this; however, the string-based solution suggested by @wraseman looks nice.
I plan to share my code as an R package to post-process the outputs of the open source power system model I am using (Dispa-SET, www.dispa-set.eu).
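To make the lookup described above concrete (going from a field name held in a string to the grid that contains it, so that no grid name is hard-coded), here is a minimal base R sketch. Everything in it is an assumption for illustration: grid_table is a hand-written data frame standing in for whatever grid/variable listing the file provides, and grid_for_variable() is a hypothetical helper, not a tidync function. The returned grid name could then be handed to activate(), since activating by grid name as a string is reported in this thread to work.

```r
# Hypothetical grid-to-variable table; the grid and variable names are
# illustrative and are not read from any real file.
grid_table <- data.frame(
  grid = c("D0,D1", "D0,D1", "D2"),
  variable = c("salt", "temperature", "time_bounds"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of tidync): return the name of the grid
# that holds the variable named by the string `varname`.
grid_for_variable <- function(grid_table, varname) {
  stopifnot(is.character(varname), length(varname) == 1)
  hit <- grid_table$grid[grid_table$variable == varname]
  if (length(hit) == 0) {
    stop("no grid contains variable '", varname, "'")
  }
  hit[[1]]
}

field_name <- "temperature"
grid_name <- grid_for_variable(grid_table, field_name)
print(grid_name)
```

Because the grid name is derived from the variable name at run time, a function built this way would keep working if the grid names in the file change, as long as the variable names do not.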
Hi there, I know this is a very old issue but I'm encountering the same problem as @matteodefelice. Have you figured out a workaround in the meantime? Thanks a lot!
I think I'll do this
select_var)
I think that makes sense, because grids aren't identified in NetCDF, which is a bit of a pain, and these weird names are a problem ;)
actually, this already works
@aodenweller can you show an example of what you want?
Activation works fine when I'm using an unquoted variable name, but not when I'm using the variable name stored as a string. I'm assuming this is due to what_name <- deparse(substitute(what)) in activate.R.
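For anyone wondering why that line would matter, here is a small base R sketch of what deparse(substitute()) returns for the three ways of passing a name. capture_name() is a hypothetical stand-in written for this comment, not tidync's actual code; it only reproduces the name-capture step quoted above.

```r
# Hypothetical stand-in for the name capture quoted above; not tidync's code.
capture_name <- function(what) {
  # substitute() returns the unevaluated argument expression,
  # deparse() turns that expression into a string.
  deparse(substitute(what))
}

varname <- "JULD"

bare <- capture_name(JULD)         # a bare symbol is deparsed to its name
literal <- capture_name("JULD")    # a string literal keeps its quotes
indirect <- capture_name(varname)  # gives the argument's name, not its value

print(c(bare, literal, indirect))
```

Only the bare symbol yields the plain name JULD. The string literal comes back with embedded quotes, and the indirect call yields "varname", so any matching done on the captured text alone cannot see the string stored in the variable.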
I think this works with select_var, if that helps as a workaround, but there are inconsistencies I'll try to fix 🙏
It's possible this will be much simpler with rlang now, but I'm still a bit uncomfortable conflating activate with var select, so maybe a new function would be better.
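As a rough illustration of how one function could accept a bare name, a string literal, or a variable holding a string, here is a sketch in base R only (rlang has its own tools for this, which are not used here). resolve_name() and known are hypothetical names made up for the example, and this is not how tidync resolves its arguments.

```r
# Hypothetical resolver, not part of tidync: accept a bare name, a string
# literal, or a variable that holds a string.
resolve_name <- function(what, known_names) {
  expr <- substitute(what)
  # A bare symbol that matches a known name is taken literally.
  if (is.symbol(expr) && as.character(expr) %in% known_names) {
    return(as.character(expr))
  }
  # Otherwise evaluate the argument in the caller's environment.
  value <- eval(expr, parent.frame())
  if (!is.character(value) || length(value) != 1 || !(value %in% known_names)) {
    stop("could not resolve a known name from the argument")
  }
  value
}

known <- c("JULD", "SCIENTIFIC_CALIB_DATE")
varname <- "JULD"

from_symbol <- resolve_name(JULD, known)
from_literal <- resolve_name("JULD", known)
from_variable <- resolve_name(varname, known)

print(c(from_symbol, from_literal, from_variable))
```

One design caveat: a bare symbol that matches a known name always wins, so if the caller also has an object with that same name holding a different string, the object is ignored. That ambiguity is one reason a separate string-only function might be the cleaner option.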
I'm finding that if I use nc <- nc %>% activate('salt'), I can then do nc %>% hyper_tibble('salt') but not nc %>% hyper_tibble('temperature').
I think this is as intended, but I hoped this would work:
nc <- nc %>% activate('salt', select_var = c('salt','temperature'))
Unfortunately, it doesn't.
To be able to access temperature as well as salt, I have to do the following, which is not obvious:
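# activate by variable name first, so that active(nc) returns the grid name
nc <- tidync::tidync(input_file) %>% tidync::activate('salt')
# then activate again by grid name, selecting both variables
nc <- nc %>% tidync::activate(tidync::active(nc), select_var = c('salt', 'temperature'))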
i.e., activate by variable name, then use active() to get the grid name, then activate again, this time by grid name.
I know this has been discussed here: https://github.com/ropensci/tidync/issues/26 but maybe things have changed. I want to activate a variable by specifying it as a string. I wasn't able to do that, but I could with the name of the grid. This is my grid:
I do this:
Well, this is what happens:
With the name of the grid it works; with the variable name (associated with that grid) I get the first grid instead: