ropensci / taxa

taxonomic classes for R
https://docs.ropensci.org/taxa
Other
48 stars 12 forks source link

Add `exp` option that acts like `value` but with NSE #49

Open zachary-foster opened 7 years ago

zachary-foster commented 7 years ago

obs currently does this kind of thing:

> # Return values from a dataset instead of indexes
> ex_taxmap$obs("info", value = "name")
$`1`
[1] tiger cat   mole  human
Levels: cat human mole potato tiger tomato

$`2`
[1] tomato potato
Levels: cat human mole potato tiger tomato

$`3`
[1] tiger cat  
Levels: cat human mole potato tiger tomato

$`4`
[1] mole
Levels: cat human mole potato tiger tomato

...

It might be useful to do something like this instead:

ex_taxmap$obs("info", value = name)

Then, it would allow expressions:

ex_taxmap$obs("info", value = sample_1 + sample_2)

zachary-foster commented 7 years ago

Thinking about this more, the character input (value = "name") should also be accepted, since this is useful for when you have to loop over multiple columns rather than refer to them by name or want to use column index. If its too hard to make it accept both types, then perhaps a new option called exp could be added and either value or exp can be used, but not both.

ex_taxmap$obs("info", exp = sample_1 + sample_2) # OK

ex_taxmap$obs("info", value = "sample_1") # OK

ex_taxmap$obs("info", value = 3) # OK

ex_taxmap$obs("info", value = "sample_1", exp = sample_1 + sample_2) # ERROR

What do you think about having two options for this @sckott? It will be much easier to program than one that accepts both and probably less error prone.

sckott commented 7 years ago

hmm, the way i've seen it done and I've done it as well in some pkgs, is to have two fxns, one like foo and another foo_ - where foo() can do e.g. foo(x = bar) and the other does foo_(x = "bar") - and foo_ is called from within foo - or maybe it's better to have parameters as you say above

zachary-foster commented 7 years ago

Oh yea, dplyr does that, although I think that was removed in the latest version? I did not think of doing that. Something to consider. I never got the hang of using the two function setup myself.

I hesitate to do that because, I already have the functions like filter_taxa accepts both NSE and SE in the same option, so for consistency, those would perhaps have to be changed to have a filter_taxa_ version. The reason I cannot apply the same logic here is because the value and proposed exp options expect different input. The filter for filter_taxa, is like exp: it just evaluates an expression, with preference to using variables from all_names() over those in global; whatever it evaluates to (ids, indexes, T/F) is what is used to filter. value is expecting a data set name from all_names(). If they was one option in two functions (NSE/SE), the NSE of value would be refer to the contents of a column where the SE value would refer to column names. I find this confusing, and I am not sure where would fit inputing an external variable using SE. Also, I am thinking about letting the subset option use NSE like filter_taxa; what if a user wanted NSE for subset, but not for value or visa versa?

Are you ok with the two-option setup? That seems the most flexible to me.

sckott commented 7 years ago

the two parameter option sounds good