Closed jaseeverett closed 6 months ago
Actually, thinking further on this, I wonder if we can use group_by
in place of sf_col_layer_names
. What you are doing is exactly as the tidy verse thinks of group_by. You apply the calculation after grouping by that column.
This would be a huge improvement and more consistent language
These all sound sensible. I was about to implement, but for group_by
, I think we're not supposed to use function names as variable names right?
feature_names
might be best then?
Good point. I was thinking they implemented group_by
as an argument, but they haven't. They use .by
. (https://dplyr.tidyverse.org/reference/summarise.html). Is that better to be consistent - but it might be less clear.....
I've used group_by
for now since like you said, people who use sf and tidyverse will be familiar with this, but happy to revisit
I think you are right. It shouldn't be group_by
.
In summarise (and mutate etc) they use .by =
which I like. Happy to be guided by you
Just a few suggestions here for discussion:
sf_col_layer_names
is very long. I would suggest changing tofeature_names
. You use the term "feature" in the help and I think this is shorter and clearer. In the help if you putfeature_names [sf data only]
it will be apparent for users that this only applies to sfcov_fraction
is a bit of a misnomer. You aren't returning the coverage fraction, but rather what are you going to apply as a cutoff. I don't know if "cutoff" is universal, but our group would call this thecutoff
iecutoff = 0.7
to be marked as present in a PU).return_cov_frac
again this is a bit long, but it is exactly what use are doing. To flip the name upside down, perhaps it should beapply_cutoff
(TRUE/FALSE) (ie are you applying a cutoff to the data) or you aren't (in which case you will get the proportion returned).Just my 2c :-)