tidyverse / dplyr

dplyr: A grammar of data manipulation
https://dplyr.tidyverse.org/
Other
4.74k stars 2.12k forks source link

Get rid of conflicts with the "stats" package #2195

Closed krlmlr closed 7 years ago

krlmlr commented 7 years ago

Currently only filter and lag. Suggestions:

Deprecation first via messages (once per session), later via warnings (once per session), much later by removing. I'd argue it's worth the effort.

lionel- commented 7 years ago

maybe keep_where()? And drag() for lag? Though it feels wrong to change the reference to lag as it is an established modelling term.

vlag() (for vector lag)

If I'm not mistaken the vector prefix will be vec_ in vctrs, with atm_ for atomic vectors and lst_ for generic vectors.

krlmlr commented 7 years ago

keep_where() is longish too, but how about keep() (with an optional alias to keep_where())?

vec_lag()?

lionel- commented 7 years ago

keep() is a function in purrr :(

krlmlr commented 7 years ago

Oh. But then keep_rows() looks more natural to me.

burkovae commented 7 years ago

having or being maybe ? if select specifies rows, then one could naturally say select stuff having id == x.

krlmlr commented 7 years ago

@hadley: Does any of these options appeal to you?

hadley commented 7 years ago

TBH, no. I think it's too late to change.

krlmlr commented 7 years ago

Both stats::filter() and stats::lag() operate on time series. We could do something nice if the user calls us with a time series object. ?

hadley commented 7 years ago

I like that idea!

bertrandmarc commented 6 years ago

Dear dplyr authors,

Could you please reconsider how you handle overwriting stats::lag? Loading dplyr makes it very difficult to use the stats::lag function in econometrics formula, as I would need to rewrite all my models and it would make them a lot more difficult to read.

Would it be possible for you to forward time series to stats::lag or use an S3 method?

Best regards, Bertrand

lionel- commented 6 years ago

@beberking It is too late to reconsider this decision but see https://github.com/r-lib/conflicted for some experiments to make it easier to deal with symbol conflicts.

About forwarding ts objects to stats::lag() this would probably cause more harm because we can't redirect simple objects anyway and there would be more chance that users are unaware of the masking.

hadley commented 6 years ago

Also you can do lag <- stats::lag so it’s not hard to work around.

lionel- commented 6 years ago

Though lag is a likely candidate for a variable name so doing it this way makes scripts brittle.

bertrandmarc commented 6 years ago

Thank you for your answer. I am sorry to insist, but I don't see why dplyr couldn't at least try to keep something based on S3 methods, which would play much nicer with conflicts to my mind.

For instance, what do you think about overriding only lag.default instead of lag, so the user would be able to define properly lag.ts as he/she wishes? It would still override the stats::lag behaviour by default, but it would be more flexible for the users, especially those interested in time series.

Best regards, Bertrand

PS for instance, a good candidate for lag.ts <- stats:::lag.default

hadley commented 6 years ago

We have tried in the past and it does not allow us the flexibility needed. It was a mistake but I believe that changing our approach now would have a larger negative effect than leaving it as is. Regardless, it is not up for discussion, and even if it was, comments on a closed issue is not the way to handle it.

lock[bot] commented 5 years ago

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/