dodger487 / dplython

dplyr for python
MIT License
764 stars, 58 forks

Add filter joins #60

Closed: bleearmstrong closed this pull request 8 years ago

bleearmstrong commented 8 years ago

Add filtering joins. While I was working on implementing spread(), I realized the functions didn't work quite properly on grouped data. As of this pull request, grouping is removed when data is joined. In some cases, this makes sense; we can think of a mutating join as creating a new dataframe, so maybe grouping should be removed. For filtering joins, maybe not. For spread and gather, maybe not. How to deal with grouping should be discussed, not just for joins but for other functions. Should that be discussed here or is there somewhere else that is more appropriate?
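For readers unfamiliar with the term: a filtering join keeps or drops rows of `x` depending on whether they have a match in `y`, without adding any of `y`'s columns. The sketch below illustrates the idea in plain pandas. It is not the code from this pull request and not dplython's API; the function names `semi_join` and `anti_join` are borrowed from dplyr, and the `on` parameter is an assumption made for the example.

```python
import pandas as pd


def semi_join(x, y, on):
    """Keep rows of x that have at least one match in y (illustrative only)."""
    # De-duplicate y's keys so a row of x is never repeated by the merge.
    keys = y[on].drop_duplicates()
    return x.merge(keys, on=on, how="inner")


def anti_join(x, y, on):
    """Keep rows of x that have no match in y (illustrative only)."""
    keys = y[on].drop_duplicates()
    # indicator=True adds a "_merge" column recording where each row came from.
    merged = x.merge(keys, on=on, how="left", indicator=True)
    return merged.loc[merged["_merge"] == "left_only", list(x.columns)]


x = pd.DataFrame({"k": [1, 2, 3], "v": ["a", "b", "c"]})
y = pd.DataFrame({"k": [2, 3, 3, 4], "w": [10, 20, 30, 40]})

matched = semi_join(x, y, on=["k"])    # rows of x whose k appears in y
unmatched = anti_join(x, y, on=["k"])  # rows of x whose k does not appear in y
```

Because neither function adds columns from `y`, the result has the same shape of row as `x`, which is one reason it may be reasonable for a filtering join to keep `x`'s grouping.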

dodger487 commented 8 years ago

@bleearmstrong interesting thought on the grouping. The best place to discuss that would be in the issues. You should open an issue describing the problem ("How should joins and tidy verbs handle grouped data frames?"), add some background, and perhaps mention the dplyr behavior.

bleearmstrong commented 8 years ago

I've adjusted some code so that x >> join(y) will maintain x's grouping. Should I add that to this request or a future request? I'd prefer to add it to a future request, so I can tackle the several issues involved with grouping in a single pull request.
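As a rough illustration of what "maintain x's grouping" could mean, here is a minimal sketch. The `GroupedFrame` class, its `group_cols` field, and the `inner_join` helper are all hypothetical names invented for this example; they are not dplython's data structures or the code referred to above.

```python
from dataclasses import dataclass, field

import pandas as pd


@dataclass
class GroupedFrame:
    """Hypothetical stand-in for a data frame that remembers its grouping columns."""
    data: pd.DataFrame
    group_cols: list = field(default_factory=list)


def inner_join(x, y, on):
    """Join x and y, carrying over x's grouping and ignoring y's (assumed policy)."""
    joined = x.data.merge(y.data, on=on, how="inner")
    # Copy the list so later changes to the result do not affect x.
    return GroupedFrame(joined, list(x.group_cols))


x = GroupedFrame(pd.DataFrame({"k": [1, 2], "g": ["a", "b"]}), ["g"])
y = GroupedFrame(pd.DataFrame({"k": [2, 3], "w": [10, 20]}), ["w"])

out = inner_join(x, y, on="k")
# The result can still be grouped the way x was.
sizes = out.data.groupby(out.group_cols).size().to_dict()
```

This sketch sidesteps one case a real implementation would need to handle: if a grouping column of `x` also exists in `y` and is not a join key, the merge renames it with a suffix, so the stored grouping column names would no longer match.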

dodger487 commented 8 years ago

@bleearmstrong A few comments:

bleearmstrong commented 8 years ago

I'm probably going to cancel this pull request and move the mutating joins over to verbs before I do work on the filtering verbs.