EricPrideaux opened this issue 5 years ago (status: Open)
This is a tricky one. I'll have to dive into it a little bit to see what's going on. The `~` usage on the symbolic is one of the more complicated parts of the code, and it's been a while since I wrote that.
Ok so this is definitely a bug, but I'm gonna need to think about how I'll fix it. Essentially the problem is that the inversion is not propagating through properly in the chain of operations, and unfortunately it's not a trivial fix as far as I can tell right now. I'll let you know when I come up with a solution.
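To make "not propagating" concrete, here is a toy sketch of a deferred-expression design in plain Python. It is not dfply's actual implementation: the `Deferred` class and the `resolve` and `col_isnull` helpers are hypothetical names made up for illustration. The only point is that `~` has to return another deferred step, so that the inversion is applied when the whole chain is finally evaluated against the data.

```python
class Deferred:
    """Toy deferred expression (hypothetical, not dfply's class)."""

    def __init__(self, fn):
        # fn takes the data and returns a list of booleans
        self.fn = fn

    def evaluate(self, data):
        return self.fn(data)

    def __or__(self, other):
        # Record the OR as a new deferred step instead of computing it now
        return Deferred(
            lambda data: [
                x or y
                for x, y in zip(self.evaluate(data), resolve(other, data))
            ]
        )

    def __invert__(self):
        # Record the inversion as a new deferred step as well
        return Deferred(lambda data: [not x for x in self.evaluate(data)])


def resolve(obj, data):
    """Evaluate obj if it is deferred, otherwise return it unchanged."""
    return obj.evaluate(data) if isinstance(obj, Deferred) else obj


def col_isnull(name):
    """Deferred null check on one column (None stands in for NaN here)."""
    return Deferred(lambda data: [value is None for value in data[name]])


# Made-up data, purely illustrative
data = {"a": [None, 1, None, 2], "b": [None, None, 3, 4]}

expr = col_isnull("a") | ~col_isnull("b")
result = expr.evaluate(data)
print(result)  # [True, False, True, True]
```

If `__invert__` were evaluated eagerly, or dropped somewhere along the chain, the final `evaluate` call would never see it, which is the kind of failure described above.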
Hi Kieferk, Many thanks for your update. I look forward to your solution and will keep an eye out!
Just wanted to chime in that I have also come across this bug, same scenario when using `mask`, except my case was e.g. `mask(X.bool_col1 & (~X.bool_col2))`.
Also wanted to add that in the case of `&`, you can use `mask(condA, ~condB)`, and alternatively, the `-` sign for inversion also works, e.g. `mask(condA & -condB)`.
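For reference, the result these workarounds should produce can be checked in plain pandas, without dfply. This is a minimal sketch with made-up data (the values in `bool_col1` and `bool_col2` are illustrative, not taken from anyone's real frame), and it assumes that `mask` combines its separate conditions with a logical AND:

```python
import pandas as pd

# Illustrative data only
df = pd.DataFrame(
    {
        "bool_col1": [True, True, False, False],
        "bool_col2": [True, False, True, False],
    }
)

cond_a = df["bool_col1"]
cond_b = df["bool_col2"]

# One combined condition: condA AND NOT condB
expected = df[cond_a & ~cond_b]

# Two conditions applied one after the other, which is what passing
# them as separate arguments amounts to under the AND assumption
stepwise = df[cond_a]
stepwise = stepwise[~stepwise["bool_col2"]]

print(expected)
print(expected.equals(stepwise))
```

Only the row where `bool_col1` is True and `bool_col2` is False should survive, so this gives a baseline to compare the dfply output against.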
Sorry I've been inactive for a while, since work has been very busy. I am going to dive back in and try to tackle this over the weekend.
I am hoping I can resolve this "elegantly", but from what I can see it may require some substantial code rewriting. I'll keep you posted.
Interestingly, passing the invert operator to `make_symbolic` results in correct behavior (fwiw):
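from operator import inv  # inv(x) == ~x
from dfply import X, make_symbolic, transmute

# Plain pandas: the expected result
df['a'].isnull() | (~df['b'].isnull())
# m
# 0 True
# 1 True
# 2 True
# 3 True
# 4 False

# dfply with a bare inv (same as ~): the result differs from pandas
df >> transmute(m = X.a.isnull() | inv(X.b.isnull()))
# m
# 0 True
# 1 False
# 2 False
# 3 False
# 4 True

# dfply with inv wrapped in make_symbolic: matches the pandas result
df >> transmute(m = X.a.isnull() | make_symbolic(inv)(X.b.isnull()))
# m
# 0 True
# 1 True
# 2 True
# 3 True
# 4 False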
Hi kieferk,
My friends and I were very excited and thankful to come across this dplyr-style package.
We use `filter_by` a lot for filtering Chinese data by boolean values.
We look forward to your solution for this Boolean bug.
Hi kieferk,
I am an R user learning how to use `dfply`. I may have spotted an issue: it appears that Boolean `~` isn't evaluated after Boolean `|` if applied in the syntax below.
My code:
Here is the original data frame, df:
And here is the result of the piped mask, df2:
However, I expect this instead:
I don't understand why the `|` and `~` operators result in rows in which either column "a" is `NaN` or column "b" is not `NaN`?
By the way, I also tried `np.logical_or()`:
But this resulted in error: