Open GrabYourPitchforks opened 1 month ago
Anecdata, I just got bitten by this in my code
What about not A and B
? I think same issue applies to not
and and
as well.
@hez2010 as mentioned, we would love full subsumption enforcement, but we have no evidence of that combination being a general problem.
Problem summary
I have observed significant ecosystem use of the
is not <a> or <b>
pattern matching syntax recently. However, it appears that much of the usage is incorrect due to developers misunderstanding the order of operations during evaluation.In colloquial usage, the phrase "if x is not y or z" is understood by most participants to mean "if x is neither y nor z." (That is:
¬(y ∨ z)
, or(¬y) ∧ (¬z)
.) Instead, in C#, the expression is understood by the compiler to mean(¬y) ∨ z
.In fairness, this mirrors the more traditional Boolean syntax
if (!y || z)
, where the!
operator has a higher precedence than the||
operator. However, there's evidence that developers largely do not expect this same precedence when using the new pattern matching syntax, perhaps because it throws their brains into colloquial English mode rather than software engineering mode.Evidence
I searched for the
not <x> or <y>
pattern across all first-party code bases using CodeQL and found 96 files which use the pattern, then I manually inspected each file to see whether the author intended the colloquial "neither / nor" meaning or the engineering "not / or" meaning. Of the 96 results, 85 of them indicate incorrect usage. That's an 89% error rate for developers using this syntax.I can't share exact internal data, but you can see this using normal GitHub scans as well.
is not null or ...
(link)is not "literal" or ...
(link)is not 0 or ...
(link)The top offenders seem to be these patterns:
is not null or <anything>
is not "literal string" or "different literal string"
is not MyEnum.SomeValue or MyEnum.OtherValue
is not LiteralInt or DifferentLiteralInt
There are variations, but they're a long tail compared to the above.
Proposal
I propose that the C# compiler produce a warning for any occurrence of
not a or b
unless the first expression is parenthesized like(not a) or b
. This should make it clear to developers what the interpreted order of operations will be.I don't propose warning on
!a || b
unless there is evidence this pattern trips people up. I have never seen such evidence myself, nor am I actively searching for such evidence.Further exploration
At the request of the C# team in a group chat, I also searched for occurrences of the
is not <x> and <y>
pattern. That is - we wondered if developers also frequently misinterpreted "not / and" as "neither / nor" - same as they were doing with "not / or". An internal scan showed only one or two questionable hits. The team may want to consider making this a warning as well for symmetry, but I am unable to provide evidence that "not / and" is causing developers to run into the same kind of issue.As a general consideration, if the developer writes
x or y
(orx || y
), and if the compiler can prove thaty ⇒ x
, then the compiler may want to consider emitting a warning, because it means the y component of those expressions will never evaluate to true.For example, given
is not null or ""
, an empty string implicitly satisfies not null. This basically means that you'll always end up withtrue || true
orfalse || false
. If we charitably assume that developers aren't frequently in the business of writing throwaway code, perhaps this is an indication that they expected something else to happen? (Same withis not 3 or 4
: a value being equal to 4 implicitly means that it's not 3.)I don't know if solving the generalized case (where it's even possible to do so!) is a good use of resources, but at least it helped me internally frame desirable / undesirable behavior when I was analyzing the 96 internal hits.