ceylon / ceylon.language

Null elements in collections #131

Closed (ikasiuk closed 10 years ago)

ikasiuk commented 11 years ago

When designing Ceylon collections we implicitly made a decision that might be worth discussing explicitly. null values in collections are not completely unproblematic. In particular there are two problems:

There are two solutions for this. The first one is the one we took:

And this is the second one:

I'm not sure which of the two solutions is better. They both have advantages as well as downsides. So I'm not really suggesting to change anything, but it would be nice to hear some opinions.

chochos commented 11 years ago

Perhaps first and last should return Element|Finished instead of Element?; on collections that accept null, Element will already be an optional type.

quintesse commented 11 years ago

I was wondering the same as @chochos. Another option would be to go the Java route and have any attempt to access elements outside of the legal range throw an exception.

ikasiuk commented 11 years ago

Perhaps first and last should return Element|Finished instead of Element?

That's a possibility. But you can't use if(exists) to check for Finished, so using first, last and item is then much less elegant in many situations. Of course you could still use if(exists) to check for null elements, and you can't use if(exists) with nullElement either. But I would expect that collections that may be empty are much more common than collections that contain null values.

Also note that the special type parameter of ContainerWithFirstElement would then have to satisfy Finished instead of Nothing. Not sure if that's a problem.

Another option would be to go the Java route and have any attempt to access elements outside of the legal range throw an exception.

Oh come on, seriously? ;-) That sounds like something we'd rather want to avoid in Ceylon.

quintesse commented 11 years ago

But you can't use if(exists) to check for Finished

No, but Finished is probably not a good idea anyway because it is used to signal the end of a list. We could introduce a new type to represent that something does not exist, not because it's optional but for some other reason, and if (exists) would take that into account. But honestly that doesn't really seem like the way to go.

It's one of the things Ross warned about when he suggested that we should use the more FP-like way of handling optionality, with Some/None being actual container classes, like Option in Scala (or Maybe in Haskell). That's not how we will ever do it, but it means having to deal with ambiguous meanings for null.
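To make the trade-off concrete, here is a rough Python sketch (not Ceylon; `Some` and `first` are names invented for this illustration) of how a wrapper class keeps "no element" apart from "an element that is null":

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Some(Generic[T]):
    """Hypothetical wrapper: marks a value as present, even if it is None."""
    value: T


def first(items: list) -> Optional[Some]:
    """Return Some(element) for a non-empty list, or None if the list is empty."""
    return Some(items[0]) if items else None


print(first([]))         # None: the list is empty
print(first([None, 1]))  # Some(value=None): there is a first element, and it is None
```

The price is the extra unwrapping step (`.value`) on every access, which is the inconvenience a plain optional type avoids.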

Oh come on, seriously? ;-)

Well, semi-seriously. I must say I don't like it much that I have to check for null when I know my code will never ever generate an illegal index. But I'll survive ;)

quintesse commented 11 years ago

Oh come on, seriously? ;-)

NB: the first helper function I wrote for myself has been T notNull(T?) because I got tired of doing if (exists) checks on things I knew couldn't ever be null. And I'm pretty sure that once I start using arrays I'll add an unsafeItem() to do the same.

The reasoning behind this is that if I have to write an if (exists) on something I know is never going to be null I'll probably leave out the else or leave it like:

    if (exists foo) {
        ...
    } else {
        // Should never happen!
    }

which is worse. So the best thing to do is to throw some kind of Exception.... which is exactly what notNull() does.
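For readers who have not seen such a helper, a minimal Python analogue could look like the following. The name `not_null` and the choice of `ValueError` are assumptions made for this sketch, not the actual Ceylon helper:

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def not_null(value: Optional[T]) -> T:
    """Return value unchanged, or fail loudly if it is None."""
    if value is None:
        raise ValueError("unexpected null value")
    return value


name = not_null("ceylon")  # passes through unchanged
```

The point is that the "should never happen" branch turns into an immediate failure instead of a silently skipped block.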

ikasiuk commented 11 years ago

Strange, I've never felt the need for a notNull function so far. When I know for sure that a value cannot be null then I almost always have it as a non-optional type anyway.

chochos commented 11 years ago

I think @quintesse is referring to a different case: imagine you have a Sequence with 10 items. You created it, it's private, nobody's modified it, so you know it has 10 items. Still, when you get seq[5], you have to check for null.

quintesse commented 11 years ago

Exactly, or in my case, implementing a Java interface with a method with 3 parameters. Because it comes from Java, all of them are considered optional. I know the code will never be called with null, but I still have to do:

if (exists param1) {
    if (exists param2) {
        if (exists param3) {
            // Do something here and exit
        }
    }
}
// Error handling?
// You know people will just ignore it if they can

quintesse commented 11 years ago

To me the above code is not nice at all, it's too deeply nested and it just doesn't "flow" right. I'd rather write it like this:

if (!exists param1) {
    // Handle error and exit
}

if (!exists param2) {
    // Handle error and exit
}

if (!exists param3) {
    // Handle error and exit
}

// Do the actual important stuff here and exit

But of course we can't do that in Ceylon (yet?). Worse yet, we can't even combine them into one single if (yet), so that's why I came up with this admittedly still ugly solution:

value param1copy = notNull(param1);
value param2copy = notNull(param2);
value param3copy = notNull(param3);

// Do the actual important stuff here and exit

quintesse commented 11 years ago

PS: just for completeness' sake, when I asked Gavin if we'd ever allow:

Foo? foo = ...
if (!exists foo) {
    // return or throw
}
foo.bar(); 

he said it wouldn't be impossible, but he made the observation that it could be seen as very strange that within the same scope a variable could suddenly change type (and he's not wrong of course).

ikasiuk commented 11 years ago

I think @quintesse is referring to a different case: imagine you have a Sequence with 10 items. You created it, it's private, nobody's modified it, so you know it has 10 items. Still, when you get seq[5], you have to check for null.

True. Not sure if that's really a problem though. Maybe in Ceylon you'd rather use sub ranges to access such a sequence, and then you don't need null checks.

Exactly, or in my case, implementing a Java interface with a method with 3 parameters.

Ah, I see. That's indeed a little inconvenient. For return values from Java functions we are allowed to write

String s = javaFunction();

And an automatic null check is inserted so that you get a NPE if the string is null. Maybe we should introduce the same rule for implementing Java interfaces:

class C() implements JavaInterface {
    shared actual void f(String s) {
        // automatic null check for s
    }
}

It's the same situation after all: we don't know if the parameter was meant to be optional because that can't be expressed in Java.

To me the above code is not nice at all, it's too deeply nested and it just doesn't "flow" right.

Yes, I agree. I think eventually at least the following should be possible:

if (exists x && exists y && exists z) {
    // can access x, y and z here
}

See also ceylon/ceylon-spec#170.

RossTate commented 11 years ago

@ikasiuk, you've only hit upon the surface of the problems. Consider the following:

T? max<T>(T[] ts, Boolean lessThan(T,T)) {
  variable T? max := null;
  for (T t in ts) {
    if (exists max) {
      if (lessThan(max, t)) {
        max := t;
      }
    } else {
      max := t;
    }
  }
  return max;
}

This looks fine, but say you were some fancy programmer and decided to encode the integers with infinity by using null. Then you'd expect the following

Boolean lessThan(Integer? left, Integer? right) {
  if (exists left) {
    if (exists right) {
      return left < right;
    } else {
      return true;
    }
  } else {
    return false;
  }
}
Integer? result = max({0, null, 5}, lessThan);

to result in null. However, since the implementation of max uses ? and exists internally, it will actually result in 5.
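The same effect can be reproduced in any language where "no value yet" and "a null element" share one representation. A rough Python analogue (the names `max_by` and `less_than` are chosen for this sketch; `None` plays the role of null):

```python
def max_by(ts, less_than):
    """Mirror of max above: None doubles as 'no maximum seen yet'."""
    best = None
    for t in ts:
        if best is not None:
            if less_than(best, t):
                best = t
        else:
            best = t
    return best


def less_than(left, right):
    """Order on integers where None plays the role of infinity."""
    if left is not None:
        if right is not None:
            return left < right
        return True
    return False


print(max_by([0, None, 5], less_than))  # 5, although None ("infinity") was expected
```

After the second element the running maximum really is `None`, but the next iteration reads that as "nothing seen yet" and overwrites it with 5.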

Because of these problems, I've occasionally wondered if we should make T? not a true type. That is, you couldn't use it as a type argument for a generic method or a generic class/interface. Rather we call it a branching/control meta-type, since it's really just meant to indicate two cases that the programmer might want to branch on. We could have other meta-types as well, like maybe tuples. I'm just curious what y'all think.

ikasiuk commented 11 years ago

but say you were some fancy programmer and decided to encode the integers with infinity by using null

I would say that's clearly a programming error. null already represents "not an object". So if you use it to also represent something else then you shouldn't be surprised when that results in problems.

In a way that's also the case in collections, where we use null to represent "no element at this position" in addition to the usual "not an object". But of course these two concepts are much more similar to each other. So it's perhaps not unreasonable to represent them both by null although that does give rise to certain problems.

Maybe it would be possible to modify the definition of null so that such problems become more unlikely. On the other hand I kind of like the fact that the type of null is a normal type. And I'm relatively sure that even with a modified definition people would still find ways to abuse null somehow :-)

RossTate commented 11 years ago

Heh, I can't tell whether you approve or disapprove.

I would say that's clearly a programming error. null already represents "not an object". So if you use it to also represent something else then you shouldn't be surprised when that results in problems.

This is an argument against my specific example. However, the bigger problem is that you can have two pieces of code that appear completely bug-free when viewed separately but which behave erroneously when put together. That is a big problem for modularity.

chochos commented 11 years ago

I know you're way beyond this now, but... with Lists I would simply use exists list.lastIndex to know if it's empty or not, rather than exists list.first or exists list.last. There's also Iterable.empty which should be optimized in finite collections to avoid creating a new Iterator on each call. That removes the ambiguity of whether list.first or list.last return null because the List is empty or because the first or last elements are null; especially when you have a List (or Sequence or whatever) of an optional type.

ikasiuk commented 11 years ago

@chochos Of course you can safely check if an Iterable is empty using empty, but that's not what I meant. I was referring to something like this:

if (exists first = lst.first) {
    // process first element
    for (elem in lst.rest) {
        // process remaining elements
    }
}
else {
    // list is empty
}

This is a very subtle error that's not easy to spot if you don't know what to look for.
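To spell out what goes wrong, here is a small Python sketch of the same pattern. The helpers `first` and `describe` are invented for the illustration, and `None` stands in for null:

```python
def first(lst):
    """Like lst.first: the first element, or None if the list is empty."""
    return lst[0] if lst else None


def describe(lst):
    """Process the list using the 'first exists' idiom from the snippet above."""
    head = first(lst)
    if head is not None:
        return f"first={head!r}, rest={lst[1:]!r}"
    return "list is empty"


print(describe([1, 2]))        # first=1, rest=[2]
print(describe([]))            # list is empty
print(describe([None, 2, 3]))  # list is empty  <- wrong: the list has three elements
```

A list whose first element is null takes the "empty" branch, and the remaining elements are never processed.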

chochos commented 11 years ago

Ah well in that case you are trying to skip a step. You want to get the first element while determining if the list is empty. For a List<Integer> for example your code would work fine. But you shouldn't do that with a List<Integer?> - if the list can contain nulls then you should use empty or check lastIndex to determine if it's empty, even if you're not going to use the value:

if (exists lst.lastIndex) {
    if (exists first = lst.first) {
        //process first element
    }
    for (elem in lst.rest) {
        //process remaining
    }
} else { /* list is empty */ }

It will be a common issue with optional types, I know. Especially with collections coming from Java. But I believe as the language becomes more widely used, this will be one of the first things other devs will tell you about...

RossTate commented 11 years ago

Shoot, I'd forgotten about dealing with collections from Java code. That messes up my proposal =^(

There is still a way to prevent these kinds of ambiguities, though. In particular, they arise when exists is used on a T? where T itself could (stand for a type that) contain a null value. So, rather than restricting the types, one could restrict the uses of exists to unambiguous situations.

ikasiuk commented 11 years ago

@chochos Yes, of course I know how to write it correctly. The point is: with the second solution mentioned in the original issue description that code would actually be correct because the ambiguity issue wouldn't exist. On the other hand it would make Java interop more complicated for collections. That's the question I originally tried to ask with this issue: which solution is preferable?

@RossTate

However, the bigger problem is that you can have two pieces of code that appear completely bug-free when viewed separately but which behave erroneously when put together.

IMO lessThan is not bug-free even when viewed completely separately because it makes incorrect use of null.

RossTate commented 11 years ago

IMO lessThan is not bug-free even when viewed completely separately because it makes incorrect use of null.

Okay, but this argument doesn't help me figure out how to improve the language. So far this is sounding like "that's a bug because it's not using the language features like how I intended them to be used". The problem is, without some formalization of how the feature's intended to be used, I (and any programmer) have to guess what you are thinking to determine if some piece of code is buggy or not. From various mathematical perspectives that I can think of, some say Ceylon is buggy, others say just max is buggy, and some extreme ones say lessThan is also buggy, but I can't think of one that says lessThan is buggy but max is valid. So, to try to understand your intentions, what about this guess at an example:

We use the same max from earlier. Now, say for Java compatibility reasons, you need to implement a maxUnlessNull function that takes a list of values, none of which you expect to be null but you can't guarantee it, so you return null if any of the values are null and otherwise return the max of all the values. This sounds reasonable to me, and in fact has some common mathematical interpretations.

Being a good software engineer, you want to reuse code rather than write from scratch, so you realize that you can use max to do your job for you:

T? maxUnlessNull<T>(T?[] ts, Boolean lessThan(T,T)) {
  Boolean lessThanWithNull(T? left, T? right) {
    if (exists left) {
      if (exists right) {
        return lessThan(left, right);
      } else {
        return true;
      }
    } else {
      return false;
    }
  };
  return max(ts, lessThanWithNull);
}

Unfortunately, little do you know, because of how max happens to be implemented your code is buggy.

So, is this an acceptable example? If not, can you explain why not somewhat formally?

ikasiuk commented 11 years ago

There is still a way to prevent these kinds of ambiguities, though. In particular, they arise when exists is used on a T? where T itself could (stand for a type that) contain a null value. So, rather than restricting the types, one could restrict the uses of exists to unambiguous situations.

But how do you deal with such values then, how can you ever access them?

So far this is sounding like "that's a bug because it's not using the language features like how I intended them to be used".

Yes that's true, and I agree that it would be nicer if the language could better guide you towards a correct implementation. But I'm not sure if there is a solution that achieves that without sacrificing too much flexibility or adding too much complexity. Can you give an example for how your meta-type approach would work? You said that "you couldn't use it as a type argument for a generic method or a generic class/interface." What exactly does that mean (a piece of exemplifying code would be nice)?

So, is this an acceptable example?

Yes, I think that example is better. So where is the problem in the code? Looking at max again there's something curious: it seems to assume that after max:=t the value of max will not be null. Otherwise it wouldn't be valid to use if (exists) to check if max has already been assigned. But that assumption is not guaranteed, so the implementation is at least not completely robust (i.e. it doesn't work for some possible input). I think if the code inside the function is written assuming that values of type T will not be null then you should add the constraint given T satisfies Object to the method. And if you don't want to add that constraint then you have to implement the method in a way that correctly handles the null values.

Admittedly, the compiler does not keep you from getting this wrong. But is there a reasonable way to achieve that?

Here is the correct implementation of max, assuming you want to allow null values in ts:

T? max<T>(T[] ts, Boolean lessThan(T t1, T t2)) {
    if (nonempty ts) {
        variable T max := ts.first;
        for (T t in ts.rest) {
            if (lessThan(max, t)) { max := t; }
        }
        return max;
    }
    return null;
}

And for completeness, here's the same thing with the alternative solution:

T? max<T>(T[] ts, Boolean lessThan(T t1, T t2)) 
        given T satisfies Object {
  if (nonempty ts) {
      variable T max := ts.first;
      for (T t in ts.rest) {
          if (lessThan(max, t)) { max := t; }
      }
      return max;
  }
  return null;
}
Boolean lessThan(Integer|NullElement left, Integer|NullElement right) {
    if (is Integer left) {
        if (is Integer right) {
            return left < right;
        }
        return true;
    }
    return false;
}
Integer|NullElement? result1 = max({0, nullElement, 5}, lessThan);
Integer? result2 = max<Integer>({0, 5}, lessThan);

They are mostly equivalent, except for the explicit difference between null (if the list is empty) and nullElement (if the list contains a nullElement) in the result of max.
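A Python sketch of that difference, assuming a dedicated marker object in place of the proposed nullElement (the class `NullElement` and the function names are hypothetical, and `None` stands in for Ceylon's null):

```python
class NullElement:
    """Hypothetical stand-in for nullElement: a marker for 'null inside a collection'."""

    def __repr__(self):
        return "nullElement"


null_element = NullElement()


def max_by(ts, less_than):
    """Seed the maximum with the first element; None is returned only for an empty list."""
    if not ts:
        return None
    best = ts[0]
    for t in ts[1:]:
        if less_than(best, t):
            best = t
    return best


def less_than(left, right):
    """Integers compare normally; the marker counts as larger than every integer."""
    if isinstance(left, int):
        if isinstance(right, int):
            return left < right
        return True
    return False


print(max_by([0, null_element, 5], less_than))  # nullElement
print(max_by([], less_than))                    # None
```

With a separate marker the caller can tell "the list was empty" from "the largest element is the null marker", at the cost of a second null-like value in the API.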

RossTate commented 11 years ago

First, regarding my meta-type thing, I realized it doesn't work because of interoperability with Java. In particular, it disallowed types such as j.u.List<Integer?> due to the nested Integer?, but we need such types for working with Java. For the same reason, your use of NullElement above doesn't meet the usage requirements I had proposed for maxUnlessNull, but it sounds like that isn't a big deal regarding the discussion.

So, as you note, the bug in my code is in max. In particular, it is because we are using exists on a T? where T can itself contain null. As you say, the compiler gives no warning about this, so I imagine such bugs will arise a lot and confuse a lot of people. Unfortunately, requiring given T satisfies Object violates the intended application of maxUnlessNull, which is to handle data structures coming from Java code. Also, while your rewriting of max works for this example, it is not a general-purpose solution. Rather we need some Option or Maybe type that people can use in this situation.

So, I think two changes are in order: having the compiler issue some warning about these glitchy uses of exists, and adding an Option or Maybe type. What do y'all think?

ikasiuk commented 11 years ago

For the same reason, your use of NullElement above doesn't meet the usage requirements I had proposed for maxUnlessNull, but it sounds like that isn't a big deal regarding the discussion.

The idea was to use nullElement only in Ceylon collections, and java.util.List is not a Ceylon collection (you can't use it as input to your max method because it's not a T[]). Array would be interesting because it is mapped to a Java array. That's why I said that the Array implementation would have to map between null (on Java side) and nullElement (on Ceylon side).

having the compiler issue some warning about these glitchy uses of exists

Perhaps that could indeed be useful. I'm not sure if there would be too many false positives, but maybe it's worth a try.

and adding an Option or Maybe type

Sorry, but I definitely don't want that. I do see the theoretical use cases. But having two general-purpose concepts for optionality, two "levels" of optionality, would simply be bad for the language. These core concepts of the language have to remain very simple. I'd rather accept some little ambiguities in certain cases than sacrifice simplicity.

FroMage commented 11 years ago

I'm strongly in favour of allowing null elements in collections (and in Entry for that matter), and checking empty to determine size rather than checking if first or last is null.

ikasiuk commented 11 years ago

I'm strongly in favour of allowing null elements in collections (and in Entry for that matter), and checking empty to determine size rather than checking if first or last is null.

Yes, I think I also prefer to leave it as it is. This nullElement thing is just not useful enough to be worth the effort. Not sure about Entry though. It's mainly used for maps, and I think it's a good thing we don't have nulls in maps. It's a nuisance if map[key] returns null and you nevertheless additionally have to check if the map contains the key. And more importantly, it's mostly useless: if you want to map a key to "nothing" then you should just remove the mapping.

having the compiler issue some warning about these glitchy uses of exists

Perhaps that could indeed be useful. I'm not sure if there would be too many false positives, but maybe it's worth a try.

Actually, I have to contradict myself there: issuing a warning only makes sense if there is a possibility to write the code in a more correct way. But how should you access a value of type T? without using if(exists)? So I guess we can't warn about that, simply because there's no better choice.

FroMage commented 11 years ago

it's mostly useless: if you want to map a key to "nothing" then you should just remove the mapping

Wrong. If you'd done any REST APIs (to cite only one example) you'd know that passing a PATCH with {'foo': 1; 'bar': null;} will set both foo and bar properties (and only those) while {'foo': 1;} will only set foo and leave the bar property intact. There's a huge difference in semantics between a missing key and a key whose value is explicitly null. Only fools don't check for the presence of a key in a map. In fact, I'm pretty sure the Java Map even allows a null key.
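A small Python sketch of the distinction being described, using a dict where a key can be mapped to `None`. The function `apply_patch` and the sample data are made up for the illustration and do not follow any particular REST framework:

```python
def apply_patch(resource: dict, patch: dict) -> dict:
    """Set every key present in patch, including keys explicitly mapped to None."""
    updated = dict(resource)
    for key, value in patch.items():
        updated[key] = value
    return updated


resource = {"foo": 0, "bar": "old"}

print(apply_patch(resource, {"foo": 1, "bar": None}))  # {'foo': 1, 'bar': None}
print(apply_patch(resource, {"foo": 1}))               # {'foo': 1, 'bar': 'old'}

# A plain lookup cannot tell the two patches apart; only a key check can.
print({"foo": 1, "bar": None}.get("bar"), {"foo": 1}.get("bar"))  # None None
print("bar" in {"foo": 1, "bar": None}, "bar" in {"foo": 1})      # True False
```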

gavinking commented 11 years ago

And more importantly, it's mostly useless: if you want to map a key to "nothing" then you should just remove the mapping.

I totally agree with this.

Wrong. If you'd done any REST APIs (to cite only one example) you'd know that passing a PATCH with {'foo': 1; 'bar': null;} will set both foo and bar properties (and only those) while {'foo': 1;} will only set foo and leave the bar property intact.

And the only possible way to represent a diff of a Map<String,String|Number> is using a Map<String,String|Number|Nothing>? I don't buy this at all. Why can't you use a Map<String,Patch<String|Number>> or even just a Map<String,String|Number|Default>? You're conflating the serialized form—json, I assume—with its reification in Ceylon, a language with a much less limited type system.
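As a sketch of that alternative in Python: an explicit marker object takes the place of a null value in the patch. The class `Unset` is hypothetical and loosely plays the role of Default above, and treating it as "remove the entry" is an assumption made only for this example:

```python
class Unset:
    """Hypothetical marker meaning 'clear this property' (stands in for Default)."""

    def __repr__(self):
        return "unset"


unset = Unset()


def apply_patch(resource, patch):
    """Apply a patch whose values are either real values or the unset marker."""
    updated = dict(resource)
    for key, value in patch.items():
        if value is unset:
            updated.pop(key, None)  # assumption: clearing means removing the entry
        else:
            updated[key] = value
    return updated


resource = {"foo": 0, "bar": "old"}

print(apply_patch(resource, {"foo": 1, "bar": unset}))  # {'foo': 1}
print(apply_patch(resource, {"foo": 1}))                # {'foo': 1, 'bar': 'old'}
```

Neither the patch nor the resulting map ever holds a null value, yet "leave bar alone" and "clear bar" remain distinguishable.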

In fact, I'm pretty sure the Java Map even allows a null key.

Depends upon the Map implementation, IIRC. Indeed, this is more or less a bug, from my way of looking at it.

FroMage commented 11 years ago

Regardless of what you buy or not, a key with a null value has different semantics to the absence of a key. Allowing null values is more expressive. We can restrict that, but I really don't see why we'd even want to go that way.

gavinking commented 11 years ago

Regardless of what you buy or not, a key with a null value has different semantics to the absence of a key.

Wrong. You just don't know that a priori. It is part of the definition of the Map API whether this is true or not. You're used to it being true in Java but it's not true in Ceylon.

We can restrict that, but I really don't see why we'd even want to go that way.

Actually that's been the well-defined semantics since the very earliest sketch of a Map API. It's expressed in the type constraints on Map and on Entry.

FroMage commented 11 years ago

I know, and I still find it as weird now as I did in the beginning. Remind me why we put that restriction?

gavinking commented 11 years ago

Remind me why we put that restriction?

So that == is well-defined for Entrys.

FroMage commented 11 years ago

Oh without that eq function that everyone wants, you mean? ;)

gavinking commented 11 years ago

Oh without that eq function that everyone wants, you mean? ;)

Right, without introducing functions with totally bogus semantics :-)

FroMage commented 11 years ago

It's a pity that you don't acknowledge the usefulness of that function, because I predict it's going to be the single most written function in Ceylon ;)

ikasiuk commented 11 years ago

There may be reasonable use cases for null values in maps, but probably nothing that can't be easily expressed by other means in Ceylon. I don't find it weird at all that nulls are not allowed, though that may be personal preference.

So that == is well-defined for Entrys.

Actually, if we allowed nulls in maps then that would induce a definition for equality for entries with null items. You can access the values of a map as a collection using the values attribute. We do already allow collections with null values, and because we have to implement equals for such collections we define that {null}=={null} is true. To remain consistent that would mean that (x->null)==(x->null) must also be true.

I'm still not suggesting to do that. But it's interesting to see the implications of nulls in collections.

It's a pity that you don't acknowledge the usefulness of that function, because I predict it's going to be the single most written function in Ceylon ;)

Hehe yeah, that is an interesting topic. The problem with the eq function is that it doesn't really make sense in any case: either null is equatable or it isn't. If it is then we'd have to move equals and hash to the top of the type hierarchy (because everything except null already supports ==), so eq wouldn't be needed. And if null isn't equatable then there is no valid definition of eq.

On the one hand I think it's elegant and consistent that you can't compare things to null. It may actually help to write more correct code. On the other hand it's hard to deny that the question does sometimes arise whether null is equal to something or not. And although strictly speaking that's not a correct solution, people tend to use programming languages in a pragmatic way and will likely come up with something that answers this question with a Boolean.

I would really like to know how we'll think about this in a few years. I don't know if eq will really be the single most written function. But if it turns out that a significant number of users do write and use such a function then I would say our language design was not ideal in that respect, simply because it failed to meet an everyday requirement.

By the way: Scala uses the solution where everything supports ==, i.e. equals is defined in the top of the type hierarchy. Does anyone know if there are typical problems with that? Hm, of course null works differently in Scala anyway so maybe that comparison doesn't make so much sense.

FroMage commented 11 years ago

Just a quick example of the kind of code that we write when there's no eq function that groks null values:


    shared actual default Boolean equals(Object that) {
        if (is List<Void> that) {
            if (that.size==size) {
                for (i in 0..size-1) {
                    value x = this[i];
                    value y = that[i];
                    // Here's a bogus eq impl
                    if (exists x) {
                        if (exists y) {
                            if (x!=y) {
                                return false;
                            }
                        }
                        else {
                            return false;
                        }
                    }
                    else {
                        return !exists y; // Wait, WAT?
                    }
                }
                else {
                    return true;
                }
            }
        }
        return false;
    }

I'm pretty sure this stops at the first null entry in the first List rather than go on if the same-index entry on the second List is also null. Yes that comes from ceylon.language.List, I didn't have to look far. I'm pretty sure we've more than one implementation of eq already in this module and in the SDK.

gavinking commented 11 years ago

Today I can write that function like this:

shared actual default Boolean equals(Object that) {
    if (is List<Void> that) {
        return size==that.size && 
            every(for (a->b in zip(indexed, that.indexed)) a==b);
    }
    else {
        return false;
    }
}

In future—once we get some more sophisticated ways of handling exists inside expressions, there will be even more compact/efficient ways to write it. The only reason we write it out like that in the language module is that it is supposed to be indicative of a very efficient implementation. (Of course, that code is never actually executed or tested anywhere.)

FroMage commented 11 years ago

Today I can write that function like this

No you can't since Entry doesn't accept nulls, or am I missing something?

gavinking commented 11 years ago

i.e. The point is that equality of Lists is not defined in terms of every index being "equal" according to some hypothetical-nonsense eq() function. It is defined to mean that the two Lists, viewed as sets of entries, have the same sets of entries (just like for Maps). The implementation obscures that definition because of efficiency concerns.

quintesse commented 11 years ago

BTW: why indexed? Or is that exactly to prevent the nulls Stef talks about?

gavinking commented 11 years ago

No you can't since Entry doesn't accept nulls, or am I missing something?

Check the definition of indexed. Making all this stuff actually meaningful depends on viewing a List as a set of entries. Just like for a Map, there is no concept of a "null" entry. So just like for a Map, a null value at an index means no entry at that index.

This is the only way I was able to rationalize the idea of allowing the == operation on Lists and Maps. But actually it's a totally reasonable and internally consistent view of the world.
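Roughly, in Python terms (a sketch of the idea only, not the ceylon.language implementation; `indexed` and `lists_equal` are names picked for the example, and elements are assumed to be hashable):

```python
def indexed(items):
    """Entries (index, element) for the non-None elements; indexes still count every position."""
    return {(i, e) for i, e in enumerate(items) if e is not None}


def lists_equal(a, b):
    """Two lists are equal if they have the same size and the same set of entries."""
    return len(a) == len(b) and indexed(a) == indexed(b)


print(lists_equal([1, None, 3], [1, None, 3]))  # True
print(lists_equal([1, None, 3], [1, 2, 3]))     # False
print(lists_equal([None], []))                  # False: the sizes differ
```

No comparison between two nulls is ever evaluated: a null at an index simply contributes no entry.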

FroMage commented 11 years ago

OK, right. That works. But note that it would keep working if we decided that Entry could contain null and that null == null ;)

chochos commented 11 years ago

what if Nothing refines Boolean equals(Object? other) { return !exists other; }? then you can do == as long as you have an optional type on the LHS (the RHS can have an optional or non-optional type)

ikasiuk commented 11 years ago

@gavinking The implementation of indexed in Iterable.ceylon is

shared default Iterable<Entry<Integer,Element&Object>> indexed {
    variable value i:=0;
    return elements { for (e in this) if (exists e) i++->e };
}

That only seems to increment i for non-null elements. Is that correct?

gavinking commented 11 years ago

@FroMage. This would also compile:

value workmates = { for (p in people) for (o in org) if (p.workAddress==o.address) p->o };

But of course it's almost certainly not doing what it is supposed to do. Because == is meaningless for null.
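Python happens to behave this way (`None == None` is `True`), so the problem can be shown directly. The people and organizations below are invented for the example:

```python
people = [("alice", "1 Main St"), ("bob", None)]  # bob has no work address
orgs = [("acme", "1 Main St"), ("ghost", None)]   # ghost has no known address

# Same shape as the comprehension above: pair up equal addresses.
workmates = [(p, o) for p, p_addr in people for o, o_addr in orgs if p_addr == o_addr]
print(workmates)  # [('alice', 'acme'), ('bob', 'ghost')]

# Skipping missing addresses gives the intended result.
checked = [
    (p, o)
    for p, p_addr in people
    for o, o_addr in orgs
    if p_addr is not None and p_addr == o_addr
]
print(checked)  # [('alice', 'acme')]
```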

gavinking commented 11 years ago

That only seems to increment i for non-null elements. Is that correct?

No it was broken but I already fixed it. See @e621030.

FroMage commented 11 years ago

Maybe it's just me, but of all the problems that null can cause, the fact that two nulls of different type would be == is not one I've ever come across. Not once. Now if we compare that to the number of times I've reached for eq… ;)

gavinking commented 11 years ago

the fact that two nulls of different type

In my example the two nulls are of the same type and the same semantics. They both represent a work address. The problem is nothing to do with typing. It is to do with the fact that it is simply not right to say that null==null is true. If a person is unemployed, and an organization has no known address or no physical address, that does absolutely not fucking mean that that person works at that organization!

I mean, essentially the only time in Java when it's not a bug to do "null==null" is when one of the nulls is a literal null. Comparing two "maybe null"s is essentially always a bug, except in some very special circumstances like writing collections libraries.

gavinking commented 11 years ago

I mean, this is the whole reason SQL has ternary logic. Because they understand this stuff slightly better than most programmer-type folks. (Ternary logic has its own problems, of course.)
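This can be checked with SQLite through Python's standard `sqlite3` module. The table and column names are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NULL = NULL evaluates to NULL (unknown), which Python sees as None.
null_eq = conn.execute("SELECT NULL = NULL").fetchone()[0]
print(null_eq)  # None

conn.execute("CREATE TABLE person (name TEXT, work_address TEXT)")
conn.execute("CREATE TABLE org (name TEXT, address TEXT)")
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [("alice", "1 Main St"), ("bob", None)])
conn.executemany("INSERT INTO org VALUES (?, ?)",
                 [("acme", "1 Main St"), ("ghost", None)])

# The join condition is unknown for the two NULL addresses, so that pair is dropped.
rows = conn.execute(
    "SELECT p.name, o.name FROM person p JOIN org o ON p.work_address = o.address"
).fetchall()
print(rows)  # [('alice', 'acme')]
```

The unemployed person and the organization without an address are not matched, because a comparison involving NULL is never true.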

FroMage commented 11 years ago

If a person is unemployed, and an organization has no known address or no physical address, that does absolutely not fucking mean that that person works at that organization!

That represents a logic bug then, and I don't think it's related to the language at all or to whether two null values should be equal.

Comparing two "maybe null"s is essentially always a bug

I disagree completely. Any implementation of equals that has optional attributes will end up doing just that.