overturetool / overture

The Overture Tool
http://overturetool.org
GNU General Public License v3.0
48 stars 25 forks source link

Inconsistency between equality relation and set membership #731

Closed sifraser closed 4 years ago

sifraser commented 4 years ago

Description

When adding an equality relation to a type we expect that two instances that are 'equal' are considered to be the same instance. However, when adding two 'equal' instances to a set, we get inconsistent behaviour.

Steps to Reproduce

  1. Define a type with an equality relation:

    types
    
    FirstLetter = seq1 of char
    eq s1 = s2 == s1(1) = s2(1);
  2. If we now define a set:
    
    values

flSet: set of FirstLetter = { "Abc", "ABC" }

3. Then the following evaluate to true as expected:

card flSet = 1 "ABC" in set flSet

4. However the following evaluate to false (unexpectedly):

"Abc" in set flSet "Allen" in set flSet

5. Furthermore, maps cannot be used, when we have:

values

m: map FirstLetter to nat = { "Abc" |-> 1, "Bce" |-> 2 }


Then `m("ABC")` gives an exception and `"ABC" in set dom m` is false.

**Expected behavior:** Would expect that equality relation is used in all tests for equality.

**Actual behavior:** Equality relation is used inconsistently when considering members of set.

**Reproduces how often:** 100%

### Versions

Have reproduced on 2.7.4 and 2.6.4.

OS: `Linux 5.3.0-51-generic #44~18.04.2-Ubuntu SMP Thu Apr 23 14:27:18 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux`

### Additional Information

n/a
nickbattle commented 4 years ago

Thank you for reporting and for the detailed explanation.

The first case is happening because a bare string literal, like "Abc", is not a FirstLetter unless it is defined in a context that gives it that type. The inferred type is otherwise a plain seq1 of char. So for example:

> p "Abc" in set flSet
= false

> p let x : FirstLetter = "Abc" in x in set flSet
= true

I thought that the map problem would be the same issue, but it seems something more complicated is happening. For example:

    m: map FirstLetter to nat =
        let a1 : FirstLetter = "Abc", a2 : FirstLetter = "Bce" in
            { a1 |-> 1, a2 |-> 2 }

> p m("ABC")
Error 4061: No such key value in map: "ABC" in 'DEFAULT' (console) at line 1:1

> p let k : FirstLetter = "ABC" in m(k)
Error 4061: No such key value in map: "ABC" in 'DEFAULT' (console) at line 1:32

The first case is valid, I think, because the key is not a FirstLetter, so the equality clause is not being used. But the second case ought to work. I will investigate why...

nickbattle commented 4 years ago

Interesting... this is indeed a bug (with maps). The problem is that the equality relation is used to implement the Java "equals(other)" method, which usually gives the behaviour we want, such as with "in set". But VDM maps are implemented as Java HashMaps and the hashCode() function is not aware of the equality relation - that is, the hash of "Abc" and "ABC" are different, but for the HashMap to work with your invariant type, they have to give the same result.

If I naively set the hashCode to 0, then the map lookup works - but the efficiency of the HashMap will be undermined! There might be other Java maps that only depend on "equals", though generally we ought to implement equals and hashCode for things to work as expected.

I'll give this some thought, but at least we understand the problem. If you have an ideas, I'd be pleased to hear them :-)

nickbattle commented 4 years ago

Incidentally, I didn't need the "let" context to set the map value correctly in my example. Your initializer contains enough type information:

m: map FirstLetter to nat = { "Abc" |-> 1, "Bce" |-> 2 }
sifraser commented 4 years ago

Yes I started with the map issue in a scenario where everything is fully typed and tried to simplify it to something I could report. Obviously, I missed the typing which made me think it was worse than it was :)

Incidentally, we have seen a different instance of this issue where we have a record type that contains a lamda function (which I have been meaning to raise):

types

  TypeWithFunction ::
    index : nat
    fn : nat -> nat;

operations

    TestTypeWithFunction: () ==> bool * nat * bool
    TestTypeWithFunction() ==
         let
            a = mk_TypeWithFunction(1, lambda x: nat & x + 1),
            b = mk_TypeWithFunction(1, lambda x: nat & x + 1),
            c = a = b,  -- true
            d : map TypeWithFunction to nat = { a |-> 3 },
            e = d(a), -- fine
            f = b in set dom d -- true
                        -- , g = d(b) -- breaks
        in
            return mk_(c, e, f);

Here b is in the domain of d but when we apply it we get Error 4061: No such key value in map.

I did have a look at the code previously and saw that the root cause was a equals/hashcode issue, but it looked a little hairy. I will try and have another go and let you know if I can think of anything.

tomooda commented 4 years ago

Use TreeMap if the key is an invariant type with ord? I thought of managing hashCode using some sort of table, but it will anyway require a linear scan for an equal value over many values. So, I think we better rely on hashCode only when eq is not defined. If the key object has the Comparable interface, we can use TreeMap as a second best. If we don't have any ordering, we can do linear scan over arrays or lists. That should be somewhat better than using HashMap with all same hashCode.

sifraser commented 4 years ago

If I naively set the hashCode to 0, then the map lookup works

Maybe default to 0, but have the option to provide a library that generates a hashcode?

My actual use case was for a case insensitive string and I was already using a library function to do the equality check.

nickbattle commented 4 years ago

@tomooda Yes! I just came to the same conclusion. If we use a TreeMap then the map is managed by comparisons. We don't literally have a compareTo unless there is an order clause, but if we have an eq clause, we can produce 0 for equality and then an arbitrary ordering for the remainder (ie. the default).

This seems to work perfectly. I'm still testing it...

nickbattle commented 4 years ago

@sifraser Your second example is something else. This works in VDMJ but not in Overture. It's because of the comparison of the lambdas. It's generally "difficult" to compare functions(!!), but what VDMJ does is to reduce the function to a string: if two function strings are identical, then they are considered equal, else not. Overture (I think) always says that function values are unequal.

I'm still investigating this.

tomooda commented 4 years ago

@nickbattle Maybe we better be careful to use the ord definition. It's user program and the interpreter should not rely on it for its internal consistency.

nickbattle commented 4 years ago

@tomooda Yes, I see your point, though we have said (in the LRM I hope) that if you define orders and equality you must meet various POs regarding their consistency (equivalence relations etc). As long as the user supplied ordering meets these, then I think everything should work; if not, then it may be complicated to work out what's wrong. The difficulty would be that we would need to tell the difference between a "normal" compareTo and one that was being invoked because of an internal Map operation.

The TreeMap solution does seem to work. I suggest we go with this for now as it does make the system better. But if we start to get really complicated problems with badly behave order clauses, we may have to consider not using Java Maps at all, and implementing our own(?).

nickbattle commented 4 years ago

Fix now available in ncb/development.

@sifraser I will raise a different issue with your lambda-map problem, as I think this is a different underlying cause.

sifraser commented 4 years ago

Thanks :+1:

nickbattle commented 4 years ago

@sifraser Actually, scratch that... the fix for the FirstLetter type map does fix the lambda problem as well. It's because the (new) TreeMap directs the code to the compareTo method, which does understand "string" equality of function values; the old HashMap implementation was trying to use the hashCode of function values, which are not giving equal hash values for identical lambdas in Overture (though they are in VDMJ). Try as I might, I can't find a VDM lambda equality that now fails, so I propose to leave this, unless/until we find a case that gives problems.

nickbattle commented 4 years ago

I've changed the InvariantValue to include a hashCode that will return a constant hash if the value has an equality relation defined. This means that if we ever try to use HashXXX based Java classes with such values (as maps used to), they will behave correctly, even if they are very inefficient.