Test CheckAgainstNegativeOne fails when string index annotations enabled

panacekcz commented 7 years ago

When index annotations on String are enabled, the test CheckAgainstNegativeOne from the index suite fails.

class CheckAgainstNegativeOne {

    public static String replaceString(String target, String oldStr, String newStr) {
        if (oldStr.equals("")) {
            throw new IllegalArgumentException();
        }

        StringBuffer result = new StringBuffer();
        int lastend = 0; // invariant: 0<=lastend && lastend<=target.length()
        int pos; // invariant: -1<=pos && pos+oldStr.length()<=target.length()
        while ((pos = target.indexOf(oldStr, lastend)) != -1) {
            result.append(target.substring(lastend, pos));
            result.append(newStr);
            lastend = pos + oldStr.length();
        }
        result.append(target.substring(lastend));
        return result.toString();
    }
}

The test would pass if:

lastend is annotated @IndexOrHigh("target")
subtyping of upper bounds is extended so that @LTLengthOf(value="target", offset="oldStr.length()-1") is a subtype of @LTLengthOf(value="target", offset="-1"), when oldStr.length() is known to be @NonNegative
both parameters of String.substring(int beginIndex,int endIndex) are annotated @IndexOrHigh("this")
the return value of String.indexOf is annotated @NonNegative @LTLengthOf(value="this",offset="#1.length()-1")

panacekcz commented 7 years ago

A simpler solution that does not require changes to the subtyping relation:

annotate lastend as @IndexOrHigh("target")
annotate return value of indexOf as @LTLengthOf(value={"this", "this"}, offset={"-1", "#1.length()-1"})

However, this postcondition is not sound, because it states, that pos+str.length()-1<target.length(). This may not hold when the string is not found. For example when str=="01234" and target=="", then target.indexOf(str)+str.length()-1 is -1+5-1 = 3 which is not lower than 0.

panacekcz commented 7 years ago

Version using both offsets is in the bounded-strings-indexof branch (https://github.com/panacekcz/checker-framework/commit/684557233c8074d857a152475b4dd6de5e451207)

panacekcz commented 7 years ago

The core of this issue is to support this pattern in the upper bound checker:

int i = receiver.indexOf(argument);
if(i != -1){
 @LTEqLengthOf("receiver") int j = i + argument.length();
}

This code should not cause any warnings. However, if the check against -1 is missing, the upper bound checker should report a warning, because if argument is longer than receiver, j might end up being greater than the length of receiver.

A sound upper-bound annotation for the return value of indexOf would be @LTLengthOf(value={"this", "this"}, offset={"-1", "min(#1.length()-1,this.length())"}), which correctly handles the case where the argument is longer than the receiver. However, this will not help solving this issue, because it does not express the fact that if i!=-1, then argument is not longer than receiver. Unless the checker knows this information, it cannot refine the offset min(#1.length()-1,this.length()) to the required #1.length()-1.

This leads me to conclusion that some kind of new annotation is necessary. Possible solutions:

Variant 1: Extending the upper bound hierarchy

The idea is to take LTLengthOf and make a relaxed version LTLengthOfOrNegativeOne which additionally allows -1. indexOf would return LTLengthOfOrNegativeOne(value={"this", "this"}, offset={"-1", "#1.length()-1"}) This annotation would be in most cases treated the same as the corresponding LTLengthOf annotation ignoring all value-offset pairs with non-zero offsets. The only difference would be one of those:

Variant 1a: in UpperBoundTransfer.visitNumericalAddition, it would query the lower bound checker (similarly to how visitNumericalSubtraction does) and if it is non-negative, it would treat it like LTLengthOf with all the value-offset pairs.
Variant 1b: on comparison !=-1, it would refine the annotation in the then branch to LTLengthOf with all the value-offset pairs.

Variant 2: A new special annotation

Add an annotation specifically for this purpose, such as @IndexOf(value="argument", in="receiver"). The upper bound checker would query this annotation and infer upper bound annoations as appropriate. This has the advantage of having a clear meaning and not complicating the upper bound hierarchy.

Variant 2a: put it in the search index hierarchy. The name of the hierarchy is quite fitting and the concept is similar. The methods binarySearch and indexOf are not often used together, so these two purposes shouldn't interfere.
Variant 2b: like 2a, but make the new annotation more consistent with the existing index checker annotations, by using value and offset parameters with the usual meaning. For example, @IndexOfIndexFor(value="receiver", offset="argument.length()"). Then @IndexOfIndexFor(value="receiver", offset="-1") would be a subtype of @SearchIndexFor(value="receiver").
Variant 2c: create a new hierarchy just for this annotation.

kelloggm commented 7 years ago

I'll leave my thoughts on each of the variants you suggested here.

Variant 1: The meaning of your proposed annotation (as you define it and as I would understand from its name) is actually not different from the meaning of LTLengthOf: LTLengthOf can definitely mean -1. An variable i with type @LTLengthOf("a") has only one invariant: i < a.length (or a.length() for Strings). Trivially, substituting -1 for i makes that invariant true. It's clear to me that there's some additional information being carried by your proposed annotation - the knowledge that checking against -1 allows us to conclude that the calls to indexOf are safe. I think this information is separate from the Upper Bound hierarchy, so I do not favor this option.

Variant 2: This seems like a more reasonable plan. I do not think that putting it in the Search Index hierarchy is the right move, though - that will cause headaches down the road when we encounter a case where indexOf and binarySearch are used together. Further, though the concepts are similar they're not identical. I wouldn't want to work through what should be a subtype of what, and I imagine you wouldn't either. If we were to take this option, I'd prefer variant 2c: create a specialized, single-purpose checker, like we did for binarySearch.

However, I'm not convinced that this is worth pursuing. We implemented the Search Index Checker because binary search is used in a lot of array-using code. I know that indexOf is used a lot in conjunction with Strings, but I wonder how common this particular case really is. Binary search is an easy and compelling use case to explain for its own specialized checker, and this really isn't. I'm actually comfortable just leaving this as a known false positive and not addressing it, unless it's more common in real code than I currently believe.

panacekcz commented 7 years ago

The difference between LTLengthOfOrNegativeOne and LTLengthOf is only for annotations with some non-zero offset. Otherwise, they are exactly the same. An annotation with a non-zero offset such as LTLengthOf(value="a", offset="b.length+1") may disallow the -1 value. In that case, LTLengthOfOrNegativeOne simply adds it back in. Variant 1 is quite general, but it would complicate the upper bound hierarchy, making it harder to understand and possibly harder to extend in other directions. I would probably prefer variant 2a. Anyway, I will leave this as it is now. This issue is linked from the CheckAgainstNegativeOne test.

kelloggm commented 7 years ago

I think that's a good plan.

panacekcz commented 6 years ago

Fixed by typetools#1461.

panacekcz / checker-framework

Test CheckAgainstNegativeOne fails when string index annotations enabled #4

Variant 1: Extending the upper bound hierarchy

Variant 2: A new special annotation