BaseXdb / basex

BaseX Main Repository.
http://basex.org
BSD 3-Clause "New" or "Revised" License
685 stars 265 forks source link

Avoid long value overflow in position range #2156

Closed GuntherRademacher closed 1 year ago

GuntherRademacher commented 1 year ago

An XQuery positional filter for the last one or two elements of a sequence,

    <a/>[position() >= last() - 1]

failed with this result, when there is just a single item in the context:

    [FOAR0002] Value out of range: 9223372036854775807.

This is caused by Pos.get transforming the filter predicate to a range ending in Long.MAX_VALUE,

        case GE:
          minMax = new Expr[] { ps, Int.MAX };
          break;
        case GT:
          minMax = new Expr[] { new Arith(ii, type.instanceOf(AtomType.INTEGER) ? ps :
            cc.function(Function.FLOOR, ii, ps), Int.ONE, Calc.PLUS).optimize(cc), Int.MAX };
          break;

and Range.value producing a long overflow while calculating the range size,

    final long size = mx - mn + 1;
    if(size > 0) return RangeSeq.get(mn, size, true);
    // overflow of long value
    throw RANGE_X.get(info, mx);

where mn == 0 and mx == Long.MAX_VALUE.

Actually the lower bound of the position range never needs to be less than 1, so the idea of this fix is to rewrite

    <a/>[position() >= last() - 1]

to

    <a/>[position() >= max((1, last() - 1)]
ChristianGruen commented 1 year ago

An excellent analysis! And a reasonable proposal to resolve the problem.

I noticed there are various other cases with minimum/maximum long values that cause questionable results or are rejected, e.g.:

(: [FOAR0002] Value out of range: 9223372036854775807. :)
<a/>[position() = 0 to 9223372036854775807]

…so I decided to change the general behavior of Range.value: If an over-/underflow occurs, I’m now passing on the maximum long value instead of raising an error.

Admittedly, the solution introduces new errors. For example, the following query had been rejected, whereas it now returns a wrong result (9223372036854775807 instead of 9223372036854775808):

count(0 to 9223372036854775807)

There would be several ways to fix this (use BigInteger instances for large integers; use flags/constants to indicate infinite values; etc.), but I don’t think it’s worth fixing that. The specification of the language is pretty vague when it comes to handling edge cases, and there appears to be no XQuery implementation that yields 100% correct results for such cases.

GuntherRademacher commented 1 year ago

Thanks a lot!