BaseXdb / basex

BaseX Main Repository.
http://basex.org
BSD 3-Clause "New" or "Revised" License
678 stars 268 forks source link

Prevent fn:error from affecting tail call optimization #2160

Closed GuntherRademacher closed 10 months ago

GuntherRademacher commented 1 year ago

The execution of a recursive function failed unexpectedly with

    Stack Overflow: Try tail recursion? 

A minimal example of a similar function would be this:

declare function local:sum($result, $values) as xs:integer {
  if (empty($values)) then
    $result
  else
    let $head := head($values)
    return
      if ($head instance of xs:integer) then
        local:sum($result + $head, tail($values))
      else
        fn:error()
};
local:sum(0, 1 to 100000)

The failure is unexpected, because the recursive invocation is in a tail position.

This is caused by fn:error having a static type of item()*, and that propagates to the surrounding If and GFLWOR operators. Thus a Typecheck, caused by the explicit function type, is not seen as redundant, and it prevents tail call optimization.

This can of course be easily avoided by changing the query, e.g. by removing the explicit function type, or by casting fn:error to the function result type. However fn:error should not affect static typing of surrounding expressions in this way, as its result is effectively never worked with.

BaseX assigns a static type of item()* to fn:error. Yet per XPath and XQuery Functions and Operators 3.1, it has a special result type none.

This fix adds a new type none, assigns is as the result of fn:error and prevents it from affecting the static types of expressions in SeqType.union and SeqType.intersect.

ChristianGruen commented 1 year ago

…and thanks again.

I was aware of our fishy item()* assignment to fn:error, and I thought much more code would need to be rewritten to introduce a none type. Even better if that’d be pretty much it! I hope I’ll be able to look at this in more detail beginning of next week.

I saw you modified FnIterateWhile.java. Do you think that’s the only function implementation that requires a fix, or is it rather an example how to support the new type?

GuntherRademacher commented 1 year ago

The reason for the change in FnIterateWhile was this test in Fn4ModuleTest looping forever without it:

  @Test public void iterateWhile() {
    final Function func = ITERATE_WHILE;
    query(func.args(1, " not#1", " ->($_) { error() }"), 1);

My initial assumption was that the implementation of the type system itself would provide means to calculate the types of functions and expressions, such as there is in SeqType.union and SeqType.intersect, and in implementations of the Type interface. So the code in FnIterateWhile came to me as a surprise.

I am not aware of any other code that needs special handling for none, but I must admit that I am lacking ideas of how to find out. Luckily there was this test for fn:iterate-while using fn:error, so this one could be found. There may be others - you are in a better position to know them.

Sorry for not having a better answer to your question.

ChristianGruen commented 1 year ago

Sorry for not having a better answer to your question.

Just fine of course, thanks for the assessment.

ChristianGruen commented 1 year ago

I need to fully understand the implications of introducing the new type, so I’ll keep the pull request further open (might take a while).

ChristianGruen commented 10 months ago

I’d probably need 2 or 3 more days to get my head around this, so I decided to close it. Luckily, it will always be possible to look this up in the git history if required…