BaseXdb / basex

BaseX Main Repository.
http://basex.org
BSD 3-Clause "New" or "Revised" License

XQuery: sort order in Window function #2119

Closed mikekuznetsov11 closed 2 years ago

mikekuznetsov11 commented 2 years ago

Description of the Problem

Under certain circumstances, the Window function applies ordering where it should not.

Expected Behavior

The resulting sequence must be unordered

Steps to Reproduce the Behavior

BaseX_window sorting defect.txt

In the example above I constructed a cascade:

When I return the whole window $w it gives:

However when I return $w/d it gives:

Do you have an idea how to solve the issue?

No response

What is your configuration?

BaseX version 9.7.1 installed on Windows 10 Enterprise 1903

General Options

DEBUG = false
DBPATH = C:\Program Files (x86)\BaseX\data
LOGPATH = .logs
REPOPATH = C:\Program Files (x86)\BaseX\repo
LANG = English
LANGKEYS = false
FAIRLOCK = false
CACHETIMEOUT = 3600

Client/Server Architecture

HOST = localhost
PORT = 1984
SERVERPORT = 1984
USER =
PASSWORD =
SERVERHOST =
PROXYHOST =
PROXYPORT = 0
NONPROXYHOSTS =
IGNORECERT = false
IGNOREHOSTNAME = false
TIMEOUT = 30
KEEPALIVE = 600
PARALLEL = 8
LOG = true
LOGMSGMAXLEN = 1000
LOGTRACE = true

HTTP Services

WEBPATH = C:\Program Files (x86)\BaseX\webapp
GZIP = false
RESTPATH =
RESTXQPATH =
PARSERESTXQ = 3
RESTXQERRORS = true
HTTPLOCAL = false
STOPPORT = 8985
AUTHMETHOD = Basic

Local Options

ChristianGruen commented 2 years ago

Please help us to make your question more specific. If <window>{$w}</window> in your query is replaced by <window>{$w/d}</window>, the result is:

<outWindow>
  <window>
    <d>1</d>
  </window>
  <window>
    <d>2</d>
    <d>1</d>
  </window>
</outWindow>
<outd>
  <window>
    <d>1</d>
  </window>
  <window>
    <d>2</d>
    <d>1</d>
  </window>
</outd>

Which result would you expect?

ChristianGruen commented 2 years ago

I think I understood that it’s the semantics of the path expression that troubled you. Whenever you evaluate a path, the resulting nodes will be freed from duplicates and put into document order. See the following example …

let $input :=
  <dataset>
    <r>1</r>
    <r>2</r>
  </dataset>
let $reversed := reverse($input/r)
return ($reversed, $reversed/text())

… and its result:

<r>2</r>
<r>1</r>
1
2

Because of the path expression, the text nodes of the reversed nodes are re-sorted and returned in document order.
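
The duplicate removal mentioned above can be made visible with a small variation of the same example. This is only a sketch: the variable `$doubled` is a hypothetical name introduced here, and the counts in the comments follow from the path semantics described in this comment.

```xquery
let $input :=
  <dataset>
    <r>1</r>
    <r>2</r>
  </dataset>
(: hypothetical helper: the same two <r> nodes twice, in reverse order :)
let $doubled := (reverse($input/r), reverse($input/r))
return (
  (: 4 items: a sequence may contain the same node more than once :)
  count($doubled),
  (: 2 nodes: the path step removes duplicates and restores document order :)
  count($doubled/text()),
  (: 4 nodes: the simple map operator evaluates text() once per item :)
  count($doubled ! text())
)
```

The path step returns each text node only once, however often its parent occurs in the input sequence, whereas the simple map operator keeps one result per input item.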

You can preserve the original order by processing your sequence items one by one, i.e., by using a FLWOR expression or the simple map operator:

(: Solution 1 :)
for $r in $reversed
return $r/text(),

(: Solution 2 :)
$reversed ! text()

mikekuznetsov11 commented 2 years ago

Christian, thanks very much for the quick response. You hit it exactly right: my expectation was that the "order by" in step 2 would persist in the Window output (step 3). I like your advice in Solution 2: use (!) instead of (/). This helps a lot!
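
For readers without access to the attached file, the following is a minimal sketch of how the same effect can show up in a window clause. It is not the original query: the input data, the element names `rows`, `row`, `k`, `map` and `path`, and the window conditions are all assumptions made for illustration.

```xquery
(: hypothetical input; the original attachment is not reproduced here :)
let $rows :=
  <rows>
    <row><k>a</k><d>1</d></row>
    <row><k>a</k><d>2</d></row>
    <row><k>b</k><d>3</d></row>
  </rows>
(: sort the rows by d, highest value first :)
let $sorted := (
  for $row in $rows/row
  order by number($row/d) descending
  return $row
)
(: start a new window whenever the key of the next row differs :)
for tumbling window $w in $sorted
  start $s when true()
  end next $n when $s/k != $n/k
return
  <window>
    <map>{ $w ! d }</map>   (: keeps the order of the sorted rows :)
    <path>{ $w/d }</path>   (: re-sorted into document order :)
  </window>
```

For the window that holds the two `a` rows, the `map` element should list the `d` elements in sorted order (2, then 1), while the `path` element should list them in document order (1, then 2).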