eXist-db / exist

eXist Native XML Database and Application Platform
https://exist-db.org
GNU Lesser General Public License v2.1

ft:get-field not returning expected results when indexed document contains elements with fulltext index configured #2312

Open tuurma opened 5 years ago

tuurma commented 5 years ago

Problem

For the following collection configuration

<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <lucene>
            <text qname="foo"/>
        </lucene>
    </index>
</collection>

After creating the index when storing the document, retrieving the named index field with ft:get-field does not return the expected result, but only if the stored document contains one of the elements for which a Lucene index has been configured (<foo> in the example).

declare function local:index($path) {
    let $index :=  <doc><field name="foo-field" store="yes">Foobar index data</field></doc>
    return
       ft:index($path, $index)
};

let $collection := '/db/apps/indextest'
let $doc-name := 'foobar.xml'

let $data := 
    <text>
        <body>
           <foo/>
        </body>
    </text>

let $c := console:log($data)

(: remove resource if exists :)
let $remove := if (doc-available($collection || '/' || $doc-name)) then xmldb:remove($collection, $doc-name) else ()

let $store := xmldb:store($collection, $doc-name, $data)
let $index := local:index($store)

return
    (ft:search($store, 'foo-field:foobar'), ft:get-field($store, 'foo-field'))
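A control case may help isolate the trigger. The following is only a sketch, assuming the same /db/apps/indextest collection and collection configuration as above, and that the ft and xmldb prefixes resolve without explicit imports, as in the query above. The resource name control.xml and the <bar/> element are hypothetical. If the description above holds, ft:get-field should return the stored field here, because the document contains no <foo> element.

```xquery
xquery version "3.1";

(: Control case: same collection and index configuration as in the report.
   'control.xml' and <bar/> are hypothetical names chosen for illustration. :)
let $collection := '/db/apps/indextest'
let $doc-name := 'control.xml'

(: Same structure as the failing document, but without <foo/> :)
let $data :=
    <text>
        <body>
            <bar/>
        </body>
    </text>

(: remove resource if it exists :)
let $remove :=
    if (doc-available($collection || '/' || $doc-name))
    then xmldb:remove($collection, $doc-name)
    else ()

let $store := xmldb:store($collection, $doc-name, $data)

(: attach the same named field as in the failing case :)
let $index := ft:index($store,
    <doc><field name="foo-field" store="yes">Foobar index data</field></doc>)

return
    ft:get-field($store, 'foo-field')
```

Comparing this result with the one for foobar.xml shows whether the presence of the configured element is what makes the stored field unavailable.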

Actual result

Only ft:search returns a match; ft:get-field returns nothing:

<results>
    <search uri="/db/apps/indextest/foobar.xml" score="6.5257874">
        <field name="foo-field"><exist:match xmlns:exist="http://exist.sourceforge.net/NS/exist">Foobar</exist:match> index data</field>
    </search>
</results>

Expected result

1
<results>
    <search uri="/db/apps/indextest/foobar.xml" score="6.5257864">
        <field name="foo-field"><exist:match xmlns:exist="http://exist.sourceforge.net/NS/exist">Foobar</exist:match> index data</field>
    </search>
</results>
2
Foobar index data

Test and reproduce

The problem can be reproduced as described above. I will prepare a PR extending the Lucene tests shortly.

xatapult commented 5 years ago

We seem to be seeing something similar: all indexes that rely on the underlying Lucene (including the new range indexes) stay empty after a reindex on 4.5.0.

So the problem might be bigger than just full-text indexes.

xatapult commented 5 years ago

With this, 4.5.0 has become unusable for us. We're reverting to 4.4.0.

tuurma commented 5 years ago

I can confirm that after reindexing the collection with xmldb:reindex, all the named indexes on that collection are gone. This deserves a separate issue, though; I will create one. @eriksiegel
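A rough way to check the reindex behaviour, as a sketch only: it assumes the /db/apps/indextest collection and the foobar.xml document from the original report are already stored and indexed, and that the query runs as a user allowed to call xmldb:reindex. The wrapper element names are hypothetical.

```xquery
xquery version "3.1";

(: Assumes foobar.xml was stored and indexed with ft:index as in the report. :)
let $collection := '/db/apps/indextest'
let $path := $collection || '/foobar.xml'

(: query the named field before reindexing :)
let $before := ft:search($path, 'foo-field:foobar')

(: reindex the whole collection; returns a boolean :)
let $reindex := xmldb:reindex($collection)

(: run the same query again after reindexing :)
let $after := ft:search($path, 'foo-field:foobar')

return
    <reindex-check ok="{$reindex}">
        <before>{$before}</before>
        <after>{$after}</after>
    </reindex-check>
```

If the named indexes are dropped by the reindex, the <after> element would no longer contain the match that <before> does.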