vespa-engine / vespa

AI + Data, online. https://vespa.ai
Apache License 2.0

document-summary not including array of complex type #17214

Closed lundin closed 2 years ago

lundin commented 3 years ago

Hi!

On current Vespa Cloud (7.381.20) and the epel-7 repo, I can't add a document-summary that contains an array of a complex type.

For example, the following sd config is used: https://gist.github.com/lundin/6b77971108cf3065def2540fc365c400

The response does not include the "techattributetree_all" field that was expected when querying with: { "presentation": { "summary": "short" }, "yql": "select * from articles where sku matches (\"32222081\");" }
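For anyone reproducing this, the query above can be built and sent from a short script. This is a minimal sketch: `build_query` and `run_query` are hypothetical helper names, and the endpoint in the final comment is an assumption about a local deployment, not something taken from this report.

```python
import json
import urllib.request


def build_query(sku, summary="short"):
    """Build the JSON body for a query that requests a named document summary."""
    return {
        "presentation": {"summary": summary},
        "yql": f'select * from articles where sku matches ("{sku}");',
    }


def run_query(endpoint, body):
    """POST the query body to a search endpoint and return the parsed JSON."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


body = build_query("32222081")
print(json.dumps(body, indent=2))
# To send it (the endpoint is an assumption, adjust to your deployment):
# result = run_query("http://localhost:8080/search/", body)
```

Changing the `summary` argument makes it easy to compare what each summary class returns for the same document.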

The field alltecdoc (of type array<string>) is included, however, as are sku (a simple string) and others. matched-elements-only also works for an array of a complex type, but not when you want the field added in full to the summary, as above.

Removing the summary and selecting techattributetree does return the field, so the document has it; the problem only appears when it is used as a field in the summary. I also tried adding struct-field entries as attribute/index, etc., but that did not help. Thanks!

geirst commented 3 years ago

We have reproduced this issue. It occurs when a complex field (e.g. array of struct or map) is used in an explicit document summary under a different name than its original one. It happens both for summary-only fields (complex in the example below) and for fields that use struct field attributes (complex_attr in the example below).

Assume the given schema:

schema test {
    document test {
        struct my_struct {
            field name type string {}
            field value type string {}
        }
        field complex type array<my_struct> {
            indexing: summary
        }
        field complex_attr type array<my_struct> {
            indexing: summary
            struct-field name { indexing: attribute }
            struct-field value { indexing: attribute }
        }
    }
    document-summary basic {
        from-disk
        summary complex type array<my_struct> { }
        summary complex_attr type array<my_struct> { }
    }
    document-summary rename {
        from-disk
        summary new_complex type array<my_struct> { source: complex }
        summary new_complex_attr type array<my_struct> { source: complex_attr }
    }
}

In the basic summary, the fields use the same names as in the document schema. This works as expected. In the rename summary, however, we rename the fields. This case is currently not working, and the query response doesn't contain these fields (they are empty).
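One way to spot the symptom from the client side is to check each hit for expected summary fields that are absent or empty. The sketch below assumes the default JSON result layout (`root.children[].fields`); `missing_summary_fields` is a hypothetical helper, and the sample response is hand-written for illustration, not real output.

```python
def missing_summary_fields(response, expected):
    """Map each hit id to the expected fields that are absent or empty."""
    missing = {}
    for hit in response.get("root", {}).get("children", []):
        fields = hit.get("fields", {})
        # A field counts as missing if it is absent or has an empty value.
        absent = [name for name in expected if not fields.get(name)]
        if absent:
            missing[hit.get("id", "<no id>")] = absent
    return missing


# Hand-written sample mimicking the broken "rename" summary (not real output).
sample = {
    "root": {
        "children": [
            {
                "id": "id:test:test::1",
                "relevance": 1.0,
                "fields": {"new_complex_attr": []},
            }
        ]
    }
}

report = missing_summary_fields(sample, ["new_complex", "new_complex_attr"])
print(report)
```

Running the same check against the basic summary with the original field names should report nothing missing if the document actually has the data.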

@lundin The short term solution is to not rename such fields in the explicit document summary. We will also work on a fix for this issue.

lundin commented 3 years ago

Thanks for that, @geirst! The one problem I encountered with the quick fix (same name, no renaming) is that I had the field in multiple summaries with different transforms, i.e. one with matched-elements-only and one without, and got "A field with the same name can not have different transforms in different summary classes"

But anyway, I removed matched-elements-only and did the filtering in the searcher instead.
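The filtering logic itself is simple. The sketch below is a Python stand-in for that logic only: the actual searcher is not shown in this thread, and `filter_struct_elements` is a hypothetical helper. The field and struct member names are borrowed from the test schema above, and the values are made up for illustration.

```python
def filter_struct_elements(hit_fields, field_name, predicate):
    """Return a copy of hit_fields where field_name keeps only matching elements."""
    filtered = dict(hit_fields)
    filtered[field_name] = [
        element for element in hit_fields.get(field_name, []) if predicate(element)
    ]
    return filtered


# Hypothetical hit fields using the my_struct layout (name, value).
fields = {
    "sku": "32222081",
    "complex": [
        {"name": "color", "value": "red"},
        {"name": "size", "value": "XL"},
    ],
}

result = filter_struct_elements(fields, "complex", lambda e: e["name"] == "color")
print(result["complex"])
```

Unlike a summary transform, this approach fetches the full array and trims it afterwards, so it trades some bandwidth for not needing two summary classes with different transforms on the same field.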

Thanks!

geirst commented 2 years ago

This is fixed in Vespa 8.11.9 (https://github.com/vespa-engine/vespa/pull/23271, https://github.com/vespa-engine/vespa/pull/23272).

New system test added in https://github.com/vespa-engine/system-test/pull/2436.