vespa-engine / vespa

AI + Data, online. https://vespa.ai
Apache License 2.0

Indexing language produces unexpected results with `if` used in `for_each` #29690

Open dainiusjocas opened 7 months ago

dainiusjocas commented 7 months ago

**Describe the bug**
Using `if` inside `for_each` in the indexing language produces unexpected results.

**To Reproduce**
The schema is as follows:


schema doc {
    field modified_chunks type array<string> {
        indexing {
            input chunks | for_each {
                if (1 == 1) {
                    "YES " . _;
                } else {
                    "NO " . _;
                }
            } | summary;
        }
    }

    document doc {
        field chunks type array<string> {
            indexing: summary | index
        }
    }
}

**Expected behavior**
Each element indexed into the field `modified_chunks` should be `YES some text`, because `if` produces an output and that output should be used by `for_each`.
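To make the expectation concrete, here is a minimal sketch in plain Python (not Vespa code, and not how Vespa implements `for_each`) of the semantics I expected. The function name `expected_for_each` is made up for illustration; the loop variable plays the role of `_` in the indexing script.

```python
def expected_for_each(chunks):
    """Model of the expected semantics: the value of the taken `if`
    branch becomes the element that `for_each` emits."""
    result = []
    for chunk in chunks:  # `chunk` corresponds to `_` in the indexing script
        if 1 == 1:  # same always-true condition as in the schema
            result.append("YES " + chunk)
        else:
            result.append("NO " + chunk)
    return result


if __name__ == "__main__":
    # Two input chunks, both expected to take the "YES" branch.
    print(expected_for_each(["some text", "more text"]))
    # prints ['YES some text', 'YES more text']
```

With the schema above, I expected `modified_chunks` to contain the equivalent of this list for the `chunks` that were fed.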

**Environment:**
 - OS: macOS
 - Infrastructure: Docker
 - Versions: newest

**Vespa version**
8.256.22

**Additional context**
The actual task was to avoid embedding chunks that are very short. If the indexing script above worked as expected, that would be doable.
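For illustration only, the filtering I have in mind looks roughly like the following Python sketch. It is an assumption about how one could pre-filter chunks outside Vespa, not an indexing-language solution; the name `chunks_worth_embedding` and the threshold `MIN_CHUNK_CHARS` are hypothetical.

```python
MIN_CHUNK_CHARS = 20  # hypothetical threshold, not taken from any Vespa default


def chunks_worth_embedding(chunks, min_chars=MIN_CHUNK_CHARS):
    """Keep only chunks whose stripped length reaches `min_chars`."""
    return [chunk for chunk in chunks if len(chunk.strip()) >= min_chars]


if __name__ == "__main__":
    chunks = [
        "Hi.",
        "Vespa is a platform for low-latency search over large data sets.",
    ]
    # Only the second chunk is long enough to be embedded.
    print(chunks_worth_embedding(chunks))
```

The goal is to express the same condition with `if` inside `for_each`, so that short chunks are skipped at indexing time rather than in client code.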

dainiusjocas commented 4 months ago

Some more workarounds needed in `for_each` can be found in this repo.

dainiusjocas commented 4 months ago

Related issue: https://github.com/vespa-engine/vespa/issues/30512