aerospike / aerospike-client-go

Aerospike Client Go
Apache License 2.0

client (v7 v7.6.0) NewWildCardValue not working in Secondary Indexes #449

Open asado opened 2 weeks ago

asado commented 2 weeks ago

Description

I am encountering an issue with creating a complex secondary index in Aerospike. The goal is to index all values under the key "T" in a nested map structure. However, the secondary index only indexes values under the first key (in alphabetical order) of the outer map, rather than values under all relevant keys.

Code Example

indexTask, aeroErr := s.aeroDB.CreateComplexIndex(
    aero.NewWritePolicy(0, 0),
    "test",
    "set_name",
    "tokens_key_index",
    "sotBin",
    aero.STRING,
    aero.ICT_MAPVALUES,
    aero.CtxMapValue(aero.NewWildCardValue()), // Dynamic columnID key
    aero.CtxMapKey(aero.StringValue("T")), 
)

Expected Behavior

The secondary index should index all values under the key "T". For example, it should index the values "foo1", "foo2", "doo1", and "doo2".

Actual Behavior

The index only covers the values under the first key (in alphabetical order) of the outer map. In this case, it only indexes the values under "doo" (i.e., "doo1" and "doo2"), and it does not index the values under "foo".

Example Data Structure

Here’s an example of the data structure being indexed:

{
    "foo": {
       "T": {
          "cat1": "foo1",
          "cat2": "foo2"
       },
       "A": {
          "abc": "xyz1"
       }
    },
    "doo": {
       "T": {
          "cat1": "doo1",
          "cat2": "doo2"
       },
       "A": {
          "abc": "xyz2"
       }
    }
}

Steps to Reproduce

1. Create a map structure similar to the one shown above.
2. Attempt to create a secondary index using the provided code.
3. Observe that only the values under the first key in alphabetical order (i.e., "doo") are being indexed.

Environment Details

Aerospike Server Version: 7.0.0.0_2
Aerospike Client Version: v7 v7.6.0

Additional Information

If the behavior is intended, please provide guidance on how to correctly index all values under "T" in the nested map. If not, any fixes or workarounds would be greatly appreciated.

khaf commented 2 weeks ago

Can you please post a little code gist that reproduces the issue, containing your query?

asado commented 2 weeks ago

package main

import (
    "bytes"
    "flag"
    "log"
    "os"
    "time"

    aero "github.com/aerospike/aerospike-client-go/v7"
)

var (
    Host = flag.String("h", "localhost", "Aerospike server seed hostnames or IP addresses.")
    Port = flag.Int("p", 3000, "Aerospike server seed hostname or IP address port number.")
)

func main() {
    flag.Parse()

    var buf bytes.Buffer
    logger := log.New(&buf, "", log.LstdFlags|log.Lshortfile)
    logger.SetOutput(os.Stdout)

    // connect to the host
    clientPolicy := aero.NewClientPolicy()

    h := []*aero.Host{
        aero.NewHost(*Host, *Port),
    }
    log.Println("Hosts:", h)
    client, err := aero.NewClientWithPolicyAndHost(clientPolicy, h...)
    if err != nil {
        log.Fatalln(err)
    }

    defer client.Close()

    if err = client.DropIndex(nil, "test", "test", "tokens_key_index"); err != nil {
        log.Fatalln(err)
    }
    indexTask, err := client.CreateComplexIndex(
        aero.NewWritePolicy(0, 0),
        "test",
        "test",
        "tokens_key_index",
        "sotBin",
        aero.STRING,
        aero.ICT_MAPVALUES,
        aero.CtxMapValue(aero.NewWildCardValue()),
        aero.CtxMapKey(aero.StringValue("T")),
    )

    if err != nil {
        log.Fatalln(err)
    }

    if err = <-indexTask.OnComplete(); err != nil {
        log.Fatalln(err)
    }

    time.Sleep(time.Second * 3)

    bins := aero.BinMap{
        "sotBin": map[string]any{
            "foo": map[string]any{
                "T": map[string]any{
                    "cat1": "bar1",
                    "cat2": "bar2",
                },
                "A": map[string]any{
                    "abc": "xyz",
                },
            },
            "doo": map[string]any{
                "T": map[string]any{
                    "cat1": "bar3",
                    "cat2": "bar4",
                },
                "A": map[string]any{
                    "abc": "xyz",
                },
            },
        },
    }

    key, _ := aero.NewKey("test", "test", 1)
    if err = client.Put(nil, key, bins); err != nil {
        log.Fatalln(err)
    }

    stmt := aero.NewStatement("test", "test")
    if err = stmt.SetFilter(aero.NewContainsFilter("sotBin", aero.ICT_MAPVALUES, "bar3",
        aero.CtxMapValue(aero.NewWildCardValue()), // Dynamic columnID key
        aero.CtxMapKey(aero.StringValue("T")),
    )); err != nil {
        log.Fatalln(err)
    }
    rs, err := client.Query(nil, stmt)
    if err != nil {
        log.Fatalln(err)
    }

    for res := range rs.Results() {
        if res.Err != nil {
            log.Fatalln(res.Err)
        }

        log.Println(res.Record.Bins)
    }

    log.Println("Done!")
}

The above snippet returns the record when you query for "bar3". If you change the query value to "bar1", it returns no result, which is not what we expect given that we used a wildcard.

khaf commented 2 weeks ago

@kportertx This seems to be a server issue.

kportertx commented 2 weeks ago

The CDT wildcard can only go at the end of a CDT. What @asado is trying to do is currently not possible.

asado commented 2 weeks ago

@kportertx Thanks for clarifying. We will have to change our bin modelling if this is not supported.

It is silently indexing the first entry of the wildcard CDT in sorted alphabetic order as in the example above. We should probably fail the index creation if it's not supported.

kportertx commented 2 weeks ago

@asado, I agree, this and the assertion crash (#448) that you reported are a result of the server's sindex module not handling cdt cmp-wildcards. This issue was introduced with 7.0.0.0 and resolved with 7.0.0.8 - the latest hotfix version for the 7.0.0 lineage is 7.0.0.14.

asado commented 2 weeks ago

@kportertx yes. I can confirm the crash is not happening in 7.0.0.14.

It is still indexing only the first entry of the wildcard CDT in sorted alphabetic order, as in the example above. Would you be adding support for a wildcard that spans all entries of a CDT, or failing index creation if the wildcard is not at the end of a CDT?

xorphox commented 2 weeks ago

NewWildCardValue() is called cmp_wildcard on the server (and in the C client). It is only for comparing, not for multi-select. In this case it compares true to the first element, so it is working as expected.

Again, cmp_wildcard is for comparing only (it compares true with anything, including all elements until the end of the list) so it will not get extended to multi-select. There may be another feature later for that purpose.