R-a-dio / valkyrie

R/a/dio software stack
https://r-a-d.io
MIT License

irc: implement .search (WIP!) #40

Closed resttime closed 5 years ago

resttime commented 5 years ago

Working on #24. Go is a pretty interesting language. Currently it echoes the result publicly. Hanyuu-sama uses Elasticsearch, so I don't know whether to use that or work with SQL.

Wessie commented 5 years ago

search does indeed go through elasticsearch which is annoying to test against; Kethsar wrote some code to hit the HTTP endpoint for it that we can use

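Wessie commented 5 years ago

package main

import (
        "encoding/json"
        "fmt"
        "io/ioutil"
        "log"
        "net/http"
        "strings"
)

// elasticSearchURL is not defined in the original snippet; point it at the
// search endpoint before running.
var elasticSearchURL = ""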

var qstring = `{
        "query": {
                "match": {
                        "_all": {
                                "query": "%s",
                                "operator": "and"
                        }
                }
        },
        "from": 0, "size": %d,
        "sort": [
                { "requests": { "order": "desc" }},
                { "_score": { "order": "desc" }}
        ]
}`

func main() {
        limit := 5
        query := fmt.Sprintf(qstring, "3l", limit)
        qreader := strings.NewReader(query)

        resp, err := http.Post(elasticSearchURL, "application/json", qreader)
        if err != nil {
                log.Fatalln(err)
        }
        defer resp.Body.Close()

        respbytes, err := ioutil.ReadAll(resp.Body)
        if err != nil {
                log.Fatalln(err)
        }

        var obj interface{}
        err = json.Unmarshal(respbytes, &obj)
        if err != nil {
                log.Fatalln(err)
        }

        log.Printf("%v\n", obj)

        respjson := obj.(map[string]interface{})
        hits := respjson["hits"].(map[string]interface{})
        hitsarr := hits["hits"].([]interface{})

        for _, v := range hitsarr {
                vmap := v.(map[string]interface{})
                source := vmap["_source"].(map[string]interface{})
                log.Printf("%s - %s\n", source["artist"], source["title"])
        }
}

This code should use a proper struct instead of fmt.Sprintf, both to avoid injecting user input into the JSON and to make the results easier to read.
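
A minimal sketch of what the struct-based version might look like, assuming the same request and response shape as the snippet above. The type names and the helpers buildQuery and parseHits are hypothetical, not taken from the repository. Building the body with encoding/json means quotes in the search terms are escaped rather than spliced into the JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// searchRequest mirrors the JSON body of the snippet above.
type searchRequest struct {
	Query struct {
		Match map[string]matchClause `json:"match"`
	} `json:"query"`
	From int                    `json:"from"`
	Size int                    `json:"size"`
	Sort []map[string]sortOrder `json:"sort"`
}

type matchClause struct {
	Query    string `json:"query"`
	Operator string `json:"operator"`
}

type sortOrder struct {
	Order string `json:"order"`
}

// searchResponse decodes only the fields the snippet reads.
type searchResponse struct {
	Hits struct {
		Hits []struct {
			Source struct {
				Artist string `json:"artist"`
				Title  string `json:"title"`
			} `json:"_source"`
		} `json:"hits"`
	} `json:"hits"`
}

// buildQuery (hypothetical helper) returns the request body for a search.
func buildQuery(terms string, limit int) ([]byte, error) {
	var req searchRequest
	req.Query.Match = map[string]matchClause{
		"_all": {Query: terms, Operator: "and"},
	}
	req.Size = limit
	req.Sort = []map[string]sortOrder{
		{"requests": {Order: "desc"}},
		{"_score": {Order: "desc"}},
	}
	return json.Marshal(req)
}

// parseHits (hypothetical helper) turns a response body into
// "artist - title" lines.
func parseHits(body []byte) ([]string, error) {
	var resp searchResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	lines := make([]string, 0, len(resp.Hits.Hits))
	for _, h := range resp.Hits.Hits {
		lines = append(lines, h.Source.Artist+" - "+h.Source.Title)
	}
	return lines, nil
}

func main() {
	// A quote in the terms stays inside the JSON string.
	body, err := buildQuery(`3l" extra`, 5)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))

	lines, err := parseHits([]byte(`{"hits":{"hits":[{"_source":{"artist":"a","title":"b"}}]}}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(lines)
}
```

The body returned by buildQuery can be wrapped in bytes.NewReader and passed to http.Post in place of the fmt.Sprintf string.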

Bevinsky commented 5 years ago

I don't think the site uses _all for the query. Should we really use it?

And it sorts by 'requests', but I'm pretty sure it should be sorting by 'priority'.

Wessie commented 5 years ago

the query seems to be copied from https://github.com/R-a-dio/Hanyuu-sama/blob/1.2/manager/util.py#L17 so this is the existing behavior, should we change that?

Bevinsky commented 5 years ago

The issue with using _all is that it lets you search for anything that is indexed, including things like acceptor.

If you look at how the site does it, it explicitly specifies which fields are searched. It also sorts by priority and not requests.


$params = [
    "type" => $type,
    "index" => $index,
    "ignore" => [
        400,
        404
    ],
    "body" => [
        "size" => 10000,
        "query" => [
            "query_string" => [
                "fields" => ["title", "artist", "album", "tags", "_id"],
                "default_operator" => "and",
                "query" => $terms,
            ]
        ],
        "sort" => [
            ["priority" => ["order" => "desc", "ignore_unmapped" => true]],
            ["_score"   => ["order" => "desc"]]
        ],
    ],
];
if ($usable_only) {
    $params["body"]["filter"] = [
        "bool" => [
            "must" => [
                "term" => ["usable" => 1]
            ]
        ]
    ];
}
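
For comparison, here is a rough Go sketch of the same request body, assuming the bot would send it to the same HTTP endpoint as the earlier snippet. The helper name siteQuery is hypothetical. The "type", "index" and "ignore" entries are options for the PHP client rather than part of the body, so they are left out here. The keys mirror the PHP array as shown; whether "filter" and "ignore_unmapped" are accepted depends on the Elasticsearch version in use.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// siteQuery (hypothetical helper) builds a body shaped like the PHP array
// above: query_string over explicit fields, sorted by priority, then score.
func siteQuery(terms string, usableOnly bool) ([]byte, error) {
	body := map[string]interface{}{
		"size": 10000,
		"query": map[string]interface{}{
			"query_string": map[string]interface{}{
				"fields":           []string{"title", "artist", "album", "tags", "_id"},
				"default_operator": "and",
				"query":            terms,
			},
		},
		"sort": []interface{}{
			map[string]interface{}{
				"priority": map[string]interface{}{"order": "desc", "ignore_unmapped": true},
			},
			map[string]interface{}{
				"_score": map[string]interface{}{"order": "desc"},
			},
		},
	}
	// Same optional filter as the $usable_only branch.
	if usableOnly {
		body["filter"] = map[string]interface{}{
			"bool": map[string]interface{}{
				"must": map[string]interface{}{
					"term": map[string]interface{}{"usable": 1},
				},
			},
		}
	}
	// json.Marshal writes map keys in sorted order, so output is stable.
	return json.Marshal(body)
}

func main() {
	for _, usable := range []bool{false, true} {
		b, err := siteQuery("3l", usable)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(b))
	}
}
```

Because the terms go through json.Marshal, this version has the same escaping benefit as a struct-based request; query_string still interprets its own operator syntax inside the terms.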
Wessie commented 5 years ago

implemented this in 6654e81