koursaros-ai / nboost

NBoost is a scalable, search-api-boosting platform for deploying transformer models to improve the relevance of search results on different platforms (e.g. Elasticsearch).
Apache License 2.0

ReactiveSearch compatibility #27

Open 0mars opened 4 years ago

0mars commented 4 years ago

Hi, I tried to use nboost with ReactiveSearch (for React JS) and it crashes. Did you change the normal response? I'm using ES 6.

colethienes commented 4 years ago

For normal _search queries it should only reorder the search results and add the nboost key.

Is there a console error message?

0mars commented 4 years ago

yea, I just had a quick look, will have a deeper one tomorrow and provide more details

0mars commented 4 years ago

nboost logs:

nboost_1                 | I:/opt/conda/lib/python3.6/site-packages/nboost/.cache/pt-tinybert-msmarco:[ber:__i: 17]:RUNNING ON CPU
nboost_1                 | I:PtBertModel:[pro:run:420]:{
nboost_1                 |     "model": "PtBertModel",
nboost_1                 |     "model_dir": "pt-tinybert-msmarco",
nboost_1                 |     "qa_model": null,
nboost_1                 |     "qa_model_dir": null,
nboost_1                 |     "data_dir": "/opt/conda/lib/python3.6/site-packages/nboost/.cache",
nboost_1                 |     "query_path": "(body.query.match) | (body.query.term.*) | (url.query.q)",
nboost_1                 |     "topk_path": "(body.size) | (url.query.size)",
nboost_1                 |     "true_cids_path": "body.nboost.cids",
nboost_1                 |     "choices_path": "body.hits.hits",
nboost_1                 |     "cvalues_path": "[*]._source.*",
nboost_1                 |     "cids_path": "[*]._id",
nboost_1                 |     "capture_path": "/.*/_search",
nboost_1                 |     "default_topk": 10,
nboost_1                 |     "host": "nboost",
nboost_1                 |     "port": 8000,
nboost_1                 |     "uhost": "xxxxxxxxxxx",
nboost_1                 |     "uport": 9200,
nboost_1                 |     "delim": ". ",
nboost_1                 |     "lr": 0.01,
nboost_1                 |     "max_seq_len": 64,
nboost_1                 |     "bufsize": 2048,
nboost_1                 |     "batch_size": 4,
nboost_1                 |     "multiplier": 5,
nboost_1                 |     "workers": 3,
nboost_1                 |     "filter_results": false
nboost_1                 | }
nboost_1                 | I:PtBertModel:[ser:run: 47]:Starting 3 workers...
nboost_1                 | C:PtBertModel:[ser:run: 54]:Listening on nboost:8000...

JS

{…}
_bodyBlob: Object { _data: {…} }
_bodyInit: Object { _data: {…} }
headers: Object { map: {…} }
ok: false
status: 502
statusText: undefined
type: "default"
url: "http://sem.xxxxxxxxxxx.com/myidx/_msearch?"
<prototype>: Object { … }
1ecc7cf2-1d07-49bf-8c60-a108c4509a30:23139:28
    reactConsoleErrorHandler blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:23139
    error blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:67319
    handleError blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:306796
    msearch blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:306860
    run blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:252733
    notify blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:252754
    flush blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:253135
    tryCallOne blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:24756
    handleResolved blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:24857
    id blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:28419
    _callTimer blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:28308
    _callImmediatesPass blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:28344
    callImmediates blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:28563
    __callImmediates blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:2187
    flushedQueue blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:1993
    __guard blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:2170
    flushedQueue blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:1992
    callFunctionReturnFlushedQueue blob:http://0.0.0.0:8081/1ecc7cf2-1d07-49bf-8c60-a108c4509a30:1961
    onmessage http://0.0.0.0:8081/debugger-ui/debuggerWorker.js:80

pertschuk commented 4 years ago

could you post the body of the nboost response?

0mars commented 4 years ago

@pertschuk unfortunately I don't have that information. I suppose _msearch is not really supported by nboost, since the logged config has "capture_path": "/.*/_search" and not _msearch.

Is there a debug/verbose mode for nboost that shows this information on the backend? The frontend is a mobile app and ReactiveSearch is not really friendly to debug, so digging deeper from there is a bit tedious. I think it may be doable, it's just hectic.
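The suspected mismatch can be checked in isolation. The sketch below assumes nboost treats capture_path as an ordinary regular expression that must match the whole request path; that is an assumption, since the matching code is not shown in this thread, and is_captured is a hypothetical helper.

```python
import re

# Value of "capture_path" from the logged nboost config above.
CAPTURE_PATH = r"/.*/_search"


def is_captured(path: str) -> bool:
    """Return True if the request path matches the capture pattern.

    Assumption: the pattern has to match the whole path. nboost's real
    matching logic may differ.
    """
    return re.fullmatch(CAPTURE_PATH, path) is not None


print(is_captured("/myidx/_search"))   # path of a plain search request
print(is_captured("/myidx/_msearch"))  # path requested by ReactiveSearch
```

Under that assumption the first call prints True and the second prints False, which would explain why the _msearch request is not handled as a search.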

colethienes commented 4 years ago

You can use --verbose=True to output debug logging. Additionally, using debug=True adds the runtime configurations and parsed query/values to the nboost key of the json response.

jetnet commented 4 years ago

+1 for _msearch support :) The request body looks like this:

{"preference":"search"}
{"query":{"bool":{"must":[{"bool":{"must":{"bool":{"should":[{"multi_match":{"query":"ai","fields":["content^1","author^5","description^1","title^10"],"type":"cross_fields","operator":"and"}},{"multi_match":{"query":"ai","fields":["content^1","author^5","description^1","title^10"],"type":"phrase_prefix","operator":"and"}}],"minimum_should_match":"1"}}}}]}},"highlight":{"pre_tags":["<mark>"],"post_tags":["</mark>"],"fields":{"content":{},"author":{},"description":{},"title":{}},"require_field_match":false},"size":20}

As the body is not a single valid JSON document (it is newline-delimited JSON), I guess it cannot be parsed by nboost. I tried the following path for query extraction:

--query_path "body.query.bool.must[0].bool.must.bool.should[0].multi_match.query"

but it didn't work:

D:PtBertModel:[pro:loo:348]:Request (192.168.99.1:62353): search.
W:PtBertModel:[pro:loo:400]:Request (192.168.99.1:62353): missing query

FYI: I'm using koursaros/nboost 0.1.1-pt, as the latest one does not have all required modules and cannot start.
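To make the parsing problem concrete, here is a minimal Python sketch that reads an _msearch body line by line and then walks the path tried above. The body is a shortened copy of the one posted, and parse_msearch and get_path are hypothetical helpers, not nboost functions.

```python
import json

# Shortened version of the _msearch body above: a header line, then a query line.
MSEARCH_BODY = (
    '{"preference":"search"}\n'
    '{"query":{"bool":{"must":[{"bool":{"must":{"bool":{"should":['
    '{"multi_match":{"query":"ai","fields":["title^10"]}}]}}}}]}},"size":20}\n'
)


def parse_msearch(body):
    """Split a newline-delimited JSON body into (header, search_body) pairs."""
    docs = [json.loads(line) for line in body.splitlines() if line.strip()]
    return list(zip(docs[0::2], docs[1::2]))


def get_path(doc, path):
    """Follow dict keys / list indexes in order; return None if a step is missing."""
    for step in path:
        try:
            doc = doc[step]
        except (KeyError, IndexError, TypeError):
            return None
    return doc


# Parsing the whole body as one JSON document fails ...
try:
    json.loads(MSEARCH_BODY)
    parsed_as_single_document = True
except json.JSONDecodeError:
    parsed_as_single_document = False

# ... but each line parses on its own.
header, search = parse_msearch(MSEARCH_BODY)[0]
QUERY_PATH = ["query", "bool", "must", 0, "bool", "must", "bool",
              "should", 0, "multi_match", "query"]
query = get_path(search, QUERY_PATH)
print(parsed_as_single_document, header, query)
```

So the path itself does lead to the query string once the second line is parsed separately; the "missing query" warning would be consistent with the body being read as one document, though the thread does not confirm that this is what nboost does.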

0mars commented 4 years ago

Any news about this? Preferably support for _msearch.

ghost commented 3 years ago

Hi @0mars ,

Hope you are well!

I am also interested in an integration with ReactiveSearch, as I created a project related to this stack (https://github.com/twintproject/twint-search).

Did you manage to get them to work together? If yes, do you have some snippets or a piece of code to share about the integration of nboost/ReactiveSearch?

Cheers, X

0mars commented 3 years ago

@x0rzkov no, I couldn't. The problem is that ReactiveSearch uses _msearch, whereas nboost uses _search.

We can try modifying ReactiveSearch to use _search; then it should work.

So let me know if you want to pair on this. I'm not good at JavaScript, but we can organize a Zoom call to explore the code and get it working together if you're interested. Two brains are better than one 🧠

ghost commented 3 years ago

Sure, do you have a Twitter account so we can DM? Mine is https://twitter.com/x0rzkov.

For now, before integrating ReactiveSearch, I need to fix this issue first: https://github.com/koursaros-ai/nboost/issues/83. But yes, a call would be a great idea.

Did you post an issue/question on the ReactiveSearch repository about it?

0mars commented 3 years ago

"Did you post an issue/question on reactivesearch repository about it?" No.

Twitter: omars_music

jhofeditz commented 3 years ago

Someone asked this question on the ReactiveSearch GitHub: https://github.com/appbaseio/reactivesearch/issues/471, and the response was to use the transformRequest property of the ReactiveBase component. I was able to get ReactiveSearch to use the _search endpoint; however, nboost appears to support only the basic GET ?q= query string search, and ReactiveSearch uses only the POST NDJSON query format.

I wonder if nboost could use the full Elasticsearch DSL on POST in addition to the GET simple query format?

transformRequest={(props) => ({
  ...props,
  headers: { ...props.headers, "Content-Type": "application/json" },
  // Drop the leading {"preference": ...} header line of the _msearch body.
  body: props.body.replace(/{"preference":".*?"}\n/, ""),
})}
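
The same rewrite could in principle be done outside the browser. Here is a rough Python sketch, assuming the _msearch request carries exactly one header/query pair, as in the body posted earlier. msearch_to_search is a hypothetical helper; nothing here is part of nboost or ReactiveSearch.

```python
import json


def msearch_to_search(path, body):
    """Rewrite a single-search _msearch request as a _search request.

    Assumes exactly one header line followed by one query line.
    Returns the new path and the new JSON body.
    """
    lines = [line for line in body.splitlines() if line.strip()]
    if len(lines) != 2:
        raise ValueError("expected one header line and one query line")

    json.loads(lines[0])  # header, e.g. {"preference": "search"}: validated, then dropped
    search_body = json.loads(lines[1])

    suffix = "/_msearch"
    if not path.endswith(suffix):
        raise ValueError("not an _msearch path")
    new_path = path[: -len(suffix)] + "/_search"
    return new_path, json.dumps(search_body)


new_path, new_body = msearch_to_search(
    "/myidx/_msearch",
    '{"preference":"search"}\n{"query":{"match":{"title":"ai"}},"size":20}\n',
)
print(new_path)
print(new_body)
```

Note that an _msearch response wraps its results in a "responses" array, so a client that expects that shape would also need the _search response wrapped the same way.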

0mars commented 3 years ago

@pertschuk @colethienes please advise!

Thanks