Closed by eroux 2 months ago
So, I've been able to test it locally. There are issues with reusing the data sent to the standard API, though, and I'm not sure where to go from here @eroux @roopeux:
{"index":"bdrc_prod"}
{"from":0,"size":20,"aggs":{"associatedCentury":{"terms":{"field":"associatedCentury","size":20}},"associatedTradition":{"terms":{"field":"associatedTradition","size":20}},"author":{"terms":{"field":"author","size":20}},"etext_access":{"terms":{"field":"etext_access","size":20}},"etext_quality":{"range":{"field":"etext_quality","ranges":[{"from":0,"to":0.8},{"from":0.8,"to":0.95},{"from":0.95,"to":1.01},{"from":1.99,"to":2.01},{"from":2.99,"to":3.01},{"from":3.99,"to":4.01}]}},"inCollection":{"terms":{"field":"inCollection","size":20}},"language":{"terms":{"field":"language","size":20}},"personGender":{"terms":{"field":"personGender","size":20}},"printMethod":{"terms":{"field":"printMethod","size":20}},"scans_access":{"terms":{"field":"scans_access","size":20}},"script":{"terms":{"field":"script","size":20}},"translator":{"terms":{"field":"translator","size":20}},"type":{"terms":{"field":"type","size":20}},"workGenre":{"terms":{"field":"workGenre","size":20}},"workIsAbout":{"terms":{"field":"workIsAbout","size":20}}},"highlight":{"fields":{"prefLabel_bo_x_ewts":{},"altLabel_bo_x_ewts":{},"prefLabel_en":{},"altLabel_en":{},"seriesName_bo_x_ewts":{},"seriesName_en":{},"content_en":{},"comment_bo_x_ewts":{},"comment_en":{}}},"query":{"function_score":{"script_score":{"script":{"id":"bdrc-score"}},"query":{"bool":{"filter":[],"must":[{"multi_match":{"type":"phrase","query":"spyod 'jug","fields":["seriesName_bo_x_ewts^0.1","seriesName_en^0.1","authorshipStatement_bo_x_ewts^0.005","authorshipStatement_en^0.005","publisherName_bo_x_ewts^0.01","publisherLocation_bo_x_ewts^0.01","publisherName_en^0.01","publisherLocation_en^0.01","prefLabel_bo_x_ewts^1","prefLabel_en^1","comment_bo_x_ewts^0.0001","comment_en^0.0001","altLabel_bo_x_ewts^0.6","altLabel_en^0.6"]}}]}}}}}
{"index":"bdrc_prod"}
{"from":0,"size":20,"aggs":{"associatedCentury":{"terms":{"field":"associatedCentury","size":20}},"associatedTradition":{"terms":{"field":"associatedTradition","size":20}},"author":{"terms":{"field":"author","size":20}},"etext_access":{"terms":{"field":"etext_access","size":20}},"etext_quality":{"range":{"field":"etext_quality","ranges":[{"from":0,"to":0.8},{"from":0.8,"to":0.95},{"from":0.95,"to":1.01},{"from":1.99,"to":2.01},{"from":2.99,"to":3.01},{"from":3.99,"to":4.01}]}},"inCollection":{"terms":{"field":"inCollection","size":20}},"language":{"terms":{"field":"language","size":20}},"personGender":{"terms":{"field":"personGender","size":20}},"printMethod":{"terms":{"field":"printMethod","size":20}},"scans_access":{"terms":{"field":"scans_access","size":20}},"script":{"terms":{"field":"script","size":20}},"translator":{"terms":{"field":"translator","size":20}},"type":{"terms":{"field":"type","size":20}},"workGenre":{"terms":{"field":"workGenre","size":20}},"workIsAbout":{"terms":{"field":"workIsAbout","size":20}}},"highlight":{"fields":{"prefLabel_bo_x_ewts":{},"altLabel_bo_x_ewts":{},"prefLabel_en":{},"altLabel_en":{},"seriesName_bo_x_ewts":{},"seriesName_en":{},"content_en":{},"comment_bo_x_ewts":{},"comment_en":{}}},"query":{"function_score":{"script_score":{"script":{"id":"bdrc-score"}},"query":{"bool":{"filter":[{"bool":{"should":[{"term":{"type":"PartTypeText"}}]}}],"must":[{"multi_match":{"type":"phrase","query":"spyod 'jug","fields":["seriesName_bo_x_ewts^0.1","seriesName_en^0.1","authorshipStatement_bo_x_ewts^0.005","authorshipStatement_en^0.005","publisherName_bo_x_ewts^0.01","publisherLocation_bo_x_ewts^0.01","publisherName_en^0.01","publisherLocation_en^0.01","prefLabel_bo_x_ewts^1","prefLabel_en^1","comment_bo_x_ewts^0.0001","comment_en^0.0001","altLabel_bo_x_ewts^0.6","altLabel_en^0.6"]}}]}}}}}
{"index":"bdrc_prod"}
{"from":0,"size":20,"aggs":{"type":{"terms":{"field":"type","size":20}}},"highlight":{"fields":{"prefLabel_bo_x_ewts":{},"altLabel_bo_x_ewts":{},"prefLabel_en":{},"altLabel_en":{},"seriesName_bo_x_ewts":{},"seriesName_en":{},"content_en":{},"comment_bo_x_ewts":{},"comment_en":{}}},"query":{"function_score":{"script_score":{"script":{"id":"bdrc-score"}},"query":{"bool":{"filter":[],"must":[{"multi_match":{"type":"phrase","query":"spyod 'jug","fields":["seriesName_bo_x_ewts^0.1","seriesName_en^0.1","authorshipStatement_bo_x_ewts^0.005","authorshipStatement_en^0.005","publisherName_bo_x_ewts^0.01","publisherLocation_bo_x_ewts^0.01","publisherName_en^0.01","publisherLocation_en^0.01","prefLabel_bo_x_ewts^1","prefLabel_en^1","comment_bo_x_ewts^0.0001","comment_en^0.0001","altLabel_bo_x_ewts^0.6","altLabel_en^0.6"]}}]}}}}}
For etext_quality, which I fixed earlier today, we send a filter parameter within the query parameter:
"query": {
  "function_score": {
    "script_score": {
      "script": {
        "id": "bdrc-score"
      }
    },
    "query": {
      "bool": {
        "filter": [
          {
            "bool": {
              "should": [
                {
                  "range": {
                    "etext_quality": {
                      "gte": "3.99",
                      "lte": "4.01"
                    }
                  }
                }
              ]
            }
          }
        ],
        "must": [
          {
            "multi_match": {
              "type": "phrase",
              "query": "spyod 'jug",
              (...)
oh I see, thanks! What we need to do in that case is
"query": {
  "function_score": {
    "script_score": {
      "script": {
        "id": "bdrc-score"
      }
    },
    "query": {
      "bool": {
        "filter": [
          {
            "bool": {
              "should": [
                {
                  "range": {
                    "etext_quality": {
                      "gte": "3.99",
                      "lte": "4.01"
                    }
                  }
                }
              ]
            }
          }
        ],
        "bdrc-query": "spyod 'jug",
        (...)
I think... @roopeux can you adjust and make that work in the Python code? Generally speaking, I think the Python code should just look for "bdrc-query" as a key anywhere, not just at a specific JSON path.
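To make the "look for the key anywhere" idea concrete, here is a minimal sketch, not the actual code in the repository. The names replace_bdrc_query and example_clause are hypothetical, and the clause it builds is only a stand-in for whatever query the real API generates. It assumes the generated clause belongs in the "must" list of the dict that held "bdrc-query".

```python
from typing import Any, Callable

KEY = "bdrc-query"


def example_clause(text: str) -> dict:
    """Hypothetical stand-in for the clause the real API would generate."""
    return {
        "multi_match": {
            "type": "phrase",
            "query": text,
            "fields": ["prefLabel_bo_x_ewts^1", "prefLabel_en^1"],
        }
    }


def replace_bdrc_query(node: Any, build: Callable[[str], dict]) -> int:
    """Walk decoded JSON in place. Wherever a dict holds "bdrc-query",
    remove that key and append build(value) to the dict's "must" list.
    Returns the number of replacements made."""
    count = 0
    if isinstance(node, dict):
        text = node.pop(KEY, None)
        # Recurse first, so the clause we add below is never re-visited.
        for child in node.values():
            count += replace_bdrc_query(child, build)
        if text is not None:
            must = node.get("must", [])
            if not isinstance(must, list):
                must = [must]  # a single clause object becomes a one-item list
            node["must"] = must + [build(text)]
            count += 1
    elif isinstance(node, list):
        for child in node:
            count += replace_bdrc_query(child, build)
    return count
```

Because the walk does not depend on a fixed path, the same function handles a request whether or not "filter" clauses sit next to the key.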
@berger-n it should work now, see https://github.com/buda-base/autocomplete-prototype/commit/71205d450f702b10127137ccfd30b73190e5da0d
ok thanks! getting this error:
{"error":{"reason":"[1:1493] [bool] unknown field [dis_max]","root_cause":[{"reason":"[1:1493] [bool] unknown field [dis_max]","type":"x_content_parse_exception"}],"type":"x_content_parse_exception"},"status":400}
with the following query:
{
  "function_score": {
    "script_score": {
      "script": {
        "id": "bdrc-score"
      }
    },
    "query": {
      "bool": {
        "filter": [],
        "bdrc-query": "spyod 'jug"
      }
    }
  }
}
here's the full curl request if needed:
curl 'https://autocomplete.bdrc.io/search' \
-H 'Accept: */*' \
-H 'Accept-Language: fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7,zh-CN;q=0.6,zh;q=0.5' \
-H 'Cache-Control: no-cache' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Origin: http://localhost:3000' \
-H 'Pragma: no-cache' \
-H 'Referer: http://localhost:3000/' \
-H 'Sec-Fetch-Dest: empty' \
-H 'Sec-Fetch-Mode: cors' \
-H 'Sec-Fetch-Site: cross-site' \
-H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36' \
-H 'sec-ch-ua: "Not/A)Brand";v="8", "Chromium";v="126", "Google Chrome";v="126"' \
-H 'sec-ch-ua-mobile: ?0' \
-H 'sec-ch-ua-platform: "Linux"' \
--data-raw $'{"from":0,"size":20,"aggs":{"associatedCentury":{"terms":{"field":"associatedCentury","size":20}},"associatedTradition":{"terms":{"field":"associatedTradition","size":20}},"author":{"terms":{"field":"author","size":20}},"etext_access":{"terms":{"field":"etext_access","size":20}},"etext_quality":{"range":{"field":"etext_quality","ranges":[{"from":0,"to":0.8},{"from":0.8,"to":0.95},{"from":0.95,"to":1.01},{"from":1.99,"to":2.01},{"from":2.99,"to":3.01},{"from":3.99,"to":4.01}]}},"inCollection":{"terms":{"field":"inCollection","size":20}},"language":{"terms":{"field":"language","size":20}},"personGender":{"terms":{"field":"personGender","size":20}},"printMethod":{"terms":{"field":"printMethod","size":20}},"scans_access":{"terms":{"field":"scans_access","size":20}},"script":{"terms":{"field":"script","size":20}},"translator":{"terms":{"field":"translator","size":20}},"type":{"terms":{"field":"type","size":20}},"workGenre":{"terms":{"field":"workGenre","size":20}},"workIsAbout":{"terms":{"field":"workIsAbout","size":20}}},"highlight":{"fields":{"prefLabel_bo_x_ewts":{},"altLabel_bo_x_ewts":{},"prefLabel_en":{},"altLabel_en":{},"seriesName_bo_x_ewts":{},"seriesName_en":{},"content_en":{},"comment_bo_x_ewts":{},"comment_en":{}}},"query":{"function_score":{"script_score":{"script":{"id":"bdrc-score"}},"query":{"bool":{"filter":[],"bdrc-query":"spyod \'jug"}}}}}'
@eroux the bug is from your last push. I'll replace it with mine, which seems to work.
ok
@berger-n the change is deployed, can you give it a try?
@berger-n in the upcoming API, do not change anything, do not add 'bdrc-query', just send it raw to the API. The API will find the query.
I sent the API to @eroux because I could not push it.
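How the upcoming API locates the query is not described in this thread. As a rough illustration only, one way to pull the search string out of a raw Elasticsearch request is to look for the first "multi_match" clause, which is where the requests quoted above carry it. The function name find_search_string is hypothetical.

```python
from typing import Any, Optional


def find_search_string(node: Any) -> Optional[str]:
    """Return the "query" string of the first multi_match clause found
    anywhere in a decoded Elasticsearch request, or None if there is none.
    Assumption: the user's search string lives in a multi_match clause."""
    if isinstance(node, dict):
        clause = node.get("multi_match")
        if isinstance(clause, dict) and isinstance(clause.get("query"), str):
            return clause["query"]
        for child in node.values():
            found = find_search_string(child)
            if found is not None:
                return found
    elif isinstance(node, list):
        for child in node:
            found = find_search_string(child)
            if found is not None:
                return found
    return None
```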
Roope coded a small API in Python to improve some of the results. To call the API, send a regular Elasticsearch JSON request to
https://autosuggest.bdrc.io/search
(no ES credentials needed); it returns the usual Elasticsearch results JSON. The main trick is to put the search string in a bdrc-query key, as exemplified in the README.
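For reference, a call along those lines might look like the sketch below. It is untested against the live service. The function names build_search_request and search are hypothetical, and the request body is a trimmed version of the one quoted earlier in this thread (no aggs or highlight).

```python
import json
import urllib.request

API_URL = "https://autosuggest.bdrc.io/search"  # URL as given above


def build_search_request(search_string: str, size: int = 20) -> urllib.request.Request:
    """Build (but do not send) a POST request whose body is a regular
    Elasticsearch query with the search string under "bdrc-query"."""
    body = {
        "from": 0,
        "size": size,
        "query": {
            "function_score": {
                "script_score": {"script": {"id": "bdrc-score"}},
                "query": {"bool": {"filter": [], "bdrc-query": search_string}},
            }
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(search_string: str) -> dict:
    """Send the request and decode the Elasticsearch-style JSON response."""
    request = build_search_request(search_string)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling search("spyod 'jug") should then return the usual hits structure, assuming the service is reachable and accepts this body.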