Closed amaleelhamri closed 3 years ago
The ordering of results is determined by the ranking profile used. A 3-term OR query will rank differently from 3 single-term queries with the default ranking profile. A OR B OR C retrieves documents where at least one of A, B, or C is present. equiv is an operator for query synonym expansion, where the rank is determined by the best-ranking individual term.
The problem is not that they don't appear in the same order, but that the results are genuinely different.
1. Query 1 with no "OR" :
{
"yql" : "SELECT text_raw, language, categories FROM sources post WHERE language CONTAINS 'en' AND categories CONTAINS 'hairmanagement' AND text_raw CONTAINS 'Vely Vely';"
}
As you can see in the response below, only one document matched this query, and it does not even contain the string 'Vely Vely' from my query:
{
"root": {
"id": "toplevel",
"relevance": 1.0,
"fields": {
"totalCount": 1
},
"coverage": {
"coverage": 100,
"documents": 114292026,
"full": true,
"nodes": 3,
"results": 1,
"resultsFull": 1
},
"children": [
{
"id": "index:post/2/e735c16d7a6a06e569c0e2e4",
"relevance": 0.22097962996725581,
"source": "post",
"fields": {
"text_raw": "Age of Empires II: Definitive Edition #2 CHO TÔI MỘT VÉ VỀ TUỔI THƠ :)) Đây là game Age of Empires II: Definitive Edition được live trực tiếp trên kênh Deralam của tôi. Rất mong nhận được sự ủng hộ của anh em. ►FOLLOW FANPAGE https://www.facebook.com/deralamgamer/ ----------------------------------------------------------------------------------- ►LỊCH STREAM GAME MỖI NGÀY 8:00PM - 11:00PM ----------------------------------------------------------------------------------------- ►DONATE ĐỂ Deralam MUA NHIỀU GAME HƠN TẠI ACB BANK : 138412989 - CN Phú Thọ Paypal: buithaiduongyt@gmail.com https://playerduo.com/deralam https://streamlabs.com/deralamvietsub Hoặc tặng skin, item, game https://steamcommunity.com/id/Deralam ------------------------------------------------------------------------------------------- ►THAM GIA HỘI HOA HỒNG LỬA https://www.facebook.com/groups/1224594424324001/?ref=bookmarks ------------------------------------------------------------------------------------------------ ►Cấu hình Stream Console: PS4 Pro Main: Gigabyte Z370 Aorus Ultra Gaming CPU: Intel® Core™ i7-8700K GPU: GeForce® GTX 1070 GameRock Premium Edition PSU: Seasonic M12II 850 EVO Memory: 16GB Ram Team T-Force Delta RGB DDR4 SSD : 2 x Samsung 850 EVO 250GB, SSD Crucial P1 1TB HDD: 2 x Western Digital 1TB Black Cooler: Thermalright True Spirit 140 Direct Case: Cougar Panzer Dell UltraSharp 25\" U2515H / Dell P2214H Mouse: Razer DeathAdder Elite Pad: Razer Goliathus Medium Control Gravity Edition Keyboard: Filco Majestouch 2 / Razer BlackWidow Ultimate Stealth 2016 Headphone: SteelSeries Arctis 3 Black / Earphone: Razer Hammerhead Pro v2 Micro: Blue Yeti Controller: Xbox 360 Gaming Chair: DXRACER cùng chơi, trực tiếp game, truc tiep game, live stream, best game, tuyệt phẩm, game độc quyền, game PS4, game PC, game bom tấn, game hay nhất, game kinh dị, game sinh tồn, game coop, hài hước, giải trí, thư giãn, walkthrough, let's play, deralam",
"language": "en",
"categories": [
"hairmanagement"
]
}
}
]
}
}
2. Query 2 with no "OR"
{
"yql" : "SELECT text_raw, language, categories FROM sources post WHERE language CONTAINS 'en' AND categories CONTAINS 'hairmanagement' AND text_raw CONTAINS '블리블리';"
}
As you can see in the response below, this query does not have any matches:
{
"root": {
"id": "toplevel",
"relevance": 1.0,
"fields": {
"totalCount": 0
},
"coverage": {
"coverage": 100,
"documents": 114292026,
"full": true,
"nodes": 3,
"results": 1,
"resultsFull": 1
}
}
}
3. Query with "OR"
But when I run the same query with multiple OR statements :
{
"yql" : "SELECT text_raw, language, categories FROM sources post WHERE language CONTAINS 'en' AND categories CONTAINS 'hairmanagement' AND (text_raw CONTAINS '블리블리' OR text_raw CONTAINS 'Vely Vely');"
}
As you can see in the response below, this result cannot be the union of the first two queries, because it has 78645 matches, which is not the sum of the first two results.
{
"root": {
"id": "toplevel",
"relevance": 1.0,
"fields": {
"totalCount": 78645
},
"coverage": {
"coverage": 100,
"documents": 114292026,
"full": true,
"nodes": 3,
"results": 1,
"resultsFull": 1
},
"children": [
{
"id": "index:post/1/f17ab67754005b147734113b",
"relevance": 0.2258859525086265,
"source": "post",
"fields": {
"text_raw": "💥💥💥 HÀNG VỀ HÀNG VỀ 💥💥💥 💯💯💯 Các hãng cao cấp từ xứ sở Kim Chi : Sulwhasoo, Whoo, Sum:37, OHUI kịp về thêm nhiều sản phẩm phục vụ như cầu làm đẹp của chị em trước Tết đây ạ ✨✨✨ 🌟 Bản limited mới nhất của Sulwhasoo dòng Perfecting Cushion Ex các tone 15, 21, 23 🌟 Sulwhasoo Perfecting cushion intense bản Limited cực sang 🌟 Sulwhasoo Sheer Lasting Gel Cushion về đủ tone 17, 21, 23 🌟 Phấn phủ cao cấp Whoo Powder pact và Whoo Whitening Powder Pact bổ sung dưỡng trắng da 🌟 Cushion OHUI Ultimate Cover Cushion Moisture Special Set Đen cho da thường đến khô 🌟 Set dưỡng Ohui Prime Advancer Ampoule Capture Cream Special Set 🌟 Set Ohui Prime Advancer Ampoule Serum Set 8 sản phẩm với 2 chai serum cực hời 🌟 Set mini Ohui Prime Advancer cho bạn nào muốn dùng thử ạ 🌟 Dòng dưỡng chống lão hóa cao cấp Ohui Age Recovery về set to 🌟 Set nước thần dạng xịt vô cùng tiện lợi Secret Essence Mist Special Set 🌟 Set nước thần Secret Essence tặng kèm các sản phẩm mini 🌟 Whoo Hwa hyun Radiant special gift set dưỡng sáng và chống lão hoá cao cấp 🌟 Set 3 sản phẩm Mặt nạ đất sét , Mặt nạ ngủ, Kem massage của Sulwhasoo bên mình có tách set bán lẻ nha 🌟 Overnight revitalizing mask size lớn 120ml 🌟 Mặt nạ 3 bước chống lão hoá, tái tạo da Whoo Royal Anti Aging 3 Step Set Mask ✔ HÀNG CHUẨN AUTH ✔ GIÁ CHUẨN RẺ 🍉 🍉SHIP COD TOÀN QUỐC 🍉 🍉 ———————————- 💒 Cơ sở 1: 95B Lý Nam Đế - 02 466822225 💒 Cơ sở 2: 297 Phố huế - 02 466802468 ☎️ Hotline + ship hàng : 0778211211 ———————————- ❌❌ Các bạn vào : ❤️ http://hanhstore.com/ ❤️ Để biết thêm thông tin chi tiết của Sản phẩm và đặt hàng nha 😍😍 ❌❌",
"language": "en",
"categories": [
"skincare",
"hairmanagement"
]
}
},
{
"id": "index:post/1/19e4e1c21dbbe2f2ec35aee5",
"relevance": 0.21692335219599163,
"source": "post",
"fields": {
"text_raw": "I've been doing CGM for 3 months now after blow drying/straightening my hair for 15 years. My hair is chest length, mostly 2b with some 2a/2c, coarse, and dense. I think it is low porosity because it \"squeaks\" when I run my fingers up a strand and takes about 6 or 7 hours to air dry if I don't diffuse, and I feel like products \"sit\" on my hair sometimes. I finally figured out a routine that makes my hair look great. I wash and condition with the Shea Moisture Mafura Oil and Manuka Honey shampoo and conditioner. I rinse out all conditioner (squish to condish didn't work well), and while hair is soaking wet, I put in 4 pumps of Uncle Funky's Curly Magic, comb it in evenly with a wide tooth comb (I've found I've had to comb in product to ensure even distribution - praying hands and scrunching in doesn't work). I then add 6 pumps of Renpure Viva Curl Coconut Lite Defining Gel, comb it in again with wide tooth comb. Then I scrunch out water, scrunch again with a microfiber towel to help dry, plop in a different microfiber towel for 30 minutes, let air dry while I do makeup, etc., and then I diffuse on low until almost dry, and then scrunch out the cast once it completely dries. I was lucky enough to have an old bottle of the Renpure Lite Defining Gel lying around, because I recently discovered it was discontinued 2 years ago, and I don't have much left. I tried Renpure's current gel product, their Coconut Creme Curling Jelly, and it just isn't the same. I feel like it has less hold and doesn't create as good of a cast, as I have much less definition, more frizz, and the waves at my roots completely fall out when I use it. I've seen a lot of wavies recommend Jessicurl Spiralicious as a strong hold gel, along with the Rockin' Ringlets curl enhancer. Rockin' Ringlets was probably the worst thing I ever did for my hair, left it a frizzy, undefined mess. 
I tried the Spiralicious gel both with and without the Rockin' Ringlets, and also with the Uncle Funky's Curly Magic, and every time it was just not good. It barely had any hold, weak cast, my waves were very undefined, and I felt like the product \"sat\" on my hair and made my roots flat, giving me the dreaded \"triangle\" shape. I've tried using more gel, but I feel like my hair doesn't absorb it and if I use too much it becomes impossible to scrunch out. I've tried searching for other recommended alternatives to the Renpure Lite Defining Gel and haven't found much information. Does anyone have recommendations of a gel that may work for my hair? Maybe any styling techniques that could improve my definition and frizz?",
"language": "en",
"categories": [
"hairmanagement"
]
}
}
Any guidance to resolve this pattern would be appreciated!
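The expectation behind this report can be written down as a small check. This is only a sketch: the function name consistent_with_union is hypothetical, and the counts are the totalCount values from the three responses above. For a plain OR, the match count should be at least the largest single-term count and at most the sum of the single-term counts.

```python
def consistent_with_union(single_counts, or_count):
    """Return True if or_count could be the size of the union of the
    single-term result sets (hypothetical helper, not part of Vespa)."""
    # A union is never smaller than its largest member set and never
    # larger than the sum of all member sets.
    return max(single_counts) <= or_count <= sum(single_counts)


# totalCount values taken from the responses above
single_counts = [1, 0]   # 'Vely Vely' alone, '블리블리' alone
or_count = 78645         # both terms combined with OR

print(consistent_with_union(single_counts, or_count))
```

With the numbers reported here the check fails, which suggests that the OR query is not being parsed the same way as the single-term queries.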
If you add &tracelevel=3, it's easier to debug how the query is parsed and executed against the content nodes. Two messages in the trace are of interest:
YQL+ representation ...
And search to dispatch ..
If you can, please include those, along with the field definition of text_raw and any set_language or stemming overrides (https://docs.vespa.ai/documentation/linguistics.html).
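Picking those two messages out of a long trace by hand is tedious, so here is a rough sketch of a filter. It assumes only that the trace in the JSON response is a nesting of objects and lists in which some objects carry a "message" string. The helper name find_trace_messages and the sample structure are made up for illustration, and the sample messages are abbreviated.

```python
MARKERS = ("YQL+", "search to dispatch")


def find_trace_messages(node, markers=MARKERS):
    """Recursively collect every "message" string that contains one of
    the marker substrings (hypothetical helper, not part of Vespa)."""
    found = []
    if isinstance(node, dict):
        message = node.get("message")
        if isinstance(message, str) and any(m in message for m in markers):
            found.append(message)
        # Descend into nested objects and lists; plain strings are ignored.
        for value in node.values():
            found.extend(find_trace_messages(value, markers))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_trace_messages(item, markers))
    return found


# Made-up, abbreviated trace fragment; the real nesting may differ.
sample_trace = {
    "children": [
        {"message": "some unrelated trace message"},
        {"children": [
            {"message": "YQL+ query parsed: [select ...]"},
            {"message": "sc0.num0 search to dispatch: query=[...]"},
        ]},
    ]
}

for message in find_trace_messages(sample_trace):
    print(message)
```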
Field definition :
field text_raw type string {
indexing: index | summary
}
There are no set_language or stemming overrides.
Results with &tracelevel=3:
{
"message":"YQL+ query parsed: [select text_raw, language, categories from post where (language contains \"en\" AND categories contains \"hairmanagement\" AND (text_raw contains \"\블\리\블\리\" OR text_raw contains ([{\"origin\": {\"original\": \"Vely Vely\", \"offset\": 0, \"length\": 9}}]phrase(\"Vely\", \"Vely\")))) timeout 200000;]"
}
{
"message": "sc0.num0 search to dispatch: query=[AND language:en categories:hairmanagement (OR text_raw:블리블리 (SAND text_raw:ve text_raw:ve))] timeout=199997ms offset=0 hits=10 groupingSessionCache=true sessionId=container.container.0.1606856574836.996424.default grouping=0 : restrict=[post]"
}
Does this : (SAND text_raw:ve text_raw:ve)
mean that it matches all texts containing "ve" ? I'm having trouble understanding it.
Yes, it's been stemmed to ve, so that is what is matched against the index, and that is why the document in your first example is retrieved. Further up in the trace you will see a message about the stemming and which language was guessed. I think the multi-language query throws off the language detection. Try passing &language=en to set the language explicitly.
Adding language=en worked! So, from my understanding, this happened because it was a multi-language query and the non-English word was not stemmed correctly? But when it was passed as a monolingual query with the non-English word, it was stemmed correctly because the language was detected correctly, and that's why the results were not consistent?
Yes, I think that is accurate. You can disable language guessing by passing language=en. Since you have not done anything on the document side (set_language), you should pass language=en at query time to avoid potentially asymmetric behaviour between query time and indexing time.
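For anyone landing here later, a minimal sketch of adding the language parameter to a GET request. The language=en parameter is the one discussed above; the endpoint http://localhost:8080/search/ and the helper name build_query_url are assumptions for illustration, so adjust them to your deployment.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: a local container serving the search API on this address.
SEARCH_ENDPOINT = "http://localhost:8080/search/"


def build_query_url(yql, language="en", endpoint=SEARCH_ENDPOINT):
    """Build a search URL that states the query language explicitly,
    so that language guessing is not used (hypothetical helper)."""
    params = {"yql": yql, "language": language}
    # urlencode percent-encodes the YQL string, including non-ASCII terms.
    return endpoint + "?" + urlencode(params)


yql = (
    "SELECT text_raw, language, categories FROM sources post "
    "WHERE language CONTAINS 'en' AND categories CONTAINS 'hairmanagement' "
    "AND (text_raw CONTAINS '블리블리' OR text_raw CONTAINS 'Vely Vely');"
)
url = build_query_url(yql)
print(url)
```

The same parameter can be appended to an existing request URL as &language=en, as suggested above.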
All clear. Thanks a lot for your help!
Hello vespa team,
I am struggling with a weird pattern in my Vespa query results. When I run a query with the following filter:
WHERE field CONTAINS 'word1' OR field CONTAINS 'word2' OR field CONTAINS 'word3'
it gives me results that don't match any of the separate queries
WHERE field CONTAINS 'word1',
WHERE field CONTAINS 'word2',
WHERE field CONTAINS 'word3'.
I also tried running this query with
WHERE field CONTAINS equiv('word1', 'word2', 'word3')
which led to the same pattern. When I look at the results with "OR" or "equiv", none of them actually contains any of those 3 words. Any hints on how to overcome this issue?
I am using Vespa version: 7.314.13