nck-2 / test-rep


PQ: match with an unavailable attribute in the doc #1368

Open githubmanticore opened 1 year ago

githubmanticore commented 1 year ago

The 1st match seems to be wrong:

mysql> call pq('pq_sina', '{"text": "abc"}', 1 as query);   
+------+-----------+------+---------------------------------+
| UID  | Query     | Tags | Filters                         |
+------+-----------+------+---------------------------------+
|    1 |           |      | NOW() - json.received_at > 3600 |
|    4 | @text abc |      |                                 |
+------+-----------+------+---------------------------------+
2 rows in set (0.00 sec)   

mysql> desc pq_sina table;   
+--------------------+--------+
| Field              | Type   |
+--------------------+--------+
| id                 | bigint |
| text               | field  |
| parent_text        | field  |
| user_screen_name   | field  |
| user_description   | field  |
| reply_comment_text | field  |
| json               | json   |
+--------------------+--------+
7 rows in set (0.00 sec)   

The config is:

index pq_sina   
{   
        type = percolate   
        path = idx_pq   
        rt_field = text   
        rt_field = parent_text   
        rt_field = user_screen_name   
        rt_field = user_description   
        rt_field = reply_comment_text   
        rt_attr_json = json   
        index_sp = 1   
}   

searchd   
{   
        listen = 9314:mysql41   
        log = sphinx_pq.log   
        pid_file = sphinx_pq.pid   
        binlog_path =   
}   

The document doesn't even contain json.received_at, so why does it match that rule?

githubmanticore commented 1 year ago

➤ Stan commented:

This is still relevant.

I added a similar query to test 125 and got this reply:

sphinxql-157> SELECT id, now()-j.lit as t, j from test_json; 
    id  t   j 
    1   1588750657  {"name":"Alice","uid":123,"lon":-0.079858,"lat":0.937717,"pct":12.400000,"sq":9,"poly":"1,2,3,4,5,6.0","points":[1,2,3,4,5,6.000000]} 
    2   1588750657  {"name":"Bob","uid":234,"gid":12,"lon":-0.079999,"lat":0.891975,"pct":-103.700000,"sq":16,"poly":"1,-2,1,2,-5,6","points":[1,-2,1,2,-5,6]} 
    3   1588750657  {"name":"Charlie","uid":345,"lon":-0.072146,"lat":0.926761,"pct":4.100000,"sq":225,"poly":"-1,2,12,4,5,6","points":[-1,2,12,4,5,6]} 
3 rows in set 

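It seems that a JSON attribute that is missing from the document evaluates to 0 inside an expression: j.lit is absent from all three rows, so now()-j.lit is just the current timestamp.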

We could try to fix that by transforming any filter expression that uses a JSON attribute into the form j.lit IS NOT NULL AND expr. For example, NOW() - json.received_at > 3600 would become (json.received_at IS NOT NULL) AND (NOW() - json.received_at > 3600).
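To illustrate the idea, here is a minimal Python sketch of both the suspected behaviour and the proposed rewrite. It is not Manticore code: `guard_filter`, `json_refs`, `matches_unguarded` and `matches_guarded` are hypothetical names, the regex is a stand-in for a real expression parser, and the "missing attribute reads as 0" rule is an assumption taken from the test output above.

```python
import re


def json_refs(expr, json_attrs):
    """Return distinct JSON field references (e.g. json.received_at) in order."""
    names = "|".join(re.escape(a) for a in json_attrs)
    pattern = re.compile(r"\b(?:%s)\.[A-Za-z_]\w*" % names)
    seen = []
    for ref in pattern.findall(expr):
        if ref not in seen:
            seen.append(ref)
    return seen


def guard_filter(expr, json_attrs):
    """Prefix expr with an IS NOT NULL guard for every JSON field it uses."""
    refs = json_refs(expr, json_attrs)
    if not refs:
        return expr  # no JSON attribute involved, leave the filter untouched
    guards = " AND ".join("(%s IS NOT NULL)" % r for r in refs)
    return "%s AND (%s)" % (guards, expr)


def matches_unguarded(doc_json, now):
    """Assumed current behaviour: a missing field silently evaluates to 0."""
    received_at = doc_json.get("received_at", 0)
    return now - received_at > 3600


def matches_guarded(doc_json, now):
    """Behaviour after the rewrite: a missing field fails the filter."""
    if "received_at" not in doc_json:
        return False
    return now - doc_json["received_at"] > 3600


if __name__ == "__main__":
    print(guard_filter("NOW() - json.received_at > 3600", ["json"]))
    print(matches_unguarded({}, 1588750657), matches_guarded({}, 1588750657))
```

Run as a script, this prints the rewritten filter `(json.received_at IS NOT NULL) AND (NOW() - json.received_at > 3600)`, followed by `True False` for a document with no received_at. One side effect is already visible in the sketch: putting the guards in front of the whole expression with AND changes the result of filters that combine a JSON comparison with OR, so a real fix would probably have to guard each sub-expression separately.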

We need to check the side effects and any performance degradation of such a fix.