manticoresoftware / manticoresearch

Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon
https://manticoresearch.com
GNU General Public License v3.0

Slow performance SNIPPETS on agent index #568

Open daikoz opened 3 years ago

daikoz commented 3 years ago

Hi,

To simplify the issue, I have 2 indexes:

....  

index WebPages  
{  
    type  = distributed  

    local = WebPages0  
    local = WebPages1  
    local = WebPages2  
    local = WebPages3  
}  

index WebPagesLB  
{  
     type             = distributed  
     agent_persistent = sphinxserver:9312:WebPages  
     ha_strategy      = nodeads  
}  

When I execute SNIPPET on WebPages, the result time is ~40ms:

`SELECT Id, SNIPPET(Body, QUERY()) FROM WebPages WHERE MATCH('modele');`

Now I execute SNIPPET on WebPagesLB and the result time is 1.2s!

`SELECT Id, SNIPPET(Body, QUERY()) FROM WebPagesLB WHERE MATCH('modele');`

If I remove the SNIPPET call, the result time is the same.

sphinxserver is localhost.

Why?

githubmanticore commented 3 years ago

➤ Sergey Nikolaev commented:

I can't reproduce it like this:

snikolaev@dev:~$ cat csv_dist.conf 
source src { 
    type = csvpipe 
    csvpipe_command = for n in `seq 1 100000`; do echo -n "$n,"; echo $n|md5sum|head -c 10; echo; done 
#    csvpipe_field = f 
    csvpipe_field_string = f 
} 

index idx1 { 
    type = plain 
    source = src 
    path = idx1 
    dict = keywords 
    access_plain_attrs = mlock 
    access_blob_attrs = mlock 
    access_doclists = mlock 
    access_hitlists = mlock 
    min_infix_len = 2 
#    stored_fields = f 
} 

index idx2:idx1 { 
    path = idx2 
} 

index idx3:idx1 { 
    path = idx3 
} 

index idx4:idx1 { 
    path = idx4 
} 

index dist { 
    type = distributed 
    local = idx1 
    local = idx2 
    local = idx3 
    local = idx4 
} 

index distp { 
    type = distributed 
    agent_persistent = localhost:9316:dist 
    ha_strategy      = nodeads 
} 

searchd { 
    listen = 127.0.0.1:9315:mysql41 
    listen = 127.0.0.1:9316 
    log = sphinx_min.log 
    pid_file = /home/snikolaev/9315.pid 
    binlog_path = 
    qcache_max_bytes = 0 
}

mysql> SELECT Id, SNIPPET(f,  QUERY()) FROM distp WHERE MATCH('*ab*') limit 0; show meta;
Empty set (0.01 sec) 

+---------------+-------+ 
| Variable_name | Value | 
+---------------+-------+ 
| total         | 1000  | 
| total_found   | 10700 | 
| time          | 0.010 | 
| keyword[0]    | *ab*  | 
| docs[0]       | 13700 | 
| hits[0]       | 13700 | 
+---------------+-------+ 
6 rows in set (0.00 sec) 

mysql> SELECT Id, SNIPPET(f,  QUERY()) FROM dist WHERE MATCH('*ab*') limit 0; show meta; 
Empty set (0.01 sec) 

+---------------+-------+ 
| Variable_name | Value | 
+---------------+-------+ 
| total         | 1000  | 
| total_found   | 10700 | 
| time          | 0.004 | 
| keyword[0]    | *ab*  | 
| docs[0]       | 13700 | 
| hits[0]       | 13700 | 
+---------------+-------+ 
6 rows in set (0.00 sec) 

Please provide a reproducible case. Feel free to upload your indexes and config to our FTP: https://mnt.cr/ftp

daikoz commented 3 years ago

I uploaded 2 files to the FTP:

You can reproduce the issue on Debian 10 and Manticore 3.6.0 96d61d8bf@210504 release.

For testing, you can modify /etc/hosts to redirect the SERVERX names to localhost:

agent_persistent = SERVER1:9312|SERVER2:9312|SERVER3:9312|SERVER4:9312:WebPages

sanikolaev commented 3 years ago

Thank you! I could reproduce the issue on our side. I could also reproduce:

mysql> SELECT Id, SNIPPET(Body, QUERY()) FROM WebPagesLB WHERE MATCH('modele');
ERROR 1064 (42000): index WebPagesLB: agent localhost:9312: agent has 32-bit docids; no longer supported

githubmanticore commented 1 year ago

➤ Aleksey N. Vinogradov commented:

That is because of the implicit limit for remotes. For local agents the default limit is 20; for remotes it is 1000. So when you query the balancer, it sends the request to a mirror with an internal max_matches=1000, then retrieves ALL the matches and returns 20 of them to you (or whatever limit is set). By default we are tuned for aggregations: if you want something like avg() over several different agents, or even count/count(distinct), we need many matches to be precise. But the same code path is used even for a single mirror, where such behavior looks excessive.
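
To make the difference concrete, here is a toy Python model of the two code paths described above. It is only a sketch and not Manticore code: every name in it is hypothetical, and it assumes that the agent builds a snippet for each row it sends back to the balancer. The 10700 matching documents are illustrative.

```python
# Toy model only: NOT Manticore code. All names below are hypothetical.
from dataclasses import dataclass

LOCAL_DEFAULT_LIMIT = 20     # default limit for a local query (from the comment above)
REMOTE_MAX_MATCHES = 1000    # implicit limit used for remotes (from the comment above)


@dataclass
class Stats:
    snippets_built: int = 0


def build_snippet(doc_id: int, stats: Stats) -> str:
    """Stand-in for SNIPPET(): counts how many times it is invoked."""
    stats.snippets_built += 1
    return f"snippet for doc {doc_id}"


def query_local(matching_ids, limit, stats):
    """Local path: build snippets only for the rows that are returned."""
    return [(d, build_snippet(d, stats)) for d in matching_ids[:limit]]


def query_via_balancer(matching_ids, limit, stats):
    """Balancer path: the agent answers with up to REMOTE_MAX_MATCHES rows
    (assumption: each with its snippet), then the balancer keeps `limit` rows."""
    remote_rows = query_local(matching_ids, REMOTE_MAX_MATCHES, stats)
    return remote_rows[:limit]


matching = list(range(1, 10701))  # illustrative set of matching document ids

local_stats, remote_stats = Stats(), Stats()
local_rows = query_local(matching, LOCAL_DEFAULT_LIMIT, local_stats)
remote_rows = query_via_balancer(matching, LOCAL_DEFAULT_LIMIT, remote_stats)

print(local_stats.snippets_built)   # snippets built on the local path
print(remote_stats.snippets_built)  # snippets built on the balancer path
print(local_rows == remote_rows)    # both paths return the same rows
```

Under these assumptions the client receives identical rows either way, but the balancer path does 50 times the snippet work (1000 snippets instead of 20), which is consistent with the slowdown reported in this issue.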