Restream / reindexer

Embeddable, in-memory, document-oriented database with a high-level Query builder interface.
https://reindexer.io
Apache License 2.0
763 stars 64 forks source link

Performance drop on excessive number of IN() set conditions. #62

Open oruchreis opened 4 years ago

oruchreis commented 4 years ago

Hi, I'm using Reindexer a few months without any performance issues. And I'm quite happy with query performance especially for queries that have IN(set) conditions. But lately, some namespaces growed more in these months, and I'm facing some performance issues on specific queries on big namespaces which have items about 1 million. I've figured out if the count of values in the set condition is smaller from a limit it executes index method, if it is bigger then this limit, it executes scan method which is very slow. And I think this limit is changing by items count in the namespace. I've uploaded dump file of the namespace to reproduce this issue. For this specific namespace, If the set condition has 679 values, it executes index method, but if the count of values is 680 then it executes scan method which results slower query. Also I've figured out if it executes scan method, the current activities page in the face displays as select_loop which takes about 30-50sn. And if the count of this scan queries increased over time, then cpu usage are also increasing exponentially. I've tested on v2.6.2, 2.9.1, 2.11.0, 2.11.1 and 3.0.0 on both linux and windows. Here is the dump of namespace to reproduce the issue: TestNs.zip

And the query which executes scan method instead of index method. It executed scan in v3.0.0 and prior versions but if it doesn't execute scan method you can increase amount of values in the IN condition.

EXPLAIN SELECT * FROM TestNs
WHERE  (QuotaDate >= 637352928000000000 AND QuotaDate <= 637356384000000000) AND  
 HotelRoomId IN (
'9580','9581','9582','15461','9165','8107','15465','9594','9595','9596','9597','15466','9602','15467','9603','15468','15469','15470','15471','9604','15472','9605','15473','9606','15474','15475','15476','9607','15477','15478','9166',
'9187','9612','9613','15487','9617','9618','9794','9619','8522','9620','15488','15489','15490','15491','15492','15493','9824','9825','9828','9636','9630','9631','9829','9632','9639','8061','8053','8054','15510','15513','15514',
'9661','9662','15520','9541','15521','9877','8059','8060','9667','15523','15524','15525','15527','15528','15529','9671','16579','16580','16581','15530','16582','16583','15531','9668','16584','15526','9672','9669','14841','9673',
'9670','9674','9845','9675','15538','9676','9677','15539','13766','9678','13767','19438','15540','13768','19439','9679','9904','8290','8443','9683','9684','9685','9686','9687','15544','9688','15543','9689','23143','15545','15546',
'23141','9694','15551','9451','9695','15552','9696','15553','15554','15555','8128','8129','14936','8130','8114','8115','8116','8120','15556','15557','9702','9701','15559','9703','8602','9704','15560','9705','9706','15561','9707',
'15563','15564','15565','9714','8395','8021','8022','8023','9708','8029','8031','9709','9710','13289','13290','19483','19484','15576','9719','9720','9718','15575','9721','9722','14601','11329','11336','11332','9723','15582',
'9724','9725','9156','9726','15588','9729','15589','9730','9731','15591','15592','8709','9751','8306','15612','15613','15614','11350','10198','15615','9753','9754','9756','8452','8435','15621','9757','15623','23247','23248',
'15622','15624','15626','15627','15625','9764','9765','9766','9767','9768','9769','13366','9770','9771','9772','9773','19437','9869','9776','9777','9779','15639','9780','15640','8953','9781','9782','9783','9784','8954',
'8952','8669','9785','8667','8666','9786','8670','9787','8673','9788','9789','8674','8672','9790','9791','8671','9792','8987','19577','9793','9971','19578','15643','9795','9796','9797','9511','9512','15656','17807','16923','13259',
'19600','20408','19601','9805','19602','9806','23320','23321','23322','23323','23324','23325','23326','23328','23327','9406','23329','8311','8466','9408','15678','15679','15680','15683','15681','15682','15686','15687','9817',
'15688','15689','15690','15691','9818','15692','15693','15694','9819','15695','9820','15696','9821','8625','9822','9823','15697','14213','9826','9827','8511','15705','15704','15703','8185','8186','8187','8190','9844','10188',
'10189','10190','10191','9851','9852','9853','9854','9855','9856','9859','9444','9863','15736','9866','9178','9180','15741','9870','9871','15743','15744','15745','15746','9874','15747','15748','9875','9410','8854','8088','9883',
'9884','9885','9858','19725','9888','9872','9399','15760','9895','15761','23474','9893','15762','15763','9894','9892','23479','9905','9906','9907','9908','9903','9909','9910','9911','9912','15778','15781','15782','15783','15784',
'9922','9923','10151','9936','9949','9950','9967','9968','9969','19784','15804','15805','13635','9301','15811','15812','15813','19809','15816','15815','15817','15818','15819','15822','9981','9982','15823','10008','15824','15825',
'15826','10012','10013','19850','19851','19852','10019','19846','10020','10021','19855','15829','19853','15830','19849','15831','19854','19838','13633','19839','13634','19840','19841','19842','19843','19830','19831','19832','19833',
'14789','19856','15834','19834','19835','11346','19836','11347','11348','19837','8717','8723','10046','10047','19864','10057','23622','23621','10059','23620','10060','10061','10076','10077','10078','10079','10080','10081','10082',
'10083','15879','23641','10128','15880','15881','15882','15883','9266','15884','15885','15886','15887','8358','8143','19921','19923','19924','8715','8716','10135','15902','10136','8904','8905','8906','8999','8937','9000','8938',
'10139','9001','8939','8945','8837','14800','8838','10140','8839','8872','14799','8873','10142','10143','10144','10145','9267','15915','15916','15917','14938','14939','8407','8408','19946','8062','19947','19948','9556','10177',
'8413','15934','10179','15935','10181','10182','10183','10184','10185','10186','15937','15940','10187','15941','19965','10192','10193','10194','19966','10195','19967','9433','9464','12928','8513','10205','10206','10137','10138',
'10141','10209','10210','10211','15961','10212','10152','8855','10217','10218','17184','20028','16450','8578','15985','15986','20056','20057','8733','20058','20059','8543','15991','20062','15992','8230','15993','15995','12558',
'20069','14766','14768','15996','15997','14779','15998','15999','20077','16000','20078','16001','20080','16002','20081','16008','9077','9078','9079','16011','16012','16015','16014','16013','23871','20130','7950','7951','7952',
'8044','8045','8049','12528','16047','16048','16049','16050','16051','14053','9445','11213','16060','11214','16063','16064','16065','11217','11218','16066','16067','16068','16069','16070','16071','16072','16073','16076','16077',
'16078','23959','23960','9513','20187','20188','9924','16088','16087','16490','20197','16089','20198','12691','16090','16091','16092','11276','11278','11287','16093','11288','11289','11280','15318','11286','23979','9226','11246',
'11247','16101','16102','16103','20207','16104','13558','15462','13559','15463','13560','11245','16207','20208','11248','20209','11249','11250','11251','11252','16113','9442','11253','11254','16112','8591','11255','16111','16110',
'16109','9238','17212','11265','11266','16118','16119','11267','11268','16120','8786','8785','8787','8789','16132','8788','16133','16135','16134','15271','11339','8803','9539','9540','14797','12502','12503','11291','11292','11293',
'11294','8159','11295','11296','24026','11264','11297','11298','11300','11274','11301','11275','11365','11302','16565','11303','16567','16492','20261','24044','20263','11343','16152','11344','20264','11277','20262','16153','16154',
'11279','16860','16861','8510','11281','11307','11282','11283','11284','16166','11285','16167','16168','16169','24058','20286','24059','16165','11290','24060','17103','17104','24065','8012','11313','11314','8658','17712',
'16192','11315','16196','16197','16200','20310','16201','11316','20311','16202','20312','11317','11327','11318','16203','12570','12569','24086','11320','11322','11323','11330','11331','16215','11333','11334','16216','16217',
'11337','11338','16223','11340','11342','16225','16226','16227','11345','16228','20353','16229','7960','16230','11349','24131','20363','20364','20360','16237','20361','20362','20365','20366','20367','20368','8648','11366',
'16249','11367','16250','11362','11363','16251','16253','11373','23664','23665','16286','11400','11401','11402','8668','16308','16307','16309','12621','11364','20410','16316','20411','16318','11414','16317','11415','11416',
'24207','11417','11418','11420','11421','24209','11422','9157','9218','9219','9220','20425','11425','20426','15029','11426','20427','24217','16331','11427','11428','20432','20431','20430','20429','20428','8812','8811',
'24232','11432','11434','9059','9060','16349','16350','16351','16352','16353','16354','16355','16356','16357','20459','20460','20461','20462','8308','20463','16365','20465','20464','16366','14609','14610','16487','14518',
'20467','16377','14649','16388','16389','20502','20503','20504','20505','15270','8529','9283','24363','16886','20433','20434','17784','17782','17781','14210','14211','14212','24390','16430','8579','16556','16431',
'16432','20536','16433','20537','20538','16434','20539','16435','9534','12930','16444','9396','12463','16446','12464','12465','12466','20563','16448','12467','9371','16449','20561','9976','9977','16451','16452','8546',
'8547','8548','8549','8551','8552','8553','20564','8307','20570','20571','24452','16466','16467','24453','13619','16476','16475','12712','12710','16484','12498','12499','16485','16486','16488','16489','16491','17175','16493',
   '8219','12504','18000','24498','12505','12506','12507','12508','13695','24506','24507','24508','16506','16507','16508','9270','8675','8676','8677','8437','9386','9387','9878','9879','14617','16510','12529','16511',
'12530','8747','16512','12531','16513','20631','16515','16516','20632','16518','9458','9310','20633','16522','20079','12546','12547','12548','12549','16542','16691','12555','12556','12557','12561','12562','12563','12564',
'12565','8467','8468','12566','16551','16552','16553','12567','16554','12568','24562','16555','16557','16558','13398','16563','16564','16566','16568','12595','12596','12598','12599')
oruchreis commented 4 years ago

I've debugged for this query, and I've ended up again indexunordered.cc. As you mentioned here that increasing kMaxSelectivityPercentForIdset slows down some queries. So I didn't change this constant, instead I've added a threshold constant for the condition of choosing index or scan method. So it prevents scan method if the item count greater than this threshold.

https://github.com/oruchreis/reindexer/blob/c363c40d5f785ca8637ba5a4094a8cdfa130a0d8/cpp_src/core/index/indexunordered.cc#L195

I couldn't benchmarked in depth though, I don't know if it slowed down another query, but as far as I saw this workaround solved my performance issues on wide in() conditions for now. For this reason I didn't created a PR for this, because I don't know if it is a solution or not: https://github.com/Restream/reindexer/compare/master...oruchreis:set_cond_selectivity

oruchreis commented 3 years ago

Hi, have you able to look through this issue? We have splitted our queries because of this issue, big queries and small ones. if we send big queries to reindexer, it will become unresponsive soon. Thanks.

slowcheetahzzz commented 3 years ago

Hi, have you able to look through this issue? We have splitted our queries because of this issue, big queries and small ones. if we send big queries to reindexer, it will become unresponsive soon. Thanks.

Hello, @oruchreis. We've been quite busy developing other issues and didn't have any chance to take a look at your problem - we are quite sorry about that. I've just dubbed this issue in our local repository (we use gitlab) and we're going to fix it in the nearest future. Sorry for all the inconveniences and thank you for your patience.

olegator77 commented 3 years ago

Hi @oruchreis. We have investigated issue.

  1. The reason of original optimization for indexunordered, which have been introduced in 2.9.2, is following: when condition with many arguments of IN operator is secondary (e.g. another query conditions are high selectivity, and already drops results count to few tens or thousands), then use comparator for left tens-thousands of items is more efficient, than request to index, got many results and intersect them. Unfortunately request selectivity for choose method is not correct in all cases, e.g. like your case. To fix this there are serious changes it query optimizer are required - need to add second pass of optimization (current 1 pass implementation can't predict, selectivity of another query conditions). In next version we will add another factor for enable optimization - number of another conditions in query.

  2. The actual reason of performance problem with your query is bug in comparator - the IN for strings has O(N*M) complexity (M - number of arguments of IN). After fix it will be O(N) amortized. For your case without any another changes this fix will reduce time from 1500ms to 8ms