arangodb / arangodb

🥑 ArangoDB is a native multi-model database with flexible data models for documents, graphs, and key-values. Build high performance applications using a convenient SQL-like query language or JavaScript extensions.
https://www.arangodb.com
Other
13.49k stars 834 forks source link

[Arangodb3.4 on 4-node cluster]My graph query with hop[1..2] and multiple startvertex and LIMIT 1000 is very slow #9040

Closed pocketwalker closed 5 years ago

pocketwalker commented 5 years ago

Pls do let me know if any further information is needed to help you understand my case:-)

AQL query (if applicable): WITH mio_company, mio_people, mio_tag FOR greatstart IN ['mio_company/29fa8d1eb5c6951812eeac8a8930064a','mio_people/29fa8d1eb5c6951812eeac8a8930064a','mio_tag/29fa8d1eb5c6951812eeac8a8930064a','mio_company/b34b3ea1785359b0c20b1d3614024546','mio_people/b34b3ea1785359b0c20b1d3614024546','mio_tag/b34b3ea1785359b0c20b1d3614024546'] FOR v, e, p IN 1..2 ANY greatstart GRAPH 'mio_graph' FILTER e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_PEOPLE_FROM_COMPANY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_PEOPLE_FAMILY_RELATIONS' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_PEOPLE_BOARD_MEMBER_FROM_COMPANY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_PEOPLE_KEY_EXECUTIVE_FROM_COMPANY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_PARENT_COMPANY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_INVEST_COMPANY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_SUPPLIER_TO_DISTRIBUTOR' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_CREDITOR_TO_BORROWER' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_VENDOR_TO_CLIENT' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_COMPETITOR_COMPANY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_HOLDS_INVESTMENT_ARM' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_MERGED_ENTITY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_IS_KEY_MEMBER_OF_COMPANY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_IS_SUPERVISORY_BOARD_MEMBER_OF_COMPANY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_IS_BRANCH_OF_COMPANY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_IS_SHAREHOLDER_OF_COMPANY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_IS_LEGAL_PERSON_OF_COMPANY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_BELONGS_TO_INDUSTRY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_INDUSTRY_DOWNSTREAM_OF_INDUSTRY' FILTER v._id == 'mio_company/29fa8d1eb5c6951812eeac8a8930064a' OR v._id == 'mio_people/29fa8d1eb5c6951812eeac8a8930064a' OR v._id == 'mio_tag/29fa8d1eb5c6951812eeac8a8930064a' OR v._id == 'mio_company/b34b3ea1785359b0c20b1d3614024546' OR v._id == 'mio_people/b34b3ea1785359b0c20b1d3614024546' OR v._id == 'mio_tag/b34b3ea1785359b0c20b1d3614024546' LIMIT 1000 RETURN p

AQL explain (if applicable): Query String: WITH mio_company, mio_people, mio_tag FOR greatstart IN ['mio_company/29fa8d1eb5c6951812eeac8a8930064a','mio_people/29fa8d1eb5c6951812eeac8a8930064a','mio_tag/29fa8d1eb5c6951812eeac8a8930064a','mio_company/b34b3ea1785359b0c20b1d3614024546','mio_people/b34b3ea1785359b0c20b1d3614024546','mio_tag/b34b3ea1785359b0c20b1d3614024546'] FOR v, e, p IN 1..2 ANY greatstart GRAPH 'mio_graph' FILTER e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_PEOPLE_FROM_COMPANY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_PEOPLE_FAMILY_RELATIONS' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_PEOPLE_BOARD_MEMBER_FROM_COMPANY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_PEOPLE_KEY_EXECUTIVE_FROM_COMPANY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_PARENT_COMPANY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_INVEST_COMPANY' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_SUPPLIER_TO_DISTRIBUTOR' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_CREDITOR_TO_BORROWER' OR e.type == 'KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_VENDOR_TO_CLIENT' OR e.type =...

Execution plan: Id NodeType Site Est. Comment 1 SingletonNode COOR 1 ROOT 2 CalculationNode COOR 1 - LET #4 = [ "mio_company/29fa8d1eb5c6951812eeac8a8930064a", "mio_people/29fa8d1eb5c6951812eeac8a8930064a", "mio_tag/29fa8d1eb5c6951812eeac8a8930064a", "mio_company/b34b3ea1785359b0c20b1d3614024546", "mio_people/b34b3ea1785359b0c20b1d3614024546", "mio_tag/b34b3ea1785359b0c20b1d3614024546" ] / json expression / / const assignment / 3 EnumerateListNode COOR 6 - FOR greatstart IN #4 / list iteration / 4 TraversalNode COOR 36100 - FOR v / vertex /, e / edge /, p / paths / IN 1..2 / min..maxPathDepth / ANY greatstart / startnode / GRAPH 'mio_graph' 12 CalculationNode COOR 36100 - LET #8 = ((e.type in SORTED_UNIQUE([ "KNOWLEDGE_GRAPH_EDGE_TYPE_PEOPLE_FROM_COMPANY", "KNOWLEDGE_GRAPH_EDGE_TYPE_PEOPLE_FAMILY_RELATIONS", "KNOWLEDGE_GRAPH_EDGE_TYPE_PEOPLE_BOARD_MEMBER_FROM_COMPANY", "KNOWLEDGE_GRAPH_EDGE_TYPE_PEOPLE_KEY_EXECUTIVE_FROM_COMPANY", "KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_PARENT_COMPANY", "KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_INVEST_COMPANY", "KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_SUPPLIER_TO_DISTRIBUTOR", "KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_CREDITOR_TO_BORROWER", "KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_VENDOR_TO_CLIENT", "KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_COMPETITOR_COMPANY", "KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_HOLDS_INVESTMENT_ARM", "KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_MERGED_ENTITY", "KNOWLEDGE_GRAPH_EDGE_TYPE_IS_KEY_MEMBER_OF_COMPANY", "KNOWLEDGE_GRAPH_EDGE_TYPE_IS_SUPERVISORY_BOARD_MEMBER_OF_COMPANY", "KNOWLEDGE_GRAPH_EDGE_TYPE_IS_BRANCH_OF_COMPANY", "KNOWLEDGE_GRAPH_EDGE_TYPE_IS_SHAREHOLDER_OF_COMPANY", "KNOWLEDGE_GRAPH_EDGE_TYPE_IS_LEGAL_PERSON_OF_COMPANY", "KNOWLEDGE_GRAPH_EDGE_TYPE_COMPANY_BELONGS_TO_INDUSTRY", "KNOWLEDGE_GRAPH_EDGE_TYPE_INDUSTRY_DOWNSTREAM_OF_INDUSTRY" ])) && (v._id in [ "mio_company/29fa8d1eb5c6951812eeac8a8930064a", "mio_people/29fa8d1eb5c6951812eeac8a8930064a", "mio_tag/29fa8d1eb5c6951812eeac8a8930064a", "mio_company/b34b3ea1785359b0c20b1d3614024546", "mio_people/b34b3ea1785359b0c20b1d3614024546", "mio_tag/b34b3ea1785359b0c20b1d3614024546" ])) / simple expression */ 6 FilterNode COOR 36100 - FILTER #8 9 LimitNode COOR 1000 - LIMIT 0, 1000 10 ReturnNode COOR 1000 - RETURN p

Indexes used: By Type Collection Unique Sparse Selectivity Fields Ranges 4 edge mio_company_belongs_to_industry false false n/a [ _to ] base INBOUND 4 edge mio_company_belongs_to_industry false false n/a [ _from ] base OUTBOUND 4 edge mio_company_competitor_company false false n/a [ _to ] base INBOUND 4 edge mio_company_competitor_company false false n/a [ _from ] base OUTBOUND 4 edge mio_company_creditor_to_borrower false false n/a [ _to ] base INBOUND 4 edge mio_company_creditor_to_borrower false false n/a [ _from ] base OUTBOUND 4 edge mio_company_holds_investment_arm false false n/a [ _to ] base INBOUND 4 edge mio_company_holds_investment_arm false false n/a [ _from ] base OUTBOUND 4 edge mio_company_invest_company false false n/a [ _to ] base INBOUND 4 edge mio_company_invest_company false false n/a [ _from ] base OUTBOUND 4 edge mio_company_merged_entity false false n/a [ _to ] base INBOUND 4 edge mio_company_merged_entity false false n/a [ _from ] base OUTBOUND 4 edge mio_company_parent_company false false n/a [ _to ] base INBOUND 4 edge mio_company_parent_company false false n/a [ _from ] base OUTBOUND 4 edge mio_company_supplier_to_distributor false false n/a [ _to ] base INBOUND 4 edge mio_company_supplier_to_distributor false false n/a [ _from ] base OUTBOUND 4 edge mio_industry_downstream_of_industry false false n/a [ _to ] base INBOUND 4 edge mio_industry_downstream_of_industry false false n/a [ _from ] base OUTBOUND 4 edge mio_is_branch_of_company false false n/a [ _to ] base INBOUND 4 edge mio_is_branch_of_company false false n/a [ _from ] base OUTBOUND 4 edge mio_is_key_member_of_company false false n/a [ _to ] base INBOUND 4 edge mio_is_key_member_of_company false false n/a [ _from ] base OUTBOUND 4 edge mio_is_legal_person_of_company false false n/a [ _to ] base INBOUND 4 edge mio_is_legal_person_of_company false false n/a [ _from ] base OUTBOUND 4 edge mio_is_shareholder_of_company false false n/a [ _to ] base INBOUND 4 edge mio_is_shareholder_of_company false false n/a [ _from ] base OUTBOUND 4 edge mio_is_supervisory_board_member_of_company false false n/a [ _to ] base INBOUND 4 edge mio_is_supervisory_board_member_of_company false false n/a [ _from ] base OUTBOUND 4 edge mio_people_board_member_from_company false false n/a [ _to ] base INBOUND 4 edge mio_people_board_member_from_company false false n/a [ _from ] base OUTBOUND 4 edge mio_people_family_relations false false n/a [ _to ] base INBOUND 4 edge mio_people_family_relations false false n/a [ _from ] base OUTBOUND 4 edge mio_people_key_executive_from_company false false n/a [ _to ] base INBOUND 4 edge mio_people_key_executive_from_company false false n/a [ _from ] base OUTBOUND

Functions used: Name Deterministic Cacheable Uses V8 SORTED_UNIQUE true true false

Traversals on graphs: Id Depth Vertex collections Edge collections Options Filter conditions 4 1..2 mio_company, mio_people, mio_tag mio_company_belongs_to_industry, mio_company_competitor_company, mio_company_creditor_to_borrower, mio_company_holds_investment_arm, mio_company_invest_company, mio_company_merged_entity, mio_company_parent_company, mio_company_supplier_to_distributor, mio_industry_downstream_of_industry, mio_is_branch_of_company, mio_is_key_member_of_company, mio_is_legal_person_of_company, mio_is_shareholder_of_company, mio_is_supervisory_board_member_of_company, mio_people_board_member_from_company, mio_people_family_relations, mio_people_key_executive_from_company uniqueVertices: none, uniqueEdges: path

Optimization rules applied: Id RuleName 1 move-calculations-up 2 move-filters-up 3 move-calculations-up-2 4 move-filters-up-2 5 replace-or-with-in 6 sort-in-values 7 fuse-filters

jsteemann commented 5 years ago

@pocketwalker: There are a few things that come to my mind here:

pocketwalker commented 5 years ago

@jsteemann thanks. I just made some changes on FILTERs and removed all edgeType checks, but it's still too slow. Here's my query, WITH mio_company, mio_people, mio_tag FOR greatstart IN ['mio_company/29fa8d1eb5c6951812eeac8a8930064a', 'mio_company/b34b3ea1785359b0c20b1d3614024546'] FOR v, e, p IN 1..2 ANY greatstart GRAPH 'mio_graph' FILTER p.vertices[0]._id == 'mio_company/29fa8d1eb5c6951812eeac8a8930064a' OR p.vertices[0]._id== 'mio_company/b34b3ea1785359b0c20b1d3614024546' FILTER p.vertices[2]._id == 'mio_company/29fa8d1eb5c6951812eeac8a8930064a' OR p.vertices[2]._id == 'mio_company/b34b3ea1785359b0c20b1d3614024546' LIMIT 1000 RETURN p --Are the filters on the vertices actually necessary given that these ids are also the start vertices' ids? Yes if my understanding is correct. I want to get all paths whose v[0] and v[2] (1..2 relationship) are in the given nodeSet. Any idea?

jsteemann commented 5 years ago

@pocketwalker : none apart from the ones I summarized above.

pocketwalker commented 5 years ago

Thanks for your reply! It can be closed now.

dothebart commented 5 years ago

Closing as solved.