cube-js / cube

📊 Cube — The Semantic Layer for Building Data Applications
https://cube.dev
Other
17.96k stars 1.78k forks source link

Join the same cube twice in one view #8896

Open oxoxlol opened 2 weeks ago

oxoxlol commented 2 weeks ago

Problem

I've asked this question on stackoverflow (https://stackoverflow.com/questions/79044418/transitive-join-to-an-entity-referenced-by-several-other-entities) before, but didn't get an answer. I decided to duplicate the question here.

I have this schema E-R-Schema-of-TPC-H-Benchmark-2897370714

I build a view based on the model described above:

views:
   - name: view_tpch
     public: true

     cubes:
         - join_path: tpch_lineitem
           includes:
              - L_SUPPKEY

         - join_path: tpch_orders
           includes:
               - O_CUSTKEY

         - join_path: tpch_supplier
           includes: 
               - S_SUPPKEY
               - S_NAME
               - S_NATIONKEY

         - join_path: tpch_customer
           includes: 
               - C_CUSTKEY
               - C_NAME
               - C_NATIONKEY

         - join_path: tpch_lineitem.tpch_partsupp.tpch_supplier.tpch_nation
           alias: supp_nation
           prefix: true
           includes:
              - name: N_NATIONKEY
                alias: supp_N_NATIONKEY
              - name: N_NAME
                alias: supp_N_NAME

         - join_path: tpch_lineitem.tpch_orders.tpch_customer.tpch_nation
           alias: cust_nation
           prefix: true
           includes:
              - name: N_NATIONKEY
                alias: cust_N_NATIONKEY
              - name: N_NAME
                alias: cust_N_NAME

The problem is that when cube nation references 2 cubes: 1) customer 2) supplier

In the view I want to display the supplier nation and customer nation at the same time.

However, for some reason, the nation cube is always jojointed only once. And always only for the first join_path. The direction of the joins is done using “dot notation”, but this only helps to select the correct path for the first join. The second one is ignored.

I thought that these issues would help me somehow in my question, but they did not: 1) https://github.com/cube-js/cube/issues/438 2) https://github.com/cube-js/cube/issues/8499

Please suggest how I can solve my problem.

Related Cube.js generated SQL

SELECT
"tpch_lineitem".L_SUPPKEY "view_tpch___l__s_u_p_p_k_e_y", 
"tpch_orders".O_CUSTKEY "view_tpch___o__c_u_s_t_k_e_y", 
"tpch_supplier".S_SUPPKEY "view_tpch___s__s_u_p_p_k_e_y", 
"tpch_supplier".S_NAME "view_tpch___s__n_a_m_e", 
"tpch_supplier".S_NATIONKEY "view_tpch___s__n_a_t_i_o_n_k_e_y", 
"tpch_customer".C_CUSTKEY "view_tpch___c__c_u_s_t_k_e_y", 
"tpch_customer".C_NAME "view_tpch___c__n_a_m_e", 
"tpch_customer".C_NATIONKEY "view_tpch___c__n_a_t_i_o_n_k_e_y", 
"tpch_nation".N_NATIONKEY "view_tpch__supp_nation_supp__n__n_a_t_i_o_n_k_e_y", 
"tpch_nation".N_NAME "view_tpch__supp_nation_supp__n__n_a_m_e", 
"tpch_nation".N_NATIONKEY "view_tpch__cust_nation_cust__n__n_a_t_i_o_n_k_e_y", 
"tpch_nation".N_NAME "view_tpch__cust_nation_cust__n__n_a_m_e"
    FROM
      (select 
     L_ORDERKEY
   , L_PARTKEY
   , L_SUPPKEY
   , L_LINENUMBER
   , L_QUANTITY
   , L_EXTENDEDPRICE
   , L_DISCOUNT
   , L_TAX
   , L_RETURNFLAG
   , L_LINESTATUS
   , L_SHIPDATE
   , L_COMMITDATE
   , L_RECEIPTDATE
   , L_SHIPINSTRUCT
   , L_SHIPMODE
   , L_COMMENT
from sb_ylw_ukd.tpch_lineitem limit 100000
) AS "tpch_lineitem"
LEFT JOIN (select 
     O_ORDERKEY
   , O_CUSTKEY
   , O_ORDERSTATUS
   , O_TOTALPRICE
   , O_ORDERDATE
   , O_ORDERPRIORITY
   , O_CLERK
   , O_SHIPPRIORITY
   , O_COMMENT
   from sb_ylw_ukd.tpch_orders
   limit 10000
) AS "tpch_orders" ON "tpch_lineitem".L_ORDERKEY = "tpch_orders".O_ORDERKEY
LEFT JOIN (select 
     PS_PARTKEY
   , PS_SUPPKEY
   , PS_AVAILQTY
   , PS_SUPPLYCOST
   , PS_COMMENT
   from sb_ylw_ukd.tpch_partsupp 
   limit 100000

) AS "tpch_partsupp" ON "tpch_lineitem".L_PARTKEY = "tpch_partsupp".PS_PARTKEY and "tpch_lineitem".L_SUPPKEY = "tpch_partsupp".PS_SUPPKEY

LEFT JOIN (select 
   S_SUPPKEY
 , S_NAME
 , S_ADDRESS
 , S_NATIONKEY
 , S_PHONE
 , S_ACCTBAL
 , S_COMMENT
from sb_ylw_ukd.tpch_supplier 
) AS "tpch_supplier" ON "tpch_partsupp".PS_SUPPKEY = "tpch_supplier".S_SUPPKEY
LEFT JOIN (select 
  C_CUSTKEY
, C_NAME , C_ADDRESS , C_NATIONKEY , C_PHONE , C_ACCTBAL , C_MKTSEGMENT , C_COMMENT  from sb_ylw_ukd.tpch_customer 
) AS "tpch_customer" ON "tpch_orders".O_CUSTKEY = "tpch_customer".C_CUSTKEY
LEFT JOIN (select 
   N_NATIONKEY
 , N_NAME
 , N_REGIONKEY
 , N_COMMENT
from sb_ylw_ukd.tpch_nation  order by N_NATIONKEY
) AS "tpch_nation" ON "tpch_supplier".S_NATIONKEY = "tpch_nation".N_NATIONKEY  WHERE ("tpch_supplier".S_SUPPKEY IS NOT NULL) AND ("tpch_customer".C_CUSTKEY IS NOT NULL) GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 LIMIT 1000000

My SQL query in Dbeaver

select * from view_tpch
where S_SUPPKEY is not null and C_CUSTKEY is not null

P.S. I would also like to take this opportunity to note that I would also be happy to “vote” in favor of extending the naming to other languages .Referring to the issues 1) https://github.com/cube-js/cube/issues/3238 2) https://github.com/cube-js/cube/issues/7665 Although I understand that Localization is currently outside of Cube Core's scope, but still I would like to mention that as a member of your community I am very interested in this feature.

oxoxlol commented 1 day ago

I' sorry for being intrusive.

But maybe we should add another tag to this issue?

It's both “enhancement” and “help wanted” and maybe “bug”, I'm not sure.

But I really wanted to get feedback so I know that the request won't go to the bottom without a response.