heavyai / heavydb

HeavyDB (formerly OmniSciDB)
https://heavy.ai
Apache License 2.0

"Hash tables with more than 2B entries not supported yet" issue #382

Open titawork opened 4 years ago

titawork commented 4 years ago

Table C: 40,000 rows
Table D: 71,969 rows
Table E: 200 rows
Table F: 757 rows

  1. This SQL works well when A has 10,000,000 rows and B has 100,000,000 rows.

  2. Increase A to 100,000,000 rows: the "Hash tables with more than 2B entries not supported yet" error appears.

  3. Use delete from A where ... to shrink A back to 10,000,000 rows: the same error still appears, even after restarting mapd.

  4. Truncate table A and then run insert into A select * from ACopy to load 10,000,000 rows back into A: this works well. (A SQL sketch of steps 3 and 4 follows this list.)
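For readability, here is a minimal sketch of steps 3 and 4. The delete predicate is elided in the original report, so the one below is purely hypothetical:

```sql
-- Step 3: shrinking A with DELETE does not clear the error
DELETE FROM A WHERE s_no > 10000000;  -- hypothetical predicate; the report elides it

-- Step 4: rebuilding A from a copy does clear it
TRUNCATE TABLE A;
INSERT INTO A SELECT * FROM ACopy;
```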

So why does this error happen? It seems I don't have 2B entries in any table. Could you kindly explain what "2B entries" means, and what I can do to avoid this issue? Thanks!

OmniSci Server Version: 4.7.1-20190628-5bae2da063

SQL:

CREATE TABLE business_m AS
SELECT A.e_type, A.m_type, A.r_account, A.scode, A.c_s_type, A.e_kind, A.s_no,
       A.t_id, A.b_id, A.b_date, A.b_time, A.b_price, A.s_price, A.b_amount,
       A.s_amount, A.b_balance, A.s_balance, A.c_balance, A.s_date, A.r_code,
       A.b_remark, A.f_fare1, A.f_fare2, A.f_fare3, A.f_fare4, A.f_fare5, A.f_farex,
       B.b_no, B.op_B_way AS B_way, B.r_time, B.f_kind, B.B_type, B.B_no,
       B.room_code, B.f_account, B.s_account, B.B_prop, B.order_id, B.B_bs,
       C.f_k_str, C.d_model, D.s_type, D.s_unit
FROM A, B, C, D, E, F
WHERE A.seat_no = F.seat_no
  AND A.B_bs = B.B_bs
  AND A.r_account = B.r_account
  AND A.s_code = B.s_code
  AND A.e_type = B.e_type
  AND B.f_account = C.f_account
  AND A.s_code = D.s_code
  AND E.b_no = C.b_no
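Since the error message names a hash table rather than any one base table, it may help to isolate which join builds the oversized table. A diagnostic sketch, assuming the same schemas, that runs only the A-to-B join (the one whose inputs grow):

```sql
-- Run the A <-> B join alone to see whether it by itself triggers the error
SELECT COUNT(*)
FROM A, B
WHERE A.B_bs = B.B_bs
  AND A.r_account = B.r_account
  AND A.s_code = B.s_code
  AND A.e_type = B.e_type;
```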

cdessanti commented 4 years ago

This old post could help:

https://github.com/omnisci/omniscidb/issues/197

I think the problem is the increased cardinality between two of the tables you are trying to join, which leads to the error you are getting.
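To put a number on that, a hedged sketch for measuring the cardinality of the composite A-B join key. The column names come from the query above; that distinct key count is what drives hash-table size is an assumption here, not something the error message states:

```sql
-- Distinct composite join keys contributed by each side of the A <-> B join
SELECT COUNT(*) FROM (SELECT DISTINCT B_bs, r_account, s_code, e_type FROM A) ta;
SELECT COUNT(*) FROM (SELECT DISTINCT B_bs, r_account, s_code, e_type FROM B) tb;
```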

titawork commented 4 years ago

In my new testing:

select count(*)
from A, B, C, D, E, F
where A.seat_no = F.seat_no
  and A.B_bs = B.B_bs
  and A.r_account = B.r_account
  and A.s_code = B.s_code
  and A.e_type = B.e_type
  and B.f_account = C.f_account
  and A.s_code = D.s_code
  and E.b_no = C.b_no

I keep table B at 0.1 billion rows and add rows into A in steps of 10,000,000:

when table A has 10,000,000 rows, the count is 10,000,000
when table A has 20,000,000 rows, the count is 20,000,000
when table A has 30,000,000 rows, the count is 30,000,000
when table A has 40,000,000 rows, the count is 40,000,000
when table A has 50,000,000 rows, it should be 50,000,000, but I get nothing: the "Hash tables with more than 2B entries not supported yet" error appears.

Could you tell me what the "2B entries" are? Thanks!

cdessanti commented 4 years ago

I guess it is the hash entries needed to join table A and table B. Could you set the parameter verbose to true and post the log from a run with 40M rows in table A? The log should show the number of entries and the memory needed by the hash table.
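A minimal sketch of enabling that; the config file name and location are assumptions for a default install of that era, so adjust them to your setup and restart the server afterwards:

```
# omnisci.conf (mapd.conf on older installs) -- assumed default config file
verbose = true
```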

titawork commented 4 years ago

Thank you! I guess there must be some bug here, because with all the same tables except A, sometimes table A with 40,000,000 rows works well and sometimes it does not; sometimes even 10 rows in table A hit the "Hash tables with more than 2B entries not supported yet" error. I haven't found a pattern to follow yet, but it can be reproduced on my machines.