knizhnik / imcs

In-Memory Columnar Store extension for PostgreSQL
Apache License 2.0
203 stars 33 forks source link

table_delete delete more data than should #36

Closed amutu closed 10 years ago

amutu commented 10 years ago

the pg regression sql script is here:https://github.com/amutu/data/blob/master/delete.sql

the full .out file is here:https://github.com/amutu/data/blob/master/delete.out

select count(1) from crashlog where logtime >= '2014-04-16 09:50:06' and logtime <= '2014-04-16 09:50:30';

count

 0

(1 row)

select cs_create('crashlog','logtime','clientversion');

cs_create

(1 row)

select crashlog_load();

crashlog_load

      4000

(1 row)

select x.clientversion a ,cs_count(x.uin) b,cs_count(x.uin) c,cs_count(x.logtime) d, cs_count(x.class1) e, cs_count(x.class2) f, cs_count(x.class3) g ,cs_count(x.revision) h,cs_count(x.phoneid) i from wx_version,crashlog_get(clientversion) as x where x is not null; a | b | c | d | e | f | g | h | i
-----------+-----+-----+-----+-----+-----+-----+-----+----- 604307710 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 352321555 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 620888113 | 626 | 626 | 626 | 626 | 626 | 626 | 626 | 626 604176602 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 352321553 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 620888368 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 620888360 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 620888400 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 604176609 | 14 | 14 | 14 | 14 | 14 | 14 | 14 | 14 604307728 | 22 | 22 | 22 | 22 | 22 | 22 | 22 | 22 604176603 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 604307749 | 98 | 98 | 98 | 98 | 98 | 98 | 98 | 98 352321792 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 604307713 | 78 | 78 | 78 | 78 | 78 | 78 | 78 | 78 620888369 | 34 | 34 | 34 | 34 | 34 | 34 | 34 | 34 620888387 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 352321841 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 620888388 | 378 | 378 | 378 | 378 | 378 | 378 | 378 | 378 620822583 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 604307745 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 352321842 | 14 | 14 | 14 | 14 | 14 | 14 | 14 | 14 604307708 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 620888132 | 240 | 240 | 240 | 240 | 240 | 240 | 240 | 240 604307739 | 854 | 854 | 854 | 854 | 854 | 854 | 854 | 854 352321552 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 620888112 | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16 604176608 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 620888128 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 620822595 | 14 | 14 | 14 | 14 | 14 | 14 | 14 | 14 352452625 | 12 | 12 | 12 | 12 | 12 | 12 | 12 | 12 620757808 | 26 | 26 | 26 | 26 | 26 | 26 | 26 | 26 620757029 | 24 | 24 | 24 | 24 | 24 | 24 | 24 | 24 604111038 | 48 | 48 | 48 | 48 | 48 | 48 | 48 | 48 620757508 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 604307707 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 604111039 | 838 | 838 | 838 | 838 | 838 | 838 | 838 | 838 604307738 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 620888130 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 352322323 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 620757031 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 604307714 | 30 | 30 | 30 | 30 | 30 | 30 | 30 | 30 620888384 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 604307746 | 224 | 224 | 224 | 224 | 224 | 224 | 224 | 224 352452627 | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16 620757040 | 14 | 14 | 14 | 14 | 14 | 14 | 14 | 14 620757816 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 620888145 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 604111040 | 64 | 64 | 64 | 64 | 64 | 64 | 64 | 64 604176596 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 352322321 | 24 | 24 | 24 | 24 | 24 | 24 | 24 | 24 604307715 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 620822576 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 (52 rows)

select crashlog_delete(clientversion,'2014-04-16 09:50:06','2014-04-16 09:50:30') from wx_version;

crashlog_delete

           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1

(52 rows)

select x.clientversion a ,cs_count(x.uin) b,cs_count(x.uin) c,cs_count(x.logtime) d, cs_count(x.class1) e, cs_count(x.class2) f, cs_count(x.class3) g ,cs_count(x.revision) h,cs_count(x.phoneid) i from wx_version,crashlog_get(clientversion) as x where x is not null; a | b | c | d | e | f | g | h | i
-----------+-----+-----+-----+-----+-----+-----+-----+----- 604307710 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 352321555 | 19 | 19 | 19 | 19 | 19 | 19 | 19 | 19 620888113 | 625 | 625 | 625 | 625 | 625 | 625 | 625 | 625 604176602 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 352321553 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 620888368 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 620888360 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 620888400 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 604176609 | 13 | 13 | 13 | 13 | 13 | 13 | 13 | 13 604307728 | 21 | 21 | 21 | 21 | 21 | 21 | 21 | 21 604176603 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 604307749 | 97 | 97 | 97 | 97 | 97 | 97 | 97 | 97 352321792 | 99 | 99 | 99 | 99 | 99 | 99 | 99 | 99 604307713 | 77 | 77 | 77 | 77 | 77 | 77 | 77 | 77 620888369 | 33 | 33 | 33 | 33 | 33 | 33 | 33 | 33 620888387 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 352321841 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 620888388 | 377 | 377 | 377 | 377 | 377 | 377 | 377 | 377 620822583 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 604307745 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 352321842 | 13 | 13 | 13 | 13 | 13 | 13 | 13 | 13 604307708 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 620888132 | 239 | 239 | 239 | 239 | 239 | 239 | 239 | 239 604307739 | 853 | 853 | 853 | 853 | 853 | 853 | 853 | 853 352321552 | 35 | 35 | 35 | 35 | 35 | 35 | 35 | 35 620888112 | 15 | 15 | 15 | 15 | 15 | 15 | 15 | 15 604176608 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 620888128 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 620822595 | 13 | 13 | 13 | 13 | 13 | 13 | 13 | 13 352452625 | 11 | 11 | 11 | 11 | 11 | 11 | 11 | 11 620757808 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 620757029 | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 604111038 | 47 | 47 | 47 | 47 | 47 | 47 | 47 | 47 620757508 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 604307707 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 604111039 | 837 | 837 | 837 | 837 | 837 | 837 | 837 | 837 604307738 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 620888130 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 352322323 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 620757031 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 604307714 | 29 | 29 | 29 | 29 | 29 | 29 | 29 | 29 620888384 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 604307746 | 223 | 223 | 223 | 223 | 223 | 223 | 223 | 223 352452627 | 15 | 15 | 15 | 15 | 15 | 15 | 15 | 15 620757040 | 13 | 13 | 13 | 13 | 13 | 13 | 13 | 13 620757816 | 19 | 19 | 19 | 19 | 19 | 19 | 19 | 19 620888145 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 604111040 | 63 | 63 | 63 | 63 | 63 | 63 | 63 | 63 604176596 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 352322321 | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 604307715 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 620822576 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 (52 rows)

knizhnik commented 10 years ago

Thank you. Actually I already reproduced the problem and investigating it. Look like there is really some problem with delete.

But I do not understand what's wring with output of this regression test? Counters for all columns are the same before and after delete.

amutu commented 10 years ago

Counters for all columns are not always the same after delete or load.as #35 and #34 described. I try to make some script which can reproduce the bug.

knizhnik commented 10 years ago

There are actually two bugs in delete! First is that it doesn't check that from <= till. And second bug in RLE version of delete. Please try revision 64

amutu commented 10 years ago

the bug still exists,there is no change,except the cs_used_memory result changed for you change the default RLE setting.

knizhnik commented 10 years ago

Sorry, I do not understand your last statement. What is the relation between cs_used_memory and RLE setting? It is very very strange that you still reproduce the same problem. Can you sheck that you have really updated the library? Occasionally I have committed debug dump to file (delete.log). It should be in PostgreSQL home directory. Can you send it to me?

amutu commented 10 years ago

this is the delete.log: https://github.com/amutu/data/blob/master/delete.log

amutu commented 10 years ago

Sorry, I do not understand your last statement. What is the relation between cs_used_memory and RLE setting? ----------------- I see this change: -bool imcs_use_rle = true; +bool imcs_use_rle = false;

but in fact,the cs_use_memory result is changed by the translate(later named str2code/code2string) commit,sorry for my wrong assumption.

knizhnik commented 10 years ago

So, I do not see any inconsistencies in this log: number of records is the same for all attributes. Why do you think that the problem is still reproduced?

amutu commented 10 years ago

because when run this test file: https://github.com/amutu/data/blob/master/delete.sql

result is still this: https://github.com/amutu/data/blob/master/delete.out

amutu commented 10 years ago

more specific:

the pg table [ '2014-04-16 09:50:06' and logtime <= '2014-04-16 09:50:30';] has no data: select count(1) from crashlog where logtime >= '2014-04-16 09:50:06' and logtime <= '2014-04-16 09:50:30';

count

 0

but delete from cs, it delete 1 row for each clientversion: select crashlog_delete(clientversion,'2014-04-16 09:50:06','2014-04-16 09:50:30') from wx_version;

crashlog_delete

           1
           1
           1
           1
           1
           1

...

knizhnik commented 10 years ago

Sorry, which result? I do not see anything criminal in this output. Can me exactly point out the results which are not correct?

You have wrote the following query:

select count(1) from (select x.clientversion a ,cs_count(x.uin) b,cs_count(x.revision) c,cs_count(x.logtime) d, cs_count(x.class1) e, cs_count(x.class2) f, cs_count(x.class3) g ,cs_count(x.revision) h,cs_count(x.phoneid) i from wx_version,crash_log_get(clientversion) as x where x is not null) t where b <> c or b<> d or b<> e or b<> f or b <> g or b<> h or b<> i;

Try to run it and check that there are no inconsistencies.

amutu commented 10 years ago

select crashlog_delete(clientversion,'2014-04-16 09:50:06','2014-04-16 09:50:30') from wx_version;

crashlog_delete

           1
           1
           1
           1
           1
           1
           1

should be

select crashlog_delete(clientversion,'2014-04-16 09:50:06','2014-04-16 09:50:30') from wx_version;

crashlog_delete

           0
           0
           0
           0
           0
           0
           0
knizhnik commented 10 years ago

I was bug in imcs_search, fixed. Thank you for your help

amutu commented 10 years ago

can't reproduce ,thanks for you work!