taosdata / TDengine

High-performance, scalable time-series database designed for Industrial IoT (IIoT) scenarios
https://tdengine.com
GNU Affero General Public License v3.0
23.35k stars 4.85k forks source link

some tag missed in sum query #21441

Closed A7610605 closed 1 year ago

A7610605 commented 1 year ago

Bug Description 数据是由taosBenchmark 生成100万张子表,每个子表100行数据 以tag 中的cc来统计 ,下面的查询语句生成的统计数据中有一行无tag的数据。 select cc,sum(val) from (select cc,last(d_count) as val from meters group by tbname) group by cc ORDER BY cc ASC;

 cc  |       sum(val)        |
==============================
     |                187264 |
 AD  |                498688 |
 AE  |                502144 |
 AF  |                519040 |
 AG  |                514560 |
 AI  |                512128 |
 AL  |                501120 |
 AM  |                499072 |
 AO  |                500736 |

同时查询单独的tag的数据得到的结果不同 select cc,sum(val) from (select cc,last(d_count) as val from meters where cc='AD' group by tbname) group by cc ORDER BY cc ASC;

 cc  |       sum(val)        |
==============================
 AD  |                499072 |

每个子表的last(d_count)的数据都是相同的128,如果统计子表数量3899×128=499072则第二个查询的数据是正确的。同时第一个查询中的所有数据加总为128000000,也跟128100万相符合,所以我认为是第一个查询中部分数据的tag莫名其妙丢了。 `select cc,count() from (SELECT DISTINCT TBNAME,cc FROM meters) group by cc order by cc ASC;`

 cc  |       count(*)        |
==============================
 AD  |                  3899 |
 AE  |                  3929 |
 AF  |                  4062 |
 AG  |                  4030 |
 AI  |                  4009 |
 AL  |                  3922 |
 AM  |                  3906 |
 AO  |                  3915 |

To Reproduce Steps to reproduce the behavior: 先建立数据库 CREATE DATABASE `test2` BUFFER 96 CACHESIZE 256 CACHEMODEL 'both' COMP 2 DURATION 30D WAL_FSYNC_PERIOD 3000 MAXROWS 4096 MINROWS 100 STT_TRIGGER 8 KEEP 3650d,3650d,3650d PAGES 256 PAGESIZE 4 PRECISION 'ms' REPLICA 1 WAL_LEVEL 1 VGROUPS 8 SINGLE_STABLE 0 TABLE_PREFIX 0 TABLE_SUFFIX 0 TSDB_PAGESIZE 4 WAL_RETENTION_PERIOD 0 WAL_RETENTION_SIZE 0 WAL_ROLL_PERIOD 0 WAL_SEGMENT_SIZE 0; 然后用taosbenchmark生成数据,json文件如下所示

{
    "filetype": "insert",
    "cfgdir": "/etc/taos",
    "host": "127.0.0.1",
    "port": 6030,
    "user": "root",
    "password": "taosdata",
    "connection_pool_size": 8,
    "thread_count": 8,
    "create_table_thread_count": 8,
    "result_file": "./insert_res.txt",
    "confirm_parameter_prompt": "no",
    "insert_interval": 0,
    "interlace_rows": 100,
    "num_of_records_per_req": 100,
    "prepared_rand": 10000,
    "chinese": "no",
    "escape_character": "yes",
    "databases": [
        {
            "dbinfo": {
                "name": "test2",
                "drop": "no",
                "precision": "ms"
            },
            "super_tables": [
                {
                    "name": "meters",
                    "child_table_exists": "no",
                    "childtable_count": 1000000,
                    "childtable_prefix": "d",
                    "auto_create_table": "no",
                    "batch_create_tbl_num": 10,
                    "data_source": "rand",
                    "insert_mode": "taosc",
                    "non_stop_mode": "no",
                    "line_protocol": "line",
                    "insert_rows": 100,
                    "childtable_limit": 1000000,
                    "childtable_offset": 10,
                    "interlace_rows": 0,
                    "insert_interval": 0,
                    "partial_col_num": 0,
                    "timestamp_step": 86400000,
                    "start_timestamp": "2022-5-15 00:01:08.000",
                    "sample_format": "csv",
                    "sample_file": "./sample.csv",
                    "use_sample_ts": "no",
                    "tags_file": "",
                    "columns": [
                        { "type": "INT","name": "d_count","count": 1,"max": 256,"min": 1}
                    ],
                    "tags": [
                        {
                            "name": "cc",
                            "type": "BINARY",
                            "len": 3,
                            "values": ["AD","AE","AF","AG","AI","AL","AM","AO","AQ","AR","AS","AT","AU","AW","AX","AZ","BA","BB","BD","BE","BF","BG","BH","BI","BJ","BL","BM","BN","BO","BQ","BR","BS","BT","BV","BW","BY","BZ","CA","CC","CD","CF","CG","CH","CI","CK","CL","CM","CN","CO","CR","CU","CV","CW","CX","CY","CZ","DE","DJ","DK","DM","DO","DZ","EC","EE","EG","EH","ER","ES","ET","FI","FJ","FK","FM","FO","FR","GA","GB","GD","GE","GF","GG","GH","GI","GL","GM","GN","GP","GQ","GR","GS","GT","GU","GW","GY","HK","HM","HN","HR","HT","HU","ID","IE","IL","IM","IN","IO","IQ","IR","IS","IT","JE","JM","JO","JP","KE","KG","KH","KI","KM","KN","KP","KR","KW","KY","KZ","LA","LB","LC","LI","LK","LR","LS","LT","LU","LV","LY","MA","MC","MD","ME","MF","MG","MH","MK","ML","MM","MN","MO","MP","MQ","MR","MS","MT","MU","MV","MW","MX","MY","MZ","NA","NC","NE","NF","NG","NI","NL","NO","NP","NR","NU","NZ","OM","PA","PE","PF","PG","PH","PK","PL","PM","PN","PR","PS","PT","PW","PY","QA","RE","RO","RS","RU","RW","SA","SB","SC","SD","SE","SG","SH","SI","SJ","SK","SL","SM","SN","SO","SR","SS","ST","SV","SX","SY","SZ","TC","TD","TF","TG","TH","TJ","TK","TL","TM","TN","TO","TR","TT","TV","TW","TZ","UA","UG","UM","US","UY","UZ","VA","VC","VE","VG","VI","VN","VU","WF","WS","XK","YE","YT","ZA","ZM","ZW"]
                        }
                    ]
                }
            ]
        }
    ]
}

Expected Behavior 第一个查询语句统计出正确的数字

Screenshots If applicable, add screenshots to help explain your problem.

Environment (please complete the following information):

Additional Context 非常离奇了,设置VGROUPS为8的时候会出现,如果改成16就是正常的

A7610605 commented 1 year ago

多次验证,VGROUPS设置为 12或者16的时候就正常。而设置成8一定会出现此问题

A7610605 commented 1 year ago

补充,可能跟子表数量有关,设置VGROUPS为16时,增加到200万子表也会有同样的问题。设置VGROUPS为8时,84万子表正常,88万子表则会出现此问题。

yu285 commented 1 year ago

我们看下

yu285 commented 1 year ago

这个问题我们已经修复,在测试中。发布后会通知咱们。

yu285 commented 1 year ago

问题已经修复,可以在 3.0.5.0 验证这个问题了

yu285 commented 1 year ago

您好请问有做验证吗?

gccgdb1234 commented 1 year ago

超过三周不回复,问题关闭