apache / iotdb

Apache IoTDB
https://iotdb.apache.org/
Apache License 2.0

An easy method to count the total number of time-series data #6762

Open reinal001 opened 2 years ago

reinal001 commented 2 years ago

I used workbench to write a large amount of data into IoTDB, e.g. 50000 devices with 100 sensors each. Then I wanted to verify that the number of data points actually written into IoTDB equals the expected number.

I used SQL statements like "select count(*) from root.test.* group by level =3", but couldn't get the correct answer, which should be 50000*100. It seems that result columns beyond the first 1000 are truncated.

What I expect is that a single SQL statement returns the correct total count of all data in the database. Large data volumes are ubiquitous in industry, so good support for them would cover a common scenario.

For now, I use a combined SQL and shell approach to verify the number, with statements such as "select count(*) from root.test.**.d_${i} group by level =3", where i ranges from 1 to 50000. Finally, I add the 50000 results together to get the correct number.
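
A minimal sketch of that per-device workaround, written in Python rather than shell. The helper names `build_queries` and `total_points` are hypothetical, and `run_query` is a placeholder for whatever client actually executes the statement and returns one count; no real IoTDB client API is assumed here.

```python
from typing import Callable, Iterable, List


def build_queries(num_devices: int) -> List[str]:
    """Build one count query per device, following the pattern quoted above."""
    return [
        f"select count(*) from root.test.**.d_{i} group by level = 3"
        for i in range(1, num_devices + 1)
    ]


def total_points(queries: Iterable[str], run_query: Callable[[str], int]) -> int:
    """Run every query through the caller-supplied executor and sum the counts."""
    return sum(run_query(q) for q in queries)


if __name__ == "__main__":
    # Stand-in executor (assumption): pretends every device holds 100 points.
    def fake_executor(query: str) -> int:
        return 100

    print(total_points(build_queries(3), fake_executor))
```

Keeping the executor as a parameter means the summing logic can be checked without a running server; in practice `run_query` would wrap the CLI or a session call.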

github-actions[bot] commented 2 years ago

Hi, this is your first issue in IoTDB project. Thanks for your report. Welcome to join the community!

qiaojialin commented 2 years ago

You could increase max_deduplicated_path_num (currently 1000) so that "select count(*) from root.test.** group by level =3" can be executed.

A precise total point count for a database is hard to maintain because of duplicated insertions and deletions; we could only maintain a rough number.
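
To illustrate why a running counter drifts, here is a toy model in Python. It is not IoTDB code: `ToySeries` is a hypothetical class, and it assumes that a write to an existing timestamp overwrites the old value and that deletions do not adjust the counter.

```python
class ToySeries:
    """Toy time series: a write counter next to the points actually stored."""

    def __init__(self) -> None:
        self.points = {}        # timestamp -> value; same timestamp overwrites
        self.write_counter = 0  # incremented on every insert, never decremented

    def insert(self, timestamp: int, value: float) -> None:
        self.points[timestamp] = value
        self.write_counter += 1

    def delete(self, timestamp: int) -> None:
        # Removes the point but leaves the counter untouched.
        self.points.pop(timestamp, None)


if __name__ == "__main__":
    s = ToySeries()
    s.insert(1, 10.0)
    s.insert(1, 11.0)  # duplicated insertion at the same timestamp
    s.insert(2, 20.0)
    s.delete(2)
    print(s.write_counter, len(s.points))
```

In this model the counter reports three writes while only one point remains, so an exact figure would need a scan such as count(*) rather than a maintained counter.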