taosdata / TDengine

High-performance, scalable time-series database designed for Industrial IoT (IIoT) scenarios
https://tdengine.com
GNU Affero General Public License v3.0

taosd crashes with Segmentation fault when trying to query table with PRIMARY KEY column defined #26090

Closed mev closed 3 months ago

mev commented 3 months ago

**Bug Description**
If you create a table that defines a PRIMARY KEY column in addition to the TIMESTAMP column, table creation succeeds and the INSERT statements appear to succeed (no visible error), but any subsequent SELECT query crashes the taosd process with a segmentation fault.

**To Reproduce**
Steps to reproduce the behavior: I started taosd and the taos console. Working from the taos console, I tried to create a table with a second column declared as PRIMARY KEY, to check whether the same timestamp can be used for different values of that column. I start by creating the table:

  1. table creation works:
    
    taos> CREATE TABLE testTable (ts TIMESTAMP, obj BINARY(128) PRIMARY KEY, val INT);
    Create OK, 0 row(s) affected (0.001202s)

    taos> describe testTable;
     field | type      | length | note        | encode   | compress | level  |
     ts    | TIMESTAMP |      8 |             | delta-i  | lz4      | medium |
     obj   | VARCHAR   |    128 | PRIMARY KEY | disabled | lz4      | medium |
     val   | INT       |      4 |             | simple8b | lz4      | medium |
    Query OK, 3 row(s) in set (0.001622s)


2. data insertion seems to work (I see no errors either in the taos console or in the taosd log):

    taos> INSERT INTO testTable VALUES ('2023-10-10 23:12:34.123', 'test 1', 1);
    Insert OK, 1 row(s) affected (0.001031s)

    taos> INSERT INTO testTable VALUES ('2023-10-10 23:12:34.123', 'test 2', 2);
    Insert OK, 1 row(s) affected (0.001031s)

I can stop after a single INSERT or do several; the result is the same. At the next step, even after a single INSERT, taosd crashes with a segmentation fault.

3. run any query:

    taos> SELECT * FROM testTable;

    DB error: Unable to establish connection (11.625067s)

and taosd crashes with the following output:

    06/07 16:27:30.647765 00702816 C UTL FATAL crash signal is 11
    06/07 16:27:30.647818 00702816 C UTL FATAL sender PID:0 cmdline:
    06/07 16:27:30.653185 00702816 C UTL FATAL obtained 21 stack frames
    06/07 16:27:30.653190 00702816 C UTL FATAL frame:0, /lib/x86_64-linux-gnu/libc.so.6(+0x1a67fc) [0x7f799bba67fc]
    06/07 16:27:30.653196 00702816 C UTL FATAL frame:1, taosd(+0x47f054) [0x559d85b17054]
    06/07 16:27:30.653199 00702816 C UTL FATAL frame:2, taosd(+0x491183) [0x559d85b29183]
    06/07 16:27:30.653201 00702816 C UTL FATAL frame:3, taosd(+0x4855a9) [0x559d85b1d5a9]
    06/07 16:27:30.653206 00702816 C UTL FATAL frame:4, taosd(+0x48c6d1) [0x559d85b246d1]
    06/07 16:27:30.653208 00702816 C UTL FATAL frame:5, taosd(+0x492f0e) [0x559d85b2af0e]
    06/07 16:27:30.653211 00702816 C UTL FATAL frame:6, taosd(+0x492f95) [0x559d85b2af95]
    06/07 16:27:30.653215 00702816 C UTL FATAL frame:7, taosd(tsdbNextDataBlock2+0x278) [0x559d85b2b23a]
    06/07 16:27:30.653218 00702816 C UTL FATAL frame:8, taosd(+0xbae8af) [0x559d862468af]
    06/07 16:27:30.653221 00702816 C UTL FATAL frame:9, taosd(+0xbaec20) [0x559d86246c20]
    06/07 16:27:30.653224 00702816 C UTL FATAL frame:10, taosd(+0xbaf7eb) [0x559d862477eb]
    06/07 16:27:30.653231 00702816 C UTL FATAL frame:11, taosd(+0xbafbde) [0x559d86247bde]
    06/07 16:27:30.653235 00702816 C UTL FATAL frame:12, taosd(qExecTaskOpt+0x423) [0x559d86230034]
    06/07 16:27:30.653239 00702816 C UTL FATAL frame:13, taosd(qwExecTask+0x1b6) [0x559d86215b51]
    06/07 16:27:30.653241 00702816 C UTL FATAL frame:14, taosd(qwProcessQuery+0x40a) [0x559d86219d33]
    06/07 16:27:30.653245 00702816 C UTL FATAL frame:15, taosd(qWorkerProcessQueryMsg+0x445) [0x559d8620c1f1]
    06/07 16:27:30.653250 00702816 C UTL FATAL frame:16, taosd(vnodeProcessQueryMsg+0x205) [0x559d85a58105]
    06/07 16:27:30.653253 00702816 C UTL FATAL frame:17, taosd(+0x3966e8) [0x559d85a2e6e8]
    06/07 16:27:30.653255 00702816 C UTL FATAL frame:18, taosd(+0xfedd0b) [0x559d86685d0b]
    06/07 16:27:30.653257 00702816 C UTL FATAL frame:19, /lib/x86_64-linux-gnu/libc.so.6(+0x94ac3) [0x7f799ba94ac3]
    06/07 16:27:30.653262 00702816 C UTL FATAL frame:20, /lib/x86_64-linux-gnu/libc.so.6(+0x126850) [0x7f799bb26850]
    Segmentation fault (core dumped)
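
For anyone triaging a similar crash, here is a minimal Python sketch that pulls the frames carrying a function name out of a taosd crash log like the one above. It is not part of taosd or its tooling: the regular expression is inferred only from the log format shown in this report, and the names `parse_frames`, `named_frames` and `SAMPLE` are made up for illustration.

```python
import re

# Matches entries such as "frame:7, taosd(tsdbNextDataBlock2+0x278) [0x559d85b2b23a]".
# Assumption: the layout is "frame:N, module(symbol+0xOFFSET)", as seen in this report.
FRAME_RE = re.compile(
    r"frame:(?P<index>\d+),\s+(?P<module>\S+?)"
    r"\((?P<symbol>[^()+]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
)


def parse_frames(log_text):
    """Return (index, module, symbol_or_None, offset) for every frame found."""
    frames = []
    for m in FRAME_RE.finditer(log_text):
        symbol = m.group("symbol") or None  # empty when no symbol name was logged
        frames.append(
            (int(m.group("index")), m.group("module"), symbol, int(m.group("offset"), 16))
        )
    return frames


def named_frames(log_text):
    """Keep only the frames that carry a function name."""
    return [(i, sym) for i, _module, sym, _offset in parse_frames(log_text) if sym]


# A few entries copied from the log in this report.
SAMPLE = (
    "FATAL frame:0, /lib/x86_64-linux-gnu/libc.so.6(+0x1a67fc) [0x7f799bba67fc] "
    "FATAL frame:1, taosd(+0x47f054) [0x559d85b17054] "
    "FATAL frame:7, taosd(tsdbNextDataBlock2+0x278) [0x559d85b2b23a] "
    "FATAL frame:12, taosd(qExecTaskOpt+0x423) [0x559d86230034]"
)

if __name__ == "__main__":
    for index, symbol in named_frames(SAMPLE):
        print(index, symbol)
```

On the full log, this leaves `tsdbNextDataBlock2` as the innermost named frame, reached through `qExecTaskOpt` and `vnodeProcessQueryMsg`. That suggests the crash happens while the query reads data blocks, though the unnamed frames above it would need debug symbols to confirm.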


**Expected Behavior**
I was expecting to be able to run the following two INSERTS:

    taos> INSERT INTO testTable VALUES ('2023-10-10 23:12:34.123', 'test 1', 1);
    Insert OK, 1 row(s) affected (0.001031s)

    taos> INSERT INTO testTable VALUES ('2023-10-10 23:12:34.123', 'test 2', 2);
    Insert OK, 1 row(s) affected (0.001031s)

and then, when doing the query, I was expecting to get:

    taos> SELECT * FROM testTable;
     ts                         | obj    | val |
     2023-10-10 23:12:34.123000 | test 1 |   1 |
     2023-10-10 23:12:34.123000 | test 2 |   2 |
    Query OK, 2 row(s) in set (0.001541s)
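
To make the expectation concrete, here is a toy Python model of the semantics described above: rows are identified by the pair (ts, obj), so two rows with the same timestamp coexist as long as `obj` differs. This is only a sketch of the expected behavior, not of how TDengine stores data. The helpers `upsert` and `select_all` are hypothetical, and the choice that a repeated (ts, obj) pair overwrites the earlier row is an assumption of the sketch.

```python
def upsert(table, ts, obj, val):
    """Store a row under the composite key (ts, obj).

    Assumption of this sketch: a repeated (ts, obj) pair overwrites the old row.
    """
    table[(ts, obj)] = val


def select_all(table):
    """Return all rows ordered by timestamp, then by the key column."""
    return [(ts, obj, val) for (ts, obj), val in sorted(table.items())]


table = {}
upsert(table, "2023-10-10 23:12:34.123", "test 1", 1)
upsert(table, "2023-10-10 23:12:34.123", "test 2", 2)
rows = select_all(table)

if __name__ == "__main__":
    for row in rows:
        print(row)
```

Both rows survive because their composite keys differ, which matches the two-row result expected from the SELECT above.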


**Environment (please complete the following information):**
 - OS: Ubuntu 22.04 LTS
 - Memory:

    $ cat /proc/meminfo
    MemTotal: 197600692 kB
    MemFree: 15425872 kB
    MemAvailable: 122060496 kB
    Buffers: 7822696 kB
    Cached: 82220176 kB
    SwapCached: 87792 kB
    Active: 67163900 kB
    Inactive: 86333420 kB
    Active(anon): 126200 kB
    Inactive(anon): 63716424 kB
    Active(file): 67037700 kB
    Inactive(file): 22616996 kB
    Unevictable: 64 kB
    Mlocked: 64 kB
    SwapTotal: 15999996 kB
    SwapFree: 14018300 kB
    Zswap: 0 kB
    Zswapped: 0 kB
    Dirty: 64 kB
    Writeback: 0 kB
    AnonPages: 63381596 kB
    Mapped: 2093172 kB
    Shmem: 388168 kB
    KReclaimable: 18925672 kB
    Slab: 21568468 kB
    SReclaimable: 18925672 kB
    SUnreclaim: 2642796 kB
    KernelStack: 59568 kB
    PageTables: 255620 kB
    NFS_Unstable: 0 kB
    Bounce: 0 kB
    WritebackTmp: 0 kB
    CommitLimit: 114800340 kB
    Committed_AS: 90395840 kB
    VmallocTotal: 13743895347199 kB
    VmallocUsed: 629640 kB
    VmallocChunk: 0 kB
    Percpu: 165888 kB
    HardwareCorrupted: 0 kB
    AnonHugePages: 0 kB
    ShmemHugePages: 0 kB
    ShmemPmdMapped: 0 kB
    FileHugePages: 0 kB
    FilePmdMapped: 0 kB
    HugePages_Total: 0
    HugePages_Free: 0
    HugePages_Rsvd: 0
    HugePages_Surp: 0
    Hugepagesize: 2048 kB
    Hugetlb: 0 kB
    DirectMap4k: 3063476 kB
    DirectMap2M: 142442496 kB
    DirectMap1G: 56623104 kB

 - CPU: 32 cores, 64 threads

    $ cat /proc/cpuinfo
    processor : 0
    vendor_id : AuthenticAMD
    cpu family : 25
    model : 17
    model name : AMD EPYC 9354P 32-Core Processor
    stepping : 1
    microcode : 0xa101116
    cpu MHz : 1496.229
    cache size : 1024 KB
    physical id : 0
    siblings : 64
    core id : 0
    cpu cores : 32
    apicid : 0
    initial apicid : 0
    fpu : yes
    fpu_exception : yes
    cpuid level : 16
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 invpcid_single hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd amd_ppin cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq la57 rdpid overflow_recov succor smca fsrm flush_l1d
    bugs : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
    bogomips : 6500.11
    TLB size : 3584 4K pages
    clflush size : 64
    cache_alignment : 64
    address sizes : 52 bits physical, 57 bits virtual
    power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]
    ...

 - current Disk Space:

    $ df -h
    Filesystem      Size  Used Avail Use% Mounted on
    ...
    /dev/nvme0n1p3  451G  369G   60G  87% /
    ...


 - TDengine Version: 3.3.0.0

yu285 commented 3 months ago

Thanks for your feedback. We are going to check on this.

zk66214 commented 3 months ago

I can't reproduce this issue. I need more details about the version information; could you please run "taosd -V" and paste the output here?

mev commented 3 months ago

> I can't reproduce this issue. I need more details about the version information; could you please run "taosd -V" and paste the output here?

Here it is:

$ taosd -V
community version: 3.3.0.0 compatible_version: 3.0.0.0
gitinfo: no git commit id
buildInfo: Built Linux-x64 at 2024-06-05 13:51:21 +0300

I built it locally and the code is the one from the 3.3.0.0 release from GitHub, available here: https://github.com/taosdata/TDengine/releases/tag/ver-3.3.0.0.

hjxilinx commented 3 months ago

Thanks for reporting this issue. We have fixed it in this commit: https://github.com/taosdata/TDengine/commit/9300b7a4019bc5b2a800eb55b465bb59ccee1d3a

This bug has been fixed since 3.3.0.3. The newest released version, which is 3.3.1.0, is recommended.
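
To check whether a given installation already contains the fix, one option is to compare the version printed by `taosd -V` against 3.3.0.3. The following Python sketch assumes the output format shown earlier in this thread ("community version: 3.3.0.0 ..."); other builds may print something different. The names `parse_version`, `has_fix` and `FIXED_IN` are illustrative, not part of any TDengine tool.

```python
import re

# First release that contains the fix, according to the maintainer's comment above.
FIXED_IN = (3, 3, 0, 3)


def parse_version(text):
    """Extract the first four-part version that follows 'version:' in the text.

    Assumption: the text looks like the `taosd -V` output quoted in this thread.
    """
    m = re.search(r"version:\s*(\d+)\.(\d+)\.(\d+)\.(\d+)", text)
    if m is None:
        raise ValueError("no version found in: %r" % text)
    return tuple(int(part) for part in m.groups())


def has_fix(text):
    """True when the reported version is 3.3.0.3 or newer."""
    return parse_version(text) >= FIXED_IN


if __name__ == "__main__":
    reported = "community version: 3.3.0.0 compatible_version: 3.0.0.0"
    print(parse_version(reported), has_fix(reported))
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as "3.3.0.10" sorting before "3.3.0.3".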