Open melvynator opened 2 months ago
In be.INFO I can find the following:
W0912 11:56:37.092803 43311 fragment_mgr.cpp:201] Fail to open fragment e8435333-64f6-ac43-8de0-7390222881b5: Cancelled: Cancelled because of runtime state is cancelled
/build/starrocks/be/src/exec/connector_scan_node.cpp:66 _data_source->get_next(state, chunk)
/build/starrocks/be/src/runtime/plan_fragment_executor.cpp:322 _plan->get_next(_runtime_state, &_chunk, &_done)
/build/starrocks/be/src/runtime/plan_fragment_executor.cpp:228 _get_next_internal_vectorized(&chunk)
I0912 11:56:37.094070 43311 plan_fragment_executor.cpp:483] Fragment e8435333-64f6-ac43-8de0-7390222881b5:(Active: 10m, non-child: 0.23%)
- InstanceAllocatedMemoryUsage: 233.76 GB
- InstanceDeallocatedMemoryUsage: 143.10 GB
- InstancePeakMemoryUsage: 237.51 MB
- MemoryLimit: -1.00 B
- RowsProduced: 87.63M
OlapTableSink:(Active: 3m, non-child: 29.97%)
- TxnID: 3
- IndexNum: 1
- ReplicatedStorage: true
- AutomaticPartition: false
- AllocAutoIncrementTime: 4.290ms
- CloseWaitTime: 0.000ns
- OpenTime: 1.572ms
- PrepareDataTime: 9s057ms
- ConvertChunkTime: 49.047ms
- ValidateDataTime: 7s773ms
- RowsFiltered: 0
- RowsRead: 87.63M
- RowsReturned: 87.63M
- RpcClientSideTime: 3m31s
- RpcServerSideTime: 0.000ns
- RpcServerWaitFlushTime: 0.000ns
- SendDataTime: 2m50s
- PackChunkTime: 42s199ms
- SendRpcTime: 0.000ns
- CompressTime: 0.000ns
- SerializeChunkTime: 0.000ns
- WaitResponseTime: 2m5s
FILE_SCAN_NODE (id=0):(Active: 6m59s, non-child: 69.80%)
- BytesRead: 0
- IOTaskExecTime: 0.000ns
- IOTaskWaitTime: 0.000ns
- NumDiskAccess: 0
- PeakMemoryUsage: 0
- RowsRead: 0
- RowsReturned: 87.63M
- RowsReturnedRate: 208.98 K/sec
- ScanTime: 8m19s
- ScannerQueueCounter: 147
- ScannerQueueTime: 9m56s
- ScannerThreadsInvoluntaryContextSwitches: 0
- ScannerThreadsTotalWallClockTime: 0.000ns
- MaterializeTupleTime(*): 0.000ns
- ScannerThreadsSysTime: 0.000ns
- ScannerThreadsUserTime: 0.000ns
- ScannerThreadsVoluntaryContextSwitches: 0
- TotalRawReadTime(*): 0.000ns
- TotalReadThroughput: 0.00 /sec
DataSource:
- DataSourceType: FileDataSource
- FileScanner: 0
- CastChunkTime: 0.000ns
- CreateChunkTime: 0.000ns
- FileReadCount: 0
- FileReadTime: 0.000ns
- FillTime: 0.000ns
- MaterializeTime: 0.000ns
- ReadTime: 0.000ns
- ScannerTotalTime: 0.000ns
W0912 11:56:37.094082 43311 stream_load_executor.cpp:102] fragment execute failed, query_id=e843533364f6ac43-8de07390222881b4, err_msg=Cancelled because of runtime state is cancelled, id=e843533364f6ac43-8de07390222881b4, job_id=-1, txn_id: 3, label=hits_tsv_1726141596, db=hits
W0912 11:56:37.094142 43680 stream_load.cpp:373] append body content failed. errmsg=Cancelled: Cancelled because of runtime state is cancelled
/build/starrocks/be/src/exec/connector_scan_node.cpp:66 _data_source->get_next(state, chunk)
/build/starrocks/be/src/runtime/plan_fragment_executor.cpp:322 _plan->get_next(_runtime_state, &_chunk, &_done)
/build/starrocks/be/src/runtime/plan_fragment_executor.cpp:228 _get_next_internal_vectorized(&chunk) context=id=e843533364f6ac43-8de07390222881b4, job_id=-1, txn_id: 3, label=hits_tsv_1726141596, db=hits
In the fe.warn.log I found:
2024-09-12 11:57:51.348Z WARN (thrift-server-pool-0|165) [FrontendServiceImpl.loadTxnRollback():1489] failed to rollback txn 3: transaction not found: 3
@melvynator Sorry for late delay. Let me have a reproduce in the 3.1.15
@chaoyli Any luck?
@melvynator No luck, I'll push our team to fix it this week. They are on the national holiday last week.
@melvynator thanks for your feedback. which type of instance did you test on? Could you provide some more details about your test environment?
@melvynator Thanks for your feedback. I tried to reproduce this issue in 3.1.15, but I inserted data successfully. Could you provide some details about your test to help us reproduce this issue ?
Yes, I used a c6a.4xlarge instance on AWS with a Linux flavor.
I was able to load the data, but to do so, I resorted to cutting the file into 5 parts and inserting them each individually; this approach worked.
Yes, I used a c6a.4xlarge instance on AWS with a Linux flavor.
I was able to load the data, but to do so, I resorted to cutting the file into 5 parts and inserting them each individually; this approach worked.
@melvynator Thanks for your feedback. I guess the reason for the insert failure is that the c6a.4xlarge instance has lower specifications.
I'm reproducing this benchmark on the latest stable version: https://releases.starrocks.io/starrocks/StarRocks-3.1.15-centos-amd64.tar.gz
Steps to reproduce the behavior (Required)
@chaoyli I'm trying to reproduce this benchmark: https://github.com/ClickHouse/ClickBench/blob/main/starrocks/benchmark.sh
Whenever I run this section:
Starrocks fail to insert data with an obscure message.
Expected behavior (Required)
The insert should work.
Real behavior (Required)
I've got the following error:
StarRocks version (Required)
3.1.15