Closed alberttwong closed 5 months ago
it seems to be doing 250k rows at a time.
it just stops.
2m47s 26,760,893 153620 r/s | 0 B/s | 81% MEM | 27% CPU 2024-03-15 10:15:21 DBG loading /var/folders/xn/_h8gfpfs3vv_m6gkxm8t0l380000gn/T/starrocks/testing/user_behavior_tmp/2024-03-15T101233.285/part.01.0108.csv [21 MB] ds.1710522751246.ooT-0
2024-03-15 10:15:21 DBG stream-load completed for /var/folders/xn/_h8gfpfs3vv_m6gkxm8t0l380000gn/T/starrocks/testing/user_behavior_tmp/2024-03-15T101233.285/part.01.0108.csv => {
"TxnId": 629,
"Label": "5b7e4e91-8cd6-418c-bd6c-3524d8136fec",
"Status": "Success",
"Message": "OK",
"NumberTotalRows": 250000,
"NumberLoadedRows": 250000,
"NumberFilteredRows": 0,
"NumberUnselectedRows": 0,
"LoadBytes": 21160895,
"LoadTimeMs": 251,
"BeginTxnTimeMs": 0,
"StreamLoadPlanTimeMs": 1,
"ReadDataTimeMs": 96,
"WriteDataTimeMs": 233,
"CommitAndPublishTimeMs": 15
}
2m49s 27,083,971 154568 r/s | 0 B/s | 81% MEM | 32% CPU 2024-03-15 10:15:23 DBG loading /var/folders/xn/_h8gfpfs3vv_m6gkxm8t0l380000gn/T/starrocks/testing/user_behavior_tmp/2024-03-15T101233.285/part.01.0109.csv [21 MB] ds.1710522751246.ooT-0
2024-03-15 10:15:23 DBG stream-load completed for /var/folders/xn/_h8gfpfs3vv_m6gkxm8t0l380000gn/T/starrocks/testing/user_behavior_tmp/2024-03-15T101233.285/part.01.0109.csv => {
"TxnId": 630,
"Label": "d9f284b7-b4dc-4331-be59-9bfb249655eb",
"Status": "Success",
"Message": "OK",
"NumberTotalRows": 250000,
"NumberLoadedRows": 250000,
"NumberFilteredRows": 0,
"NumberUnselectedRows": 0,
"LoadBytes": 21160259,
"LoadTimeMs": 251,
"BeginTxnTimeMs": 0,
"StreamLoadPlanTimeMs": 0,
"ReadDataTimeMs": 109,
"WriteDataTimeMs": 235,
"CommitAndPublishTimeMs": 15
}
2m50s 27,235,258 154357 r/s | 0 B/s | 81% MEM | 33% CPU 2024-03-15 10:15:24 DBG loading /var/folders/xn/_h8gfpfs3vv_m6gkxm8t0l380000gn/T/starrocks/testing/user_behavior_tmp/2024-03-15T101233.285/part.01.0110.csv [21 MB] ds.1710522751246.ooT-0
3m44s 32,000,001 29977 r/s | 0 B/s | 81% MEM | 13% CPU
everything looks fine from the starrocks side. what is weird is that the txn id is 643 but it stopped on 630 on the sling side.
2024-03-15 17:17:39,990 INFO (AutoStatistic|27) [StatisticAutoCollector.runAfterCatalogReady():70] auto collect full statistic on all databases start
2024-03-15 17:17:39,991 INFO (AutoStatistic|27) [StatisticAutoCollector.runAfterCatalogReady():88] auto collect full statistic on all databases end
2024-03-15 17:17:40,156 INFO (tablet scheduler|35) [ClusterLoadStatistic.classifyBackendByLoad():163] classify backend by load. medium: HDD, avg load score: 1.0
, low/mid/high: 0/1/0
2024-03-15 17:17:40,156 INFO (tablet scheduler|35) [TabletScheduler.updateClusterLoadStatistic():481] update cluster load statistic:
be id: 10004, is available: true, mediums: [{medium: HDD, replica: 144, used: 2102149191, total: 141.5GB, score: 1.0},{medium: SSD, replica: 0, used: 0, tot
al: 0B, score: NaN},], paths: [{path: /data/deploy/starrocks/be/storage, path hash: -3235560536975967549, be: 10004, medium: HDD, used: 2102149191, total: 15202
2007879},]
2024-03-15 17:17:41,697 INFO (MemoryUsageTracker|50) [MemoryUsageTracker.trackMemory():119] (71ms) Module Backup - BackupHandler estimated 44.1MB of memory. Con
tains BackupOrRestoreJob with 0 object(s).
2024-03-15 17:17:41,697 INFO (MemoryUsageTracker|50) [MemoryUsageTracker.trackMemory():119] (0ms) Module Coordinator - QeProcessorImpl estimated 224B of memory.
Contains QueryInfo with 0 object(s).
2024-03-15 17:17:41,697 INFO (MemoryUsageTracker|50) [MemoryUsageTracker.trackMemory():119] (0ms) Module Delete - DeleteMgr estimated 11.7KB of memory. Contains
DeleteInfo with 29 object(s).
2024-03-15 17:17:41,697 INFO (MemoryUsageTracker|50) [MemoryUsageTracker.trackMemory():119] (0ms) Module Dict - CacheDictManager estimated 1.7KB of memory. Cont
ains ColumnDict with 0 object(s).
2024-03-15 17:17:41,697 INFO (MemoryUsageTracker|50) [MemoryUsageTracker.trackMemory():119] (0ms) Module Export - ExportMgr estimated 248B of memory. Contains E
xportJob with 0 object(s).
2024-03-15 17:17:41,739 INFO (MemoryUsageTracker|50) [MemoryUsageTracker.trackMemory():119] (42ms) Module Load - LoadMgr estimated 44.2MB of memory. Contains Lo
adJob with 111 object(s).
2024-03-15 17:17:41,739 INFO (MemoryUsageTracker|50) [MemoryUsageTracker.trackMemory():119] (0ms) Module Load - RoutineLoadMgr estimated 680B of memory. Contain
s RoutineLoad with 0 object(s).
2024-03-15 17:17:41,739 INFO (MemoryUsageTracker|50) [MemoryUsageTracker.trackMemory():119] (0ms) Module Load - StreamLoadMgr estimated 440B of memory. Contains
StreamLoad with 0 object(s).
2024-03-15 17:17:41,739 INFO (MemoryUsageTracker|50) [MemoryUsageTracker.trackMemory():119] (0ms) Module LocalCatalog - InternalCatalogMemoryTracker estimated 8
1.3KB of memory. Contains Partition with 29 object(s).
2024-03-15 17:17:41,739 INFO (MemoryUsageTracker|50) [MemoryUsageTracker.trackMemory():119] (0ms) Module Profile - ProfileManager estimated 112B of memory. Cont
ains Profile with 0 object(s). LoadProfile with 0 object(s).
2024-03-15 17:17:41,758 INFO (MemoryUsageTracker|50) [MemoryUsageTracker.trackMemory():119] (19ms) Module Report - ReportHandler estimated 15.1MB of memory. Con
tains PendingTask with 0 object(s). ReportQueue with 0 object(s).
2024-03-15 17:17:41,759 INFO (MemoryUsageTracker|50) [MemoryUsageTracker.trackMemory():119] (1ms) Module Tablet - TabletInvertedIndex estimated 93.1KB of memory
. Contains TabletMeta with 144 object(s).
2024-03-15 17:17:41,759 INFO (MemoryUsageTracker|50) [MemoryUsageTracker.trackMemory():119] (0ms) Module Task - TaskManager estimated 80B of memory. Contains Ta
sk with 0 object(s).
2024-03-15 17:17:41,802 INFO (MemoryUsageTracker|50) [MemoryUsageTracker.trackMemory():119] (43ms) Module Task - TaskRunManager estimated 44.3MB of memory. Cont
ains PendingTaskRun with 0 object(s). RunningTaskRun with 0 object(s). HistoryTaskRun with 0 object(s).
2024-03-15 17:17:41,835 INFO (MemoryUsageTracker|50) [MemoryUsageTracker.trackMemory():119] (33ms) Module Transaction - GlobalTransactionMgr estimated 44.3MB of
memory. Contains Transaction with 340 object(s).
2024-03-15 17:17:43,546 INFO (statistics meta manager|26) [StatisticExecutor.executeDML():338] statistic execute DML | QueryId [ef55262b-e2ef-11ee-b916-0242ac11
0002] | SQL: DELETE FROM table_statistic_v1 WHERE TABLE_ID = 10672
2024-03-15 17:17:43,547 INFO (statistics meta manager|26) [DatabaseTransactionMgr.beginTransaction():315] begin transaction: txn_id: 632 with label delete_78dd7
a52-5911-49ac-bb7d-e13e02a51ddb from coordinator FE: 127.0.0.1, listner id: 10682
2024-03-15 17:17:43,548 INFO (statistics meta manager|26) [TxnStateCallbackFactory.addCallback():37] add callback of txn state : 10682. current callback size: 1
2024-03-15 17:17:43,548 INFO (statistics meta manager|26) [OlapDeleteJob.run():189] countDownLatch count: 10
2024-03-15 17:17:43,548 INFO (statistics meta manager|26) [OlapDeleteJob.run():192] waiting delete Job finish, signature: 632, timeout: 30000
2024-03-15 17:17:43,550 INFO (thrift-server-pool-0|130) [PushTask.countDownLatch():218] pushTask current latch count: 9. backend: 10004, tablet:10025
2024-03-15 17:17:43,551 INFO (thrift-server-pool-5|140) [PushTask.countDownLatch():218] pushTask current latch count: 8. backend: 10004, tablet:10027
2024-03-15 17:17:43,551 INFO (thrift-server-pool-4|134) [PushTask.countDownLatch():218] pushTask current latch count: 7. backend: 10004, tablet:10023
2024-03-15 17:17:43,551 INFO (thrift-server-pool-1|131) [PushTask.countDownLatch():218] pushTask current latch count: 6. backend: 10004, tablet:10021
2024-03-15 17:17:43,551 INFO (thrift-server-pool-2|132) [PushTask.countDownLatch():218] pushTask current latch count: 5. backend: 10004, tablet:10019
2024-03-15 17:17:43,551 INFO (thrift-server-pool-0|130) [PushTask.countDownLatch():218] pushTask current latch count: 4. backend: 10004, tablet:10017
2024-03-15 17:17:43,552 INFO (thrift-server-pool-5|140) [PushTask.countDownLatch():218] pushTask current latch count: 3. backend: 10004, tablet:10015
2024-03-15 17:17:43,552 INFO (thrift-server-pool-4|134) [PushTask.countDownLatch():218] pushTask current latch count: 2. backend: 10004, tablet:10013
2024-03-15 17:17:43,552 INFO (thrift-server-pool-1|131) [PushTask.countDownLatch():218] pushTask current latch count: 1. backend: 10004, tablet:10011
2024-03-15 17:17:43,552 INFO (thrift-server-pool-2|132) [PushTask.countDownLatch():218] pushTask current latch count: 0. backend: 10004, tablet:10009
2024-03-15 17:17:43,552 INFO (statistics meta manager|26) [OlapDeleteJob.run():215] delete job finish, countDownLatch count: 0
2024-03-15 17:17:43,552 WARN (statistics meta manager|26) [OlapDeleteJob.run():227]
2024-03-15 17:17:43,552 INFO (statistics meta manager|26) [OlapDeleteJob.checkAndUpdateQuorum():303] check delete job quorum, txn_id: 632, total tablets: 10, qu
orum tablets: 10,
2024-03-15 17:17:43,554 INFO (statistics meta manager|26) [DatabaseTransactionMgr.commitTransaction():461] transaction:[TransactionState. txn_id: 632, label: de
lete_78dd7a52-5911-49ac-bb7d-e13e02a51ddb, db id: 10005, table id list: 10007, callback id: 10682, coordinator: FE: 127.0.0.1, transaction status: COMMITTED, er
ror replicas num: 0, replica ids: , prepare time: 1710523063547, commit time: 1710523063552, finish time: -1, write cost: 5ms, reason: attachment: com.starrock
s.transaction.InsertTxnCommitAttachment@2aa3b57b tabletCommitInfos size: 10] successfully committed
2024-03-15 17:17:43,556 INFO (PUBLISH_VERSION|24) [PublishVersionDaemon.publishVersionForOlapTable():237] send publish tasks for txn_id: 632
2024-03-15 17:17:43,568 INFO (PUBLISH_VERSION|24) [DeleteMgr.recordFinishedJob():419] record finished deleteJob, transactionId 632, dbId 10005
2024-03-15 17:17:43,568 INFO (PUBLISH_VERSION|24) [TxnStateCallbackFactory.removeCallback():44] remove callback of txn state : 10682. current callback size: 0
2024-03-15 17:17:43,568 INFO (statistics meta manager|26) [StatisticExecutor.executeDML():338] statistic execute DML | QueryId [ef58818c-e2ef-11ee-b916-0242ac11
0002] | SQL: DELETE FROM column_statistics WHERE TABLE_ID = 10672
2024-03-15 17:17:43,569 INFO (PUBLISH_VERSION|24) [DatabaseTransactionMgr.finishTransaction():1004] finish transaction TransactionState. txn_id: 632, label: del
ete_78dd7a52-5911-49ac-bb7d-e13e02a51ddb, db id: 10005, table id list: 10007, callback id: 10682, coordinator: FE: 127.0.0.1, transaction status: VISIBLE, error
replicas num: 0, replica ids: , prepare time: 1710523063547, commit time: 1710523063552, finish time: 1710523063567, write cost: 5ms, wait for publish cost: 4m
s, publish rpc cost: 3ms, finish txn cost: 8ms, publish total cost: 15ms, total cost: 20ms, reason: attachment: com.starrocks.transaction.InsertTxnCommitAttach
ment@2aa3b57b tabletCommitInfos size: 10 successfully
2024-03-15 17:17:43,571 INFO (statistics meta manager|26) [DatabaseTransactionMgr.beginTransaction():315] begin transaction: txn_id: 643 with label delete_ef588
18c-e2ef-11ee-b916-0242ac110002 from coordinator FE: 127.0.0.1, listner id: -1
2024-03-15 17:17:43,572 INFO (statistics meta manager|26) [QeProcessorImpl.registerQuery():103] register query id = ef58818c-e2ef-11ee-b916-0242ac110002
2024-03-15 17:17:43,572 INFO (statistics meta manager|26) [Coordinator.prepareResultSink():708] dispatch load job: ef58818c-e2ef-11ee-b916-0242ac110002 to [TNet
workAddress(hostname:127.0.0.1, port:9060)]
2024-03-15 17:17:43,605 INFO (statistics meta manager|26) [DatabaseTransactionMgr.commitTransaction():461] transaction:[TransactionState. txn_id: 643, label: de
lete_ef58818c-e2ef-11ee-b916-0242ac110002, db id: 10005, table id list: 10030, callback id: -1, coordinator: FE: 127.0.0.1, transaction status: COMMITTED, error
replicas num: 0, replica ids: , prepare time: 1710523063571, commit time: 1710523063604, finish time: -1, write cost: 33ms, reason: attachment: com.starrocks.
transaction.InsertTxnCommitAttachment@4c5e5f16 tabletCommitInfos size: 10] successfully committed
2024-03-15 17:17:43,613 INFO (PUBLISH_VERSION|24) [PublishVersionDaemon.publishVersionForOlapTable():237] send publish tasks for txn_id: 643
2024-03-15 17:17:43,668 INFO (statistics meta manager|26) [QeProcessorImpl.unregisterQuery():142] deregister query id = ef58818c-e2ef-11ee-b916-0242ac110002
2024-03-15 17:17:43,668 INFO (PUBLISH_VERSION|24) [DatabaseTransactionMgr.finishTransaction():1004] finish transaction TransactionState. txn_id: 643, label: del
ete_ef58818c-e2ef-11ee-b916-0242ac110002, db id: 10005, table id list: 10030, callback id: -1, coordinator: FE: 127.0.0.1, transaction status: VISIBLE, error re
plicas num: 0, replica ids: , prepare time: 1710523063571, commit time: 1710523063604, finish time: 1710523063667, write cost: 33ms, wait for publish cost: 9ms,
publish rpc cost: 48ms, finish txn cost: 6ms, publish total cost: 63ms, total cost: 96ms, reason: attachment: com.starrocks.transaction.InsertTxnCommitAttachm
ent@4c5e5f16 tabletCommitInfos size: 10 successfully
2024-03-15 17:17:45,494 INFO (ReportHandler|120) [ReportHandler.tabletReport():404] backend[10004] reports 144 tablet(s). report version: 17105215500450
2024-03-15 17:17:45,495 INFO (ReportHandler|120) [TabletInvertedIndex.tabletReport():303] finished to do tablet diff with backend[10004]. sync: 0. metaDel: 0. f
oundValid: 144. foundInvalid: 0. migration: 0. found invalid transactions 0. found republish transactions 0 cost: 0 ms
just ran it again... so eventually sling's updates stop and then updates in in SR stop. So there is a lag on sling's console.
weird, will look closer, thanks
Hey, did some testing on, and I believe this is happening on the SR side. Possibly a bug in the STREAM LOAD interface?
See attached sr_test.zip file with 2 files: README.md
and strea_load.sh
.
Basically, we're having sling
convert the parquet file into many CSV files, and then attempting to stream load via curl
commands. It's getting stuck as well...
It looks like we have to pass another parameter to starrocks stream load. https://github.com/StarRocks/starrocks/issues/43359#issuecomment-2033694374
Thanks, will add. Although, won't the timeout in the Header fail the job (if reached)? I'll try some test.
v1.2.3
released with changes
Closing this, feel free to re-open.
related to https://github.com/slingdata-io/sling-cli/issues/204. It just stops loading more rows.