vesoft-inc / nebula-importer

Nebula Graph Importer with Go
Apache License 2.0

Importer loses session during load #214

Closed goranc closed 1 year ago

goranc commented 2 years ago

I'm testing a data load into Nebula 3.2.0 with a dataset we already have in a Nebula 1.2.1 cluster. The problem is that nebula-importer loses its session during the import and cannot reconnect; I have to stop the load and restart it to continue. The importer is configured to load data from CSV files using 4 connections per node on a 4-node cluster, so 16 workers in total, with a batch size of 16 records per batch.
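For reference, a rough sketch of a v3.x-era importer YAML config that matches the setup described above (4 graphd addresses, 16 concurrent clients, batch size 16). Addresses, credentials, space name, file paths and the prop type are placeholders/assumptions, and key names may differ slightly between importer versions, so check the examples shipped with your importer release:

version: v2
description: bulk load from CSV
clientSettings:
  retry: 3
  concurrency: 16          # 4 connections per graphd x 4 graphd nodes
  channelBufferSize: 128
  space: my_space          # placeholder space name
  connection:
    user: root
    password: nebula
    address: graphd1:9669,graphd2:9669,graphd3:9669,graphd4:9669
logPath: ./err/import.log
files:
  - path: ./vertices.csv   # placeholder file path
    failDataPath: ./err/vertices.csv
    batchSize: 16
    type: csv
    csv:
      withHeader: false
      withLabel: false
    schema:
      type: vertex
      vertex:
        vid:
          index: 0
        tags:
          - name: type
            props:
              - name: sample_type
                type: string   # prop type assumed for the example
                index: 1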

Here is the error I got on the console (the data part is elided, represented with three dots):

2022/08/19 13:21:36 [INFO] statsmgr.go:89: Tick: Time(210.00s), Finished(13407184), Failed(0), Read Failed(0), Latency AVG(3739us), Batches Req AVG(3953us), Rows AVG(63843.68/s)
2022/08/19 13:21:41 [INFO] statsmgr.go:89: Tick: Time(215.00s), Finished(13645728), Failed(0), Read Failed(0), Latency AVG(3739us), Batches Req AVG(3953us), Rows AVG(63468.49/s)
2022/08/19 13:21:43 [ERROR] handler.go:63: Client 12 fail to execute: INSERT VERTEX `type`(`sample_type`) VALUES  (...) ;, ErrMsg: Get sessionId[1660908190175803] failed: Session `1660908190175803' not found: Session not existed!, ErrCode: -1002
2022/08/19 13:21:46 [INFO] statsmgr.go:89: Tick: Time(220.00s), Finished(13645984), Failed(16), Read Failed(0), Latency AVG(3739us), Batches Req AVG(3953us), Rows AVG(62027.09/s)

Are there any known issues regarding this?

I'm using nebula-importer 3.1 and will try 3.2, but I don't see any relevant changes in the newer version.

g.c.

goranc commented 2 years ago

Tested with nebula-importer 3.2 and the behavior is the same: after some time the session is dropped.

Here are the timeout settings and other relevant graphd config lines used on the cluster:

# Seconds before the idle connections are closed, 0 for never closed
--client_idle_timeout_secs=604800
# Seconds before the idle sessions are expired, 0 for no expiration
# --session_idle_timeout_secs=60000
--session_idle_timeout_secs=604800
# The number of threads to accept incoming connections
--num_accept_threads=128
# The number of networking IO threads, 0 for # of CPU cores
--num_netio_threads=0
# The number of threads to execute user queries, 0 for # of CPU cores
--num_worker_threads=0

wey-gu commented 2 years ago

@Aiee do you have clues on this issue, please?

goranc commented 2 years ago

Just thinking about the nebula-go driver used in the importer: it is 3.0. Should it be 3.2 according to the compatibility list?

veezhang commented 1 year ago

@goranc Hi, what is the version of nebula-importer? Does this problem occur every time? And have you tried the latest version?

goranc commented 1 year ago

I've tried the latest version and it seems to be working more stably.

I have a workaround with a Python wrapper that detects a stall and restarts the loader, but that shouldn't be necessary; it should be handled in the loader directly.

Currently I'm migrating the cluster to the new 3.4.0 version and will check further once the data is ready for import.

veezhang commented 1 year ago

@goranc We are glad that the new version helps you. Do you use Python to monitor the abnormal termination of nebula-importer?

goranc commented 1 year ago

Yes, it is just a control script that monitors progress reports from the main script about data processing, and restarts the loader if no progress is detected.
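The actual wrapper isn't posted in this thread, but a minimal sketch of the idea in Python could look like the following: it follows the importer's stdout, tracks the Finished(...) counter from the Tick log lines, and kills and restarts the process when the counter stops growing. The command line, config path, and stall timeout are placeholders, not the values used above.

# watchdog.py - restart nebula-importer when its progress counter stalls.
# Illustrative sketch only; command, config path, and timeouts are placeholders.
import re
import subprocess
import threading
import time

CMD = ["./nebula-importer", "--config", "importer.yaml"]
STALL_SECS = 120  # restart if Finished(...) does not grow for this long

def run_until_stalled():
    proc = subprocess.Popen(
        CMD, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    last_progress = time.time()
    last_count = -1

    def reader():
        # Follow the importer's log output and record progress timestamps.
        nonlocal last_progress, last_count
        for line in proc.stdout:
            print(line, end="")
            m = re.search(r"Finished\((\d+)\)", line)
            if m and int(m.group(1)) > last_count:
                last_count = int(m.group(1))
                last_progress = time.time()

    threading.Thread(target=reader, daemon=True).start()

    while proc.poll() is None:
        if time.time() - last_progress > STALL_SECS:
            proc.kill()   # importer appears stuck: kill it and signal a restart
            proc.wait()
            return False
        time.sleep(5)
    return proc.returncode == 0  # exited on its own

if __name__ == "__main__":
    while not run_until_stalled():
        print("progress stalled, restarting importer...")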

QingZ11 commented 1 year ago

@goranc Hi, I have noticed that the issue you created hasn't been updated for nearly a month. Has this issue been resolved? If not, can you provide some more information? If it has been solved, can you close this issue?

goranc commented 1 year ago

I'll try the new 4.0 version of nebula-importer, and if there is still an issue I'll open a new issue report. Let's close this one.