kerven88 opened 2 years ago
I ran a test on 500k documents. Very quick: it took 30s. The log is below:
2021/11/25 14:30:52 total exported 430000 items; total_raw_bytes: 334.99 MB
2021/11/25 14:30:52 total exported 440000 items; total_raw_bytes: 342.68 MB
2021/11/25 14:30:53 total exported 450000 items; total_raw_bytes: 350.58 MB
2021/11/25 14:30:54 total exported 460000 items; total_raw_bytes: 358.24 MB
2021/11/25 14:30:54 total exported 470000 items; total_raw_bytes: 365.94 MB
2021/11/25 14:30:55 total exported 480000 items; total_raw_bytes: 373.64 MB
2021/11/25 14:30:55 total exported 490000 items; total_raw_bytes: 381.29 MB
2021/11/25 14:30:56 total exported 500000 items; total_raw_bytes: 389.01 MB;
2021/11/25 14:30:56 export time spend 30.310132739s
[root@upretsops tmp]# ll -h my_index.json
-rw-r--r-- 1 root root 390M 11月 25 14:30 my_index.json
[root@upretsops tmp]# more my_index.json
{"ID":"163820688","RawData":{"id":163820688,"asset":{"animationUrl":null,"addre":"0x7bd29408f11d2bfc23c34f18275bbf23bb716bc7","nftI
d":536535,"description":"Meebit #9270","id":23305795,"imageUrl":"https://lh3.googleusercontent.com/cQgwa0ZLZgw5L7pDZA4kjSiblU51fML0
HmwiI9dA2dOUZxcn1YAbvT4kEqGsvSInLGGGKFvWEGxne_pPw_qQF5pJ6lkiIufnxlt-jw","name":"Meebit #9270","tokenId":"9270"},"nftId":536535,"cha
in":"ETHEREUM","eventType":"OFFER_ENTERED","eventTimestamp":"2021-05-16T15:11:50.626499Z","unitPrice":"630000000000000000","price":
0.63,"quantity":"1","from":"0x33d27f0bb797de3d20e519f4a192b7570c56681b","to":"","seller":"","winner":"","tx":"","it":"2021-05-16T23
:12:00.795549594+08:00"}}
....
....
Is your command run locally on the Elasticsearch server, using the 127.0.0.1 local IP?
Yes. I run the command locally on the Elasticsearch server with the 127.0.0.1 local IP.
Does your server have enough memory?
I tested another index with bigger documents, about 20 MB per 10,000 docs.
[root@test tmp]# ./esdump export --es http://localhost:9200 --index tokens --gzip=false -o /tmp/my_index.json
2021/11/25 15:47:28 binary version: unknown version build Time: unknown time
2021/11/25 15:47:28 execute export
2021/11/25 15:47:28 export index tokens to /tmp/my_index.json
2021/11/25 15:47:28 init elasticsearch Client with url: http://localhost:9200
2021/11/25 15:47:28 export match: {"match_all":{}}
2021/11/25 15:47:29 total exported 10000 items; total_raw_bytes: 23.17 MB; esFetchtime 0.896356
2021/11/25 15:47:30 total exported 20000 items; total_raw_bytes: 36.91 MB; esFetchtime 0.616517
2021/11/25 15:47:31 total exported 30000 items; total_raw_bytes: 53.67 MB; esFetchtime 0.679568
2021/11/25 15:47:32 total exported 40000 items; total_raw_bytes: 69.70 MB; esFetchtime 0.645046
2021/11/25 15:47:32 total exported 50000 items; total_raw_bytes: 85.60 MB; esFetchtime 0.640776
2021/11/25 15:47:33 total exported 60000 items; total_raw_bytes: 104.32 MB; esFetchtime 0.709815
2021/11/25 15:47:35 total exported 70000 items; total_raw_bytes: 155.51 MB; esFetchtime 1.396025
2021/11/25 15:47:35 total exported 71433 items; total_raw_bytes: 166.76 MB; esFetchtime 0.338897
2021/11/25 15:47:36 export time spend 7.400013709s
Across the two tests, esdump's speed is 13-23 MB/s.
Please download the new release and run a new test. Show me part of the output log.
Server resources:
[root@ ~]# nproc
8
[root@ ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:            61G         48G        352M        873M         13G         12G
Swap:            0B          0B          0B
Built from the latest pulled code:
[root@ ~]# /usr/local/bin/esdump version
2021/11/25 09:05:09 binary version: unknown version build Time: unknown time
2021/11/25 09:05:09 execute version
2021/11/25 09:05:09 version time spend 1.398µs
dump log:
[root@ ~]# /usr/local/bin/esdump export \
  --es "http://127.0.0.1:9200" \
  --index "forseti-r-20211124" \
  --gzip=true \
  --o "/tmp/forseti-20211124.json.gz" \
  --MatchBody '{ "range": { "activity.S_T_VD_EVENTOCCURDATE.raw": { "gte": "2021-11-24 00:00:00", "lt": "2021-11-24 00:30:00" } } }'
2021/11/25 09:01:42 binary version: unknown version build Time: unknown time
2021/11/25 09:01:42 execute export
2021/11/25 09:01:42 export index forseti-r-20211124 to /tmp/forseti-20211124.json.gz
2021/11/25 09:01:42 init elasticsearch Client with url: http://127.0.0.1:9200
2021/11/25 09:01:42 export match: { "range": { "activity.S_T_VD_EVENTOCCURDATE.raw": { "gte": "2021-11-24 00:00:00", "lt": "2021-11-24 00:30:00" } } }
2021/11/25 09:02:03 total exported 10000 items; total_raw_bytes: 98.57 MB
2021/11/25 09:02:23 total exported 20000 items; total_raw_bytes: 196.10 MB
2021/11/25 09:02:37 total exported 30000 items; total_raw_bytes: 293.55 MB
2021/11/25 09:02:48 total exported 34214 items; total_raw_bytes: 334.53 MB; the gzip size: 87.01 MB
2021/11/25 09:02:48 export time spend 1m5.814624223s
The indicator "esFetchtime" seems to be missing from my export log?
Your version is old. I have tested a lot locally; 90% of the time is spent fetching data from ES.
Can you print out the version number? I pulled the latest master branch: 2021/11/25 09:05:09 binary version: unknown version build Time: unknown time
Or can you upload a latest binary version for me to test?
Download here: https://github.com/wxf4150/esdump/releases
The version string is injected by GitHub workflows. When a tag is pushed, the workflow defined in ".github/workflows/release.yml" builds the project and injects the pushed tag as the version.
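For reference, Go version injection of this kind usually relies on the linker's `-X` flag overriding a package-level variable at build time. A minimal sketch (the variable name `version` is an assumption; see the project's Makefile and release.yml for the real flags):

```go
package main

import "fmt"

// version holds a default that the build can override, e.g.:
//   go build -ldflags "-X main.version=$(git describe --tags)"
// Without the flag (plain `go build`), the default below is printed,
// which matches the "unknown version" seen in the logs above.
var version = "unknown version"

func main() {
	fmt.Println("binary version:", version)
}
```

Building from a plain `git clone` without the tag-aware ldflags therefore prints "unknown version", which is why the user's self-built binary shows no version.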
I have released a new version again: "fetch data" now runs in a new goroutine, so the store-data process no longer blocks fetching data from ES.
Export performance increased by about 10%.
2021/11/25 17:19:18 total exported 400000 items; total_raw_bytes: 312.31 MB; storeTime 0.000000
2021/11/25 17:19:18 total exported 410000 items; total_raw_bytes: 319.91 MB; storeTime 0.000000
2021/11/25 17:19:19 total exported 420000 items; total_raw_bytes: 327.42 MB; storeTime 0.000000
2021/11/25 17:19:19 total exported 430000 items; total_raw_bytes: 334.99 MB; storeTime 0.000000
2021/11/25 17:19:20 total exported 440000 items; total_raw_bytes: 342.68 MB; storeTime 0.000000
2021/11/25 17:19:20 total exported 450000 items; total_raw_bytes: 350.58 MB; storeTime 0.000000
2021/11/25 17:19:21 total exported 460000 items; total_raw_bytes: 358.24 MB; storeTime 0.000000
2021/11/25 17:19:21 total exported 470000 items; total_raw_bytes: 365.94 MB; storeTime 0.000000
2021/11/25 17:19:22 total exported 480000 items; total_raw_bytes: 373.64 MB; storeTime 0.000000
2021/11/25 17:19:22 total exported 490000 items; total_raw_bytes: 381.29 MB; storeTime 0.000000
2021/11/25 17:19:23 1000 pages FetchTime 5.236988501999992 s
2021/11/25 17:19:23 totalFetchTime 27.13077194699997 s
2021/11/25 17:19:23 total exported 500000 items; total_raw_bytes: 389.01 MB; storeTime 0.000000
2021/11/25 17:19:23 total exported 500000 items; total_raw_bytes: 389.01 MB; storeTime 0.000000
2021/11/25 17:19:23 export time spend 27.954677496s
The "totalFetchTime" log line (2021/11/25 17:19:23) is the total ES fetch time.
Maybe the fetch process should be parallelized with more goroutines, but I have no idea how: the go-es-client's fetch cursor is not a number (a page number).
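The decoupling described above can be sketched roughly like this (a hypothetical illustration, not esdump's actual source; `exportPipeline` and the page contents are made up). A buffered channel lets the fetch goroutine run ahead while the writer drains pages:

```go
package main

import "fmt"

// exportPipeline is a stand-in for the fetch/store split: the producer
// goroutine mimics the scroll loop against Elasticsearch, and the
// consumer mimics writing each page to the output file. The buffered
// channel means a slow write (disk, gzip) no longer blocks the next
// fetch request.
func exportPipeline(pages int) (totalBytes int) {
	ch := make(chan []byte, 4) // buffer lets fetch run ahead of store

	go func() { // producer: stand-in for the ES scroll loop
		defer close(ch)
		for i := 0; i < pages; i++ {
			ch <- []byte(fmt.Sprintf("page-%d", i))
		}
	}()

	for p := range ch { // consumer: stand-in for the file writer
		totalBytes += len(p)
	}
	return totalBytes
}

func main() {
	fmt.Println("stored bytes:", exportPipeline(10))
}
```

Because a scroll cursor is an opaque token rather than a page number, the producer side stays single-threaded here; only the store side is moved off the fetch path.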
Thank you very much, I'll test it later
My export timestamps don't look continuous; what could be the reason? Yours are continuous.
The export speed on my side does seem to be much slower.
2021/11/25 09:52:50 total exported 10000 items; total_raw_bytes: 97.41 MB; storeTime 0.000000
2021/11/25 09:53:13 total exported 20000 items; total_raw_bytes: 194.95 MB; storeTime 0.000000
2021/11/25 09:53:30 total exported 30000 items; total_raw_bytes: 293.59 MB; storeTime 0.000000
2021/11/25 09:53:50 total exported 40000 items; total_raw_bytes: 391.61 MB; storeTime 0.000000
2021/11/25 09:54:00 total exported 50000 items; total_raw_bytes: 489.30 MB; storeTime 0.000000
2021/11/25 09:54:08 totalFetchTime 95.76896522299998 s
2021/11/25 09:54:08 total exported 57985 items; total_raw_bytes: 566.50 MB; storeTime 0.000000
2021/11/25 09:54:08 export time spend 1m35.826526269s
@kerven88
Can you print out the version number? I pulled the latest master branch: 2021/11/25 09:05:09 binary version: unknown version build Time: unknown time
git pull --tags   # pulls all the tags
Then run "make linux-amd64", which builds the project with the latest tag as the version string.
See the Makefile in the project for more detail.
"The export speed on my side does seem to be much slower." Your result is "export time spend 1m35.826526269s" with totalFetchTime 95.7s; almost 100% of the time is spent fetching data from ES.
57985 items take 579 fetches from the ES server. 95s/579 = 0.164s, so every fetch request spends about 160ms; your ES API is slow.
Export size 566.50 MB / 95s = 5.9 MB/s. In all my local tests, the speed is 14 MB/s with small documents and goes up to 27 MB/s with bigger documents.
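The arithmetic above, as a quick sanity check (hypothetical helper names; the 579 request count and the 95s / 95.83s figures are read off the preceding comments and log):

```go
package main

import "fmt"

// perRequestMS divides the reported totalFetchTime by the number of
// scroll requests, giving the average latency per fetch.
func perRequestMS(totalFetchSecs float64, requests int) float64 {
	return totalFetchSecs / float64(requests) * 1000
}

// throughputMBs divides the raw exported bytes by the wall-clock time.
func throughputMBs(totalMB, elapsedSecs float64) float64 {
	return totalMB / elapsedSecs
}

func main() {
	fmt.Printf("per-request latency: %.0f ms\n", perRequestMS(95, 579)) // ~164 ms
	fmt.Printf("throughput: %.1f MB/s\n", throughputMBs(566.50, 95.83)) // ~5.9 MB/s
}
```

At ~164 ms per scroll request, the per-request latency alone accounts for essentially the whole export time, which is why the suspicion falls on the ES side rather than on esdump.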
Is your ES server IP a local IP, or does the traffic cross a 100M LAN?
The ES server IP is a local IP. Could it be a load issue with my ES cluster?
I tested on a single ES node, not a cluster.
Maybe your cluster network is slow.
You can test it with an scp command: scp a big file from one server to another.
Or it could be some other cluster issue.
Let's use Chinese. This is exhausting.
Haha, turns out you're a fellow Chinese speaker. Hilarious.
I'm on the ES node itself, exporting against the local IP, so in theory there is no network-bandwidth problem; it might be a load issue with my ES cluster.
Exporting 7 GB of data took 30 minutes.
2021/11/25 03:13:49 execute export
2021/11/25 03:44:16 total exported 830000 items; total_raw_bytes: 7678.78 MB