Open ajaybhatnagar opened 8 years ago
Great suggestions, I'll look into making some changes, thank you!
Is this project still active at all? Thumbs up for @ajaybhatnagar's request... However, Tornado, being based on Python (of course), is subject to the GIL and therefore probably cannot take advantage of N cores on your host, as I understand it. It would have to "go multiprocess" to do so.
The project is still active, although I don't actively use it myself as I don't have a need for it right now. If I were to write this today, I'd write it in Go, as it takes better advantage of multiple cores on your machine (see your comment) and is easier to deploy (just a single binary).
At any rate, if you have features you'd like to see added (like the pre-calculated strings, which IMO is a great idea), then I'm happy to review and merge PRs, but I don't have the time right now to implement anything new myself.
One more thing re: @biggers and the multi-core issue: you can just run multiple processes of the Python task to max out your CPU cores, the network, and your ES cluster's ingestion capacity. I know it's just a workaround, but it might solve your problems for now.
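A rough sketch of that workaround, in case it helps: it starts one copy of a command per CPU core and waits for all of them. The helper name `launch_workers` and the placeholder command are made up for illustration; swap in whatever invocation you normally use for the load generator.

```python
import os
import subprocess
import sys


def launch_workers(command, num_workers):
    """Start num_workers copies of command and wait for all of them.

    Returns the list of exit codes, one per worker process.
    """
    # Each Popen call starts an independent OS process, so the workers
    # are not limited by a single interpreter's GIL.
    procs = [subprocess.Popen(command) for _ in range(num_workers)]
    return [p.wait() for p in procs]


if __name__ == "__main__":
    # Placeholder command (assumption): replace with the real
    # load-generator command line and its arguments.
    placeholder_cmd = [sys.executable, "-c", "print('worker done')"]
    exit_codes = launch_workers(placeholder_cmd, os.cpu_count() or 1)
    print(exit_codes)
```

Each process generates and sends its own data independently, so total throughput should scale until the network or the cluster becomes the limit.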
Is it possible to reduce CPU usage by using predefined strings held in memory as field values instead of generating random strings each time? The reason for this request is that I observed 100% CPU utilization when running this tool. Each random string generation seems to consume CPU cycles. Further, as this is a single-threaded script, it does not make use of the available CPUs on multicore nodes. Thus I am not able to fully stress the Elasticsearch nodes. When single-thread CPU utilization reaches 100%, indexing latency increases even though CPU, load, memory, and IOPS are not a bottleneck on the ES node. Can the script offer a multi-threading option?
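To illustrate the predefined-strings idea, here is a minimal sketch, not taken from the tool's code: it builds a pool of random strings once at startup and then picks from that pool for every field value. The names `build_string_pool` and `make_document`, the pool size, and the field names are all assumptions for the example.

```python
import random
import string


def build_string_pool(pool_size=1000, length=16, seed=None):
    """Generate pool_size random strings once, up front."""
    rng = random.Random(seed)
    alphabet = string.ascii_letters + string.digits
    return ["".join(rng.choices(alphabet, k=length)) for _ in range(pool_size)]


def make_document(pool, field_names):
    """Build a document by picking a pre-generated string for each field."""
    # random.choice is a cheap index lookup compared with building a
    # fresh random string character by character for every field.
    return {name: random.choice(pool) for name in field_names}


# Hypothetical usage: field names are placeholders.
pool = build_string_pool(pool_size=1000, length=16, seed=42)
doc = make_document(pool, ["title", "body"])
```

The trade-off is lower cardinality in the indexed data, since values repeat across documents; a larger pool reduces that effect at the cost of memory.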
In addition to plain inserts, an option for issuing updates together with search queries would make it even better for simulating realistic cases.