Closed by gw0 8 years ago
Could we get an update on this? I would like to try Cassandra 3.0 if cassandra-loader is compatible with it. I loaded data with this tool into Cassandra 2.1.5 and it was very fast.
@xqchen1 can you please try this branch https://github.com/al3xandru/cassandra-loader/tree/issue-25 and report back? thanks
Thanks Alex. I finally got a chance to test the Cassandra Loader with C* 3.3 on a new cluster. It's working great!!!!!
Thanks again for all your help.
How fast is it?
@hzliang if your question is about how fast the loader is, there are way too many variables to take into account to talk any numbers. It starts with the size of your cluster, the level of parallelism the machine running the loader can handle, the size of each "record", and it goes all the way to tuning different parameters on both server side and the loader.
Thank you very much for your reply. I am running Cassandra 2.2.0 on six nodes, but I only get about 2000 rows/min when using cassandra-loader. That is too slow for me to load 3 TB of data.
I loaded 62 million rows of data into a table with 150 columns. The load rate was about 8200 rows per second on a 4-node cluster with 16 cores (32 logical)/SSDs/264GB memory. On a small cluster of 4 VM nodes, my load rate was about 1000 rows per second with 1 thread. You want to create smaller CSV files so you can run your job in parallel. If compaction is falling behind, you may want to lower the number of threads. You need to monitor where the bottleneck is. It's a tuning process.
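For the "smaller CSV files" step, here is a minimal sketch of one way to split a large file before loading. It is not part of cassandra-loader; the function name `split_csv`, the `chunk_NNNNN.csv` naming, and the `has_header` switch are all assumptions made for illustration. It uses Python's `csv` module so that quoted fields containing newlines are not cut in half. Note that the rows are re-serialized, so quoting in the output may differ from the input.

```python
import csv
from pathlib import Path


def split_csv(src, out_dir, rows_per_file, has_header=False):
    """Split src into chunk files holding at most rows_per_file data rows each.

    Hypothetical helper for illustration only. Returns the list of chunk paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    out = None
    writer = None
    count = 0
    with open(src, newline="") as f:
        reader = csv.reader(f)
        # If the input has a header line, repeat it at the top of every chunk.
        header = next(reader, None) if has_header else None
        try:
            for row in reader:
                # Start a new chunk on the first row or when the current one is full.
                if writer is None or count >= rows_per_file:
                    if out is not None:
                        out.close()
                    path = out_dir / f"chunk_{len(paths):05d}.csv"
                    out = open(path, "w", newline="")
                    writer = csv.writer(out)
                    if header is not None:
                        writer.writerow(header)
                    paths.append(path)
                    count = 0
                writer.writerow(row)
                count += 1
        finally:
            if out is not None:
                out.close()
    return paths
```

Each resulting chunk can then be handed to a separate loader run. How many to run at once depends on what the cluster can absorb, so start small and watch compaction as described above.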
I didn't close this issue, but yes we support 3.0 now.
Because Cassandra 3.0 changed some internal tables and older versions of drivers try to access them, both tools crash when trying to connect. Please update the Java driver.