Open zhaogxd opened 8 years ago
How old is your code? The latest version of fab_cassandra.py doesn't have Line 415. Can you please update the code and retry?
You are right: my copy of fab_cassandra.py was old, because I had just run 'sudo pip install cstar_perf.tool' in my home folder. It seems that pip grabbed an old version and installed it. I am not familiar with pip.
To get the latest version of cstar_perf, I ran 'git clone' to download it to a local path, then ran 'sudo pip install ./cstar_perf/tool' to install it.
Then I ran 'cstar_perf_bootstrap -v apache/cassandra-2.1'. This time the command went much further and seemed able to launch Cassandra on cnode1, but at the end I got the following error:
**[192.168.188.71] out: 127.0.0.1 rack1 Up Normal 102.81 KB 100.00% 8938400078857263027
[192.168.188.71] out: 127.0.0.1 rack1 Up Normal 102.81 KB 100.00% 9038662740528651599
[192.168.188.71] out: 127.0.0.1 rack1 Up Normal 102.81 KB 100.00% 9091135296963166026
[192.168.188.71] out: 127.0.0.1 rack1 Up Normal 102.81 KB 100.00% 9106912787709865125
[192.168.188.71] out: 127.0.0.1 rack1 Up Normal 102.81 KB 100.00% 9111136576899851738
[192.168.188.71] out:
[192.168.188.71] out: Warning: "nodetool ring" is used to output all the tokens of a node.
[192.168.188.71] out: To view status related info of a node use "nodetool status" instead.
[192.168.188.71] out:
[192.168.188.71] out:
[192.168.188.71] out:
[192.168.188.71] Node is not up (yet): 192.168.188.71
[192.168.188.71] waiting 10 seconds to try again..
Fatal error: Timed out waiting for all nodes to startup
Aborting.
WARNING:benchmark:'NoneType' object has no attribute 'split'
Fatal error: Cassandra is not up!
Aborting.**
It seems that the Cassandra instance launched on cnode1 is listening on 127.0.0.1 rather than the static IP address 192.168.188.71. Is this the cause of the failure? If so, how do I tell the Cassandra node to listen on 192.168.188.71?
Can you make sure that /etc/hosts is properly configured and that hostname resolution works? Looking through the logic in https://github.com/datastax/cstar_perf/blob/master/tool/cstar_perf/tool/benchmark.py might also help you debug the problem.
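One way to check hostname resolution from Python is sketched below. This is only an illustration, not part of cstar_perf: `check_host` and its `resolver` parameter are hypothetical names, and the code assumes Python 3 (for the standard `ipaddress` module). It reports whether a name resolves at all, whether it resolves to a loopback address such as 127.0.0.1, and whether it matches the address you expect.

```python
import ipaddress
import socket


def is_loopback(address):
    """Return True if the IP address string is a loopback address (e.g. 127.0.0.1)."""
    return ipaddress.ip_address(address).is_loopback


def check_host(hostname, expected_ip, resolver=socket.gethostbyname):
    """Describe how `hostname` resolves compared with `expected_ip`.

    `resolver` defaults to the system resolver, which normally consults
    /etc/hosts; it is a parameter only so the check can be tried offline.
    """
    try:
        resolved = resolver(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return "%s: does not resolve (%s)" % (hostname, exc)
    if is_loopback(resolved):
        return "%s: resolves to loopback address %s" % (hostname, resolved)
    if resolved != expected_ip:
        return "%s: resolves to %s, expected %s" % (hostname, resolved, expected_ip)
    return "%s: ok (%s)" % (hostname, resolved)
```

Running `print(check_host("cnode1", "192.168.188.11"))` on each machine would show whether the node name maps to the static address or to a loopback address.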
My /etc/hosts file has the following content on both cstress1 and cnode1 (my Linux box is CentOS 7):
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.188.10 stress1
192.168.188.11 cnode1
Hostname resolution works on both nodes.
Note: the IP address in my first post is 192.168.188.71 rather than 192.168.188.10 because I tried to install this tool on two clusters; the results are the same on both.
I also looked through benchmark.py but found no clues.
I still think the direct cause of the failure could be that 'listen_address' is set to 'localhost' in the cassandra.yaml file on cnode1 rather than 192.168.188.11, but I don't know how to get cstar_perf to set 'listen_address' on cnode1 properly.
Any comments would be appreciated!
@csplinter @mshuler any ideas?
tool/cstar_perf/tool/fab_common.py sets cass_yaml['listen_address'] = cfg['internal_ip'], and I looked at one of our running clusters, which looks really similar to the yaml posted above. I don't have any hosts entries at all. Since this started out with an old code install, were both the frontend and client source updated to current master, @zhaogxd? Just trying to rule out the obvious. In the worst case, it might take starting fresh and working through the steps without old code, table data, etc. around.
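To see which listen_address actually ended up in the cassandra.yaml on a node, a small check along these lines could help. This is a rough sketch under stated assumptions: `find_setting` is a hypothetical helper, it only understands simple top-level `key: value` lines rather than full YAML, and the sample text below is made up for illustration.

```python
import re


def find_setting(yaml_text, key):
    """Return the value of a top-level `key: value` line.

    Returns None if the key is absent, commented out, or has no value.
    This is a line-based scan, not a real YAML parser.
    """
    pattern = re.compile(
        r"^%s:[ \t]*(.*?)[ \t]*(?:#.*)?$" % re.escape(key), re.MULTILINE
    )
    match = pattern.search(yaml_text)
    if match is None:
        return None
    # Drop surrounding quotes; treat an empty value as "not set".
    return match.group(1).strip("'\"") or None


# Made-up sample resembling a cassandra.yaml fragment.
sample = (
    "cluster_name: 'Test Cluster'\n"
    "# listen_address: 192.168.188.11\n"
    "listen_address: localhost\n"
)
print(find_setting(sample, "listen_address"))
```

If the value printed for the real file is `localhost` or 127.0.0.1 rather than the node's internal_ip, the generated configuration did not pick up the address from cluster_config.json.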
I just ran 'cstar_perf_bootstrap -v apache/cassandra-2.1' on a fresh CentOS 7, 2-node cluster with the following cluster_config.json, and everything came up as expected using the internal_ip. @zhaogxd I am not sure what is causing your problem, but I would double-check your cluster_config.json, located at .cstar_perf/cluster_config.json, to make sure the internal_ip is set to the static IP address you want and not 127.0.0.1.
{
    "user": "chris",
    "cluster_name": "centos7_cluster",
    "product": "cassandra",
    "saved_caches_directory": "/var/lib/cassandra/saved_caches",
    "commitlog_directory": "/var/lib/cassandra/commitlog",
    "log_dir": "/var/log/cassandra",
    "data_file_directories": ["/var/lib/cassandra/data"],
    "block_devices": ["/dev/vda1"],
    "blockdev_readahead": "8192",
    "hosts": {
        "ip-10-200-179-220": {
            "internal_ip": "10.200.179.220",
            "hostname": "ip-10-200-179-220",
            "seed": "true"
        }
    }
}
...
[10.200.179.220] out: 10.200.179.220 rack1 Up Normal 51.67 KB 100.00% 8619020829015697437
[10.200.179.220] out: 10.200.179.220 rack1 Up Normal 51.67 KB 100.00% 8671272831602522107
[10.200.179.220] out: 10.200.179.220 rack1 Up Normal 51.67 KB 100.00% 8700554652956383716
[10.200.179.220] out: 10.200.179.220 rack1 Up Normal 51.67 KB 100.00% 8780708402645289080
[10.200.179.220] out: 10.200.179.220 rack1 Up Normal 51.67 KB 100.00% 9012052654952196740
[10.200.179.220] out: 10.200.179.220 rack1 Up Normal 51.67 KB 100.00% 9054345547042666849
[10.200.179.220] out: 10.200.179.220 rack1 Up Normal 51.67 KB 100.00% 9138156284636759154
[10.200.179.220] out:
[10.200.179.220] out: Warning: "nodetool ring" is used to output all the tokens of a node.
[10.200.179.220] out: To view status related info of a node use "nodetool status" instead.
[10.200.179.220] out:
[10.200.179.220] out:
[10.200.179.220] out:
[10.200.179.220] All nodes available!
INFO:benchmark:Started cassandra on 1 nodes with git SHAs: {u'ip-10-200-179-220': 'cb14186f8d6c2d1105a51e409c59a4e424958171', 'chris@ip-10-200-179-22': 'cb14186f8d6c2d1105a51e409c59a4e424958171'}
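The suggestion to double-check internal_ip in cluster_config.json can be automated with a short script. The following is a minimal sketch, assuming Python 3 and a config shaped like the one above; `find_suspect_hosts` is a hypothetical helper and not part of cstar_perf.

```python
import ipaddress
import json


def find_suspect_hosts(config):
    """Return (host, internal_ip, reason) tuples for hosts whose internal_ip looks wrong."""
    problems = []
    for name, host in config.get("hosts", {}).items():
        ip = host.get("internal_ip")
        if ip is None:
            problems.append((name, ip, "missing internal_ip"))
            continue
        try:
            parsed = ipaddress.ip_address(ip)
        except ValueError:
            # e.g. a hostname such as "localhost" instead of an address
            problems.append((name, ip, "not an IP address"))
            continue
        if parsed.is_loopback:
            problems.append((name, ip, "loopback address"))
    return problems


# Example input with a deliberately wrong internal_ip.
example = json.loads(
    '{"hosts": {"cnode1": {"internal_ip": "127.0.0.1", "hostname": "cnode1"}}}'
)
for name, ip, reason in find_suspect_hosts(example):
    print("%s: internal_ip=%r (%s)" % (name, ip, reason))
```

To run it against a real file, load the config with `json.load(open(path))`, where `path` points at your own cluster_config.json, and pass the result to `find_suspect_hosts`. An empty list means no host uses a missing, non-numeric, or loopback internal_ip.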
@zhaogxd any luck?
Hi Eduard,
Thanks for the follow-up! I have been busy with something else recently and had to give this test a lower priority. I will continue testing this tool when I have time and will let you know how it goes.
Have a nice day!
Guang
I have been following the steps in "Setup cstar_perf.tool" to set up a test cluster. The following error occurs when running "cstar_perf_bootstrap apache/cassandra-2.1".
[cnode1] Executing task 'start'
!!! Parallel execution exception under host u'cnode1':
Process cnode1:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib64/python2.7/site-packages/fabric/tasks.py", line 242, in inner
    submit(task.run(*args, **kwargs))
  File "/usr/lib64/python2.7/site-packages/fabric/tasks.py", line 174, in run
    return self.wrapped(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/fabric/decorators.py", line 181, in inner
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/cstar_perf/tool/fab_cassandra.py", line 415, in start
    cfg = config['hosts'][fab.env.host]
KeyError: u'192.168.188.11'
Fatal error: One or more hosts failed while executing task 'start'
Underlying exception: u'192.168.188.11'
Aborting.
My cluster_config.json file is created as below:
{
    "commitlog_directory": "/mnt/d1/commitlog",
    "data_file_directories": [
        "/mnt/d2/data",
        "/mnt/d3/data",
        "/mnt/d4/data"
    ],
    "block_devices": [
        "/dev/sda"
    ],
    "blockdev_readahead": "8192",
    "hosts": {
        "cnode1": {
            "internal_ip": "192.168.188.11",
            "hostname": "cnode1",
            "seed": true
        }
    },
    "user": "hadoopuser",
    "name": "mycluster",
    "saved_caches_directory": "/mnt/d2/saved_caches"
}
Any suggestions?
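The traceback is consistent with the "hosts" mapping being keyed by name ("cnode1") while the lookup key at that point was the IP address. The sketch below reproduces that mismatch with a plain dictionary. `lookup_host` is a hypothetical helper written only to illustrate the failure; it is not cstar_perf's code and says nothing about how newer versions of fab_cassandra.py handle this.

```python
# Hypothetical illustration of the KeyError; not cstar_perf's actual code.
config = {
    "hosts": {
        "cnode1": {
            "internal_ip": "192.168.188.11",
            "hostname": "cnode1",
            "seed": True,
        }
    }
}


def lookup_host(config, host):
    """Find a host entry by its key, its internal_ip, or its hostname.

    Raises KeyError if nothing matches, like a plain dict lookup would.
    """
    hosts = config["hosts"]
    if host in hosts:
        return hosts[host]
    for cfg in hosts.values():
        if host in (cfg.get("internal_ip"), cfg.get("hostname")):
            return cfg
    raise KeyError(host)


# A direct lookup by IP fails because the only key is "cnode1".
try:
    config["hosts"]["192.168.188.11"]
except KeyError as exc:
    print("KeyError:", exc)

# The tolerant lookup finds the same entry by either name or IP.
print(lookup_host(config, "192.168.188.11")["hostname"])
```

A lookup that accepts either form avoids depending on whether the caller passes a hostname or an address, at the cost of a linear scan over the configured hosts.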