Open adnavare opened 5 years ago
I haven't used static machines recently so I don't have much advice there, but I took a look at the log and think it may be blocking on a different line.
I see the log line of: 2018-12-05 14:44:54,182 d472e2a2 Thread-30 cassandra_ycsb(1/1) vm_util.py:348 DEBUG Ran: {ssh -A -p 22 ... ... mkdir -p /opt/pkb/ycsb && curl -L https://github.com/brianfrankcooper/YCSB/releases/download/0.9.0/ycsb-0.9.0.tar.gz | tar -C /opt/pkb/ycsb --strip-components=1 -xzf -} ReturnCode:0
This log line means that this command 'ran', so I suspect the download completed. Then, about 14 minutes later, there is a SIGINT. There is no intermediate logging.
Earlier, on the other machine, there is a log line for an SSH command that appears never to complete:
2018-12-05 14:43:03,525 d472e2a2 Thread-31 cassandra_ycsb(1/1) vm_util.py:297 INFO Running: ssh -A -p 22 ... ... mkdir -p /opt/pkb && cd /opt/pkb && wget archive.apache.org/dist/ant/binaries/apache-ant-1.9.6-bin.tar.gz && tar -zxf apache-ant-1.9.6-bin.tar.gz && ln -s /opt/pkb/apache-ant-1.9.6/ /opt/pkb/ant
What happens if you run that locally on the machine or via SSH as is done here? Does that complete?
Note that the benchmark uses multiple threads so it can set up both machines in parallel.
@s-deitz : Thanks for taking a look at it. Yes, I ran the same commands from my client machine and they succeed; I'm not sure why they don't complete within the PKB process. Also, both of these packages (YCSB 0.9.0 and apache-ant) install successfully on the same client-server setup if I run MongoDB with YCSB.
Are you able to run the whole ssh command from the machine you launched PKB on as well? I think there are three things to try:
If 2 and 3 succeed, but 1 fails, then there must be some difference between how the command runs on the client when we ssh via pkb versus when you ssh outside pkb.
It might be useful to break up the command into separate RemoteCommand invocations to see whether the wget or the tar is failing. Then you might also print out the environment variables on the client machine to see if there is a difference between your ssh and the pkb ssh.
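As a rough illustration of splitting the command up, here is a minimal Python sketch that runs each step as its own command and records the return code and output, so the failing step is visible. It is not PKB code and does not use RemoteCommand; `run_steps` and its `prefix` parameter are hypothetical names made up for this example.

```python
import subprocess


def run_steps(steps, prefix=("sh", "-c")):
    """Run each step as its own command; stop at the first non-zero exit.

    `prefix` receives the step as its final argument. The default runs the
    step in a local shell. Passing an ssh command line as the prefix would
    run it remotely (that usage is an assumption, not taken from PKB).
    """
    results = []
    for step in steps:
        proc = subprocess.run(
            list(prefix) + [step],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep stderr interleaved with stdout
            universal_newlines=True,
        )
        results.append((step, proc.returncode, proc.stdout))
        if proc.returncode != 0:
            break  # later steps depend on this one, so stop here
    return results


# Example step list for the Ant install. Each step runs in a fresh shell,
# so the `cd` has to be repeated:
#   "env | sort"
#   "mkdir -p /opt/pkb"
#   "cd /opt/pkb && wget archive.apache.org/dist/ant/binaries/apache-ant-1.9.6-bin.tar.gz"
#   "cd /opt/pkb && tar -zxf apache-ant-1.9.6-bin.tar.gz"
```

Running `env | sort` as the first step both inside and outside PKB gives two environment dumps that can be diffed.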
@s-deitz: So I tried these things
-vL https://github.com/brianfrankcooper/YCSB/releases/download/0.9.0/ycsb-0.9.0.tar.gz &> /dev/stdout | tar -C /opt/pkb/ycsb -xvzf - &> /dev/stdout STDOUT: gzip: stdin: not in gzip format
Any clue what might be going wrong?
Is Cassandra supported in 0.9.0? https://github.com/brianfrankcooper/YCSB/issues/766
I ran PKB on an Ubuntu 16.04 and an Ubuntu 18.04 instance launched by PKB and they both worked.
The flags I used were:
--benchmarks=cassandra_ycsb --num_vms=1 --cassandra_replication_factor=1 --ycsb_client_vms=1 --os_type=ubuntu1604 --machine_type=n1-standard-8 --data_disk_type=pd-ssd --gce_num_local_ssds=0
Changing ubuntu1604 to ubuntu1804 still worked.
The log line of the successful remote command that was reported as failing is:
2018-12-11 13:33:39,984 7aef774d Thread-53 cassandra_ycsb(1/1) vm_util.py:348 DEBUG Ran: {ssh -A -p 22 perfkit@107.178.211.84 -2 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o IdentitiesOnly=yes -o PreferredAuthentications=publickey -o PasswordAuthentication=no -o ConnectTimeout=5 -o GSSAPIAuthentication=no -o ServerAliveInterval=30 -o ServerAliveCountMax=10 -i /tmp/perfkitbenchmarker/runs/7aef774d/perfkitbenchmarker_keyfile mkdir -p /opt/pkb/ycsb && curl -L https://github.com/brianfrankcooper/YCSB/releases/download/0.9.0/ycsb-0.9.0.tar.gz | tar -C /opt/pkb/ycsb --strip-components=1 -xzf -} ReturnCode:0, WallTime:0:12.56s, CPU:0.02s, MaxMemory:5436kb STDOUT: STDERR: Warning: Permanently added '107.178.211.84' (ECDSA) to the list of known hosts. % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed ^M 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0^M100 605 0 605 0 0 3517 0 --:--:-- --:--:-- --:--:-- 3517 ^M 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0^M 6 326M 6 20.7M 0 0 15.2M 0 0:00:21 0:00:01 0:00:20 21.1M^M 15 326M 15 49.3M 0 0 20.8M 0 0:00:15 0:00:02 0:00:13 24.7M^M 24 326M 24 79.6M 0 0 23.7M 0 0:00:13 0:00:03 0:00:10 26.7M^M 34 326M 34 112M 0 0 25.7M 0 0:00:12 0:00:04 0:00:08 28.1M^M 44 326M 44 145M 0 0 27.2M 0 0:00:11 0:00:05 0:00:06 29.2M^M 55 326M 55 179M 0 0 28.3M 0 0:00:11 0:00:06 0:00:05 31.8M^M 65 326M 65 214M 0 0 29.2M 0 0:00:11 0:00:07 0:00:04 33.2M^M 76 326M 76 249M 0 0 29.8M 0 0:00:10 0:00:08 0:00:02 33.8M^M 86 326M 86 283M 0 0 30.3M 0 0:00:10 0:00:09 0:00:01 34.3M^M 97 326M 97 318M 0 0 30.7M 0 0:00:10 0:00:10 --:--:-- 34.5M^M100 326M 100 326M 0 0 30.8M 0 0:00:10 0:00:10 --:--:-- 34.5M
It looks like you are running on VMs in GCP. Does it work if you have PKB provision the VMs instead? If so, this may be a good way to debug the issue.
One other question: Are you running from master or at the last release? I tried from master.
@flint-dominic : I tried with 0.11.0, but it fails with a different error while downloading Apache Ant (http://archive.apache.org/dist/ant/binaries/apache-ant-1.9.6-bin.tar.gz): it throws "Connection Timeout". I tried running the command for that download manually: ssh -A -p 22 user@x.x.x.x -2 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o IdentitiesOnly=yes -o PreferredAuthentications=publickey -o PasswordAuthentication=no -o ConnectTimeout=5 -o GSSAPIAuthentication=no -o ServerAliveInterval=30 -o ServerAliveCountMax=10 -i /home/anupn/.ssh/id_rsa mkdir -p /opt/pkb && cd /opt/pkb && wget archive.apache.org/dist/ant/binaries/apache-ant-1.9.6-bin.tar.gz && tar -zxf apache-ant-1.9.6-bin.tar.gz && ln -s /opt/pkb/apache-ant-1.9.6/ /opt/pkb/ant and it works properly. Somehow, when the same command is run from PKB, it gives me a connection timeout.
@s-deitz : Why do I have to give machine_type, data_disk_type, and gce_num_local_ssds when I am using a static config file and static machines, not VMs? I am not running on VMs in GCP.
I tried from master.
@adnavare Try going to your instance and downloading the YCSB tarball: $ curl -L -O https://github.com/brianfrankcooper/YCSB/releases/download/0.9.0/ycsb-0.9.0.tar.gz and then checking whether the tarfile looks okay: $ tar -tvzf ycsb-0.9.0.tar.gz
If you don't supply the "-L" to curl, it won't follow redirects, and the downloaded file will be HTML saying "you are being redirected..."
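A quick way to tell a real tarball from a saved redirect page is to look at the first bytes of the file: gzip data starts with the two bytes 0x1f 0x8b, which is what tar is complaining about when it reports "not in gzip format". The sketch below is only an illustration; `classify_head` and `classify_file` are hypothetical helper names, and the HTML check is a loose heuristic.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def classify_head(head):
    """Classify the leading bytes of a download as 'gzip', 'html' or 'unknown'."""
    if head[:2] == GZIP_MAGIC:
        return "gzip"
    # Heuristic only: a redirect or error page usually starts with an HTML tag.
    if head.lstrip().lower().startswith((b"<!doctype", b"<html")):
        return "html"
    return "unknown"


def classify_file(path):
    """Read the first bytes of the file at `path` and classify them."""
    with open(path, "rb") as f:
        return classify_head(f.read(64))
```

For example, `classify_file("ycsb-0.9.0.tar.gz")` returning "html" would suggest that the redirect was not followed or that something in between answered with a web page.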
Do you know if you have to go through an http proxy server to get out to the internet? That could also stop you from getting the file.
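If a proxy is involved, one thing worth comparing is which proxy-related environment variables are set in an interactive ssh session versus the non-interactive one that PKB opens, since tools like curl and wget commonly read them. The following is a minimal sketch under that assumption; `proxy_settings` is a hypothetical helper, and the variable list is not exhaustive.

```python
import os

# Commonly used proxy variables; tools differ on upper vs lower case,
# so the comparison below ignores case.
PROXY_VARS = ("http_proxy", "https_proxy", "ftp_proxy", "no_proxy", "all_proxy")


def proxy_settings(environ=None):
    """Return the proxy-related entries of an environment mapping."""
    if environ is None:
        environ = os.environ
    return {
        name: value
        for name, value in environ.items()
        if name.lower() in PROXY_VARS
    }
```

Printing `proxy_settings()` in both sessions and comparing the results would show whether the proxy configuration is only present in the interactive login.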
Let me know if you need any additional help with this.
I am running Cassandra with YCSB on two different machines: the server runs Ubuntu 18.10, while the client, where YCSB will be installed, runs Ubuntu 16.04.
I run the default YCSB version for cassandra_ycsb, i.e. 0.9.0, with a static config file. Here is the command: $ ./pkb.py --benchmarks=cassandra_ycsb --benchmark_config_file=/home/anupn/baremetal-static.yaml --num_vms=1 --cassandra_replication_factor=1 --ycsb_client_vms=1
It fails while downloading the YCSB 0.9.0 tarball, but if I try downloading it outside the process, it downloads quickly. I am attaching my .yaml file as well as the log file:
pkb.log baremetal-static.txt
I waited for more than 15 minutes and then killed it. With the same setup, if I run MongoDB with YCSB, the test runs properly. I looked at the code, and it looks like the way the server and clients are prepared for MongoDB is exactly the same as for Cassandra. Pointers would be really appreciated.