In status:
Unable to query docker version: Unable to read TLS config: open /home/kakos/.docker/machine/machines/g5k-hadoop-consul/server.pem: no such file or directory
Did you modify DRIVER_OPTS in the g5k_cluster file?
DRIVER_OPTS=$(echo "--generic-ssh-key=/home/bzhang/.ssh/id_rsa --generic-ssh-user=root --generic-ssh-port=22")
The "bzhang" in that path should also be changed.
Yes, sure, it is kakos now. If I try to create the cluster again:
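A minimal sketch of one way to avoid the hard-coded user name, assuming the private key sits at the default ~/.ssh/id_rsa location (SSH_KEY is a name introduced here for illustration; the driver flags are the ones already used above):

```shell
# Build DRIVER_OPTS from the current user's home directory instead of a
# hard-coded /home/<user> path.
# Assumption: the private key is ~/.ssh/id_rsa.
SSH_KEY="${HOME}/.ssh/id_rsa"

# Warn early if the key is missing or unreadable.
if [ ! -r "${SSH_KEY}" ]; then
  echo "warning: ${SSH_KEY} is not readable" >&2
fi

DRIVER_OPTS="--generic-ssh-key=${SSH_KEY} --generic-ssh-user=root --generic-ssh-port=22"
echo "${DRIVER_OPTS}"
```

With this form the same g5k_cluster file should work for any account without editing the path.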
generic driver does not support start
I have one question. For the machines you got from Grid5000, which OS did you install? jessie-x64-base (Debian Jessie)?
Yes, as written in the README.
I just tried it. I repeated create_cluster and sometimes hit a similar problem. It seems to happen randomly.
Please update docker-machine to version 0.6.0 using the command below:
$ curl -L https://github.com/docker/machine/releases/download/v0.6.0/docker-machine-`uname -s`-`uname -m` > /home/bzhang/docker-machine && chmod +x /home/bzhang/docker-machine
Then please try again. I also don't understand why this happens. You must reinstall the OS on your machines to clear the information left over from the previous installation.
And do not forget to clean up your docker-machine information. :)
Just in case: if you hit the problem where it cannot create the containers and asks you to create them manually, this is caused by Bash on Grid5000.
Please change "nonexistent" to "*" in cluster.sh on line 279.
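The edit can also be scripted. This is only a sketch: patch_line is a hypothetical helper, and it assumes line 279 still holds the "nonexistent" string in your copy of cluster.sh, so print the line first with sed -n 279p cluster.sh to check.

```shell
# Replace the first "nonexistent" on one line of a file with "*".
# A .bak copy of the original file is kept next to it.
patch_line() {
  file="$1"
  line="$2"
  sed -i.bak "${line}s/nonexistent/*/" "$file"
}

# Usage (line number taken from the comment above):
# patch_line cluster.sh 279
```

Only the given line is touched, so other occurrences of "nonexistent" in the script stay as they are.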
Sorry for the typo; the command to update docker-machine should be:
$ curl -L https://github.com/docker/machine/releases/download/v0.6.0/docker-machine-`uname -s`-`uname -m` > /home/bzhang/docker-machine && chmod +x /home/bzhang/docker-machine
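Since the backticks in this command are easy to lose when copying, here is a sketch that builds the same URL with $(...) instead. docker_machine_url is a hypothetical helper name; the URL pattern is the one from the command above.

```shell
# Print the docker-machine release URL for a given version.
# The OS and architecture default to what uname reports on this host.
docker_machine_url() {
  version="$1"
  os="${2:-$(uname -s)}"
  arch="${3:-$(uname -m)}"
  echo "https://github.com/docker/machine/releases/download/${version}/docker-machine-${os}-${arch}"
}

# Show the URL that would be downloaded on this machine.
docker_machine_url v0.6.0

# To download and install into your home directory:
# curl -L "$(docker_machine_url v0.6.0)" > "${HOME}/docker-machine" && chmod +x "${HOME}/docker-machine"
```

Printing the URL first makes it easy to see whether the OS and architecture parts were filled in correctly before downloading anything.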
There are some problems with how the comments are displayed. I have updated the README. Please check it. :)
Thanks again! Is my problem maybe with my oarsub command? I have read some more tutorials and I am uncertain.
No, in my opinion the oarsub command is only used by Grid5000 to reserve machines. These problems concern Docker and docker-machine. I checked Stack Overflow and GitHub: these problems always seem to happen randomly. I think they are probably caused by Docker and docker-machine with their 'generic' driver.
Is there any way to clear the docker-machine machines? It throws an error again, and the destroy-cluster flag isn't working.
Right, destroy_cluster doesn't work on Grid5000. You can list the machines with 'docker-machine ls' and then delete each one with 'docker-machine rm (machine name)'.
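If there are many machines, the two commands can be combined. The following is a sketch, not part of cluster.sh: remove_all_machines and the DM variable are names made up here, and it assumes your docker-machine version supports 'ls -q' and 'rm -f' (check 'docker-machine rm --help').

```shell
# Remove every machine that docker-machine knows about.
# DM can be set to another command for a dry run; it defaults to docker-machine.
remove_all_machines() {
  dm="${DM:-docker-machine}"
  # "ls -q" prints machine names only, one per line.
  "$dm" ls -q | while read -r name; do
    if [ -n "$name" ]; then
      "$dm" rm -f "$name"
    fi
  done
}

# Usage:
# remove_all_machines
```

Note that this deletes all machines, not only the ones created by cluster.sh, so run 'docker-machine ls' first if you have other machines you want to keep.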
Great! But there is another error: Waiting for SSH to be available... Error creating machine: Error detecting OS: Too many retries waiting for SSH to be available. Last error: Maximum number of retries (60) exceeded
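One way to narrow this down is to check by hand that root SSH login works on each node before running create-cluster. The sketch below is a generic retry loop; retry is a hypothetical helper, and the host name in the usage line is one of the nodes mentioned later in this thread.

```shell
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage: try a non-interactive root login 10 times, 5 seconds apart.
# retry 10 5 ssh -o BatchMode=yes -o ConnectTimeout=5 root@sagittaire-9.lyon.grid5000.fr true
```

If this never succeeds, the problem is probably with the node or the SSH key (for example the OS was not redeployed with your public key) rather than with docker-machine itself.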
I just tried on the Lyon site, cluster "sagittaire". It works very well. Did you reinstall the OS with the "kadeploy3" command?
Is the reinstall command different from the install command? It worked the first time, but now I cannot run it.
It's the same. You should re-run it in the terminal where you ran the 'oarsub' command.
Yes, I know, but now it says
You do not have sufficient rights to perform the operation on all the nodes [Kadeploy Error #6]
and I did not save the original command. I found it earlier but forgot the exact method. Now I am trying: kadeploy3 -e jessie-x64-base -m sagittaire-[9,43,74].lyon.grid5000.fr
Maybe you should use this command: "kadeploy3 -e jessie-x64-base -f $OAR_FILE_NODES -k ~/.ssh/id_rsa.pub"
This is much better than specifying the machine names directly.
Two, maybe last, things:
echo "$OAR_FILE_NODES" gives empty output, but I can see the previously mentioned nodes.
The file given to -k cannot be read, but I can cat the file.
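An empty $OAR_FILE_NODES usually means the shell is not the one opened by oarsub, since the variable is only set inside the job. A small sketch to check this before calling kadeploy3 (check_nodes_file is a hypothetical helper; it only reads the variable and the file):

```shell
# Check that OAR_FILE_NODES is set and readable, then list the nodes.
check_nodes_file() {
  if [ -z "${OAR_FILE_NODES:-}" ]; then
    echo "OAR_FILE_NODES is not set: this shell is probably not the one opened by oarsub"
    return 1
  fi
  if [ ! -r "${OAR_FILE_NODES}" ]; then
    echo "cannot read ${OAR_FILE_NODES}"
    return 1
  fi
  # The file may list the same host several times, so de-duplicate.
  sort -u "${OAR_FILE_NODES}"
}

# Usage:
# check_nodes_file && kadeploy3 -e jessie-x64-base -f "$OAR_FILE_NODES" -k ~/.ssh/id_rsa.pub
```

If the check fails, go back to the terminal where oarsub was run (or reconnect to the job) and run kadeploy3 from there.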
I am on Grid5000 and have a question again. I got the error message above for the command:
CONFIG=g5k_cluster ./cluster.sh create-cluster
I have already set the IP addresses and added my home directory to PATH: /usr/local/bin:/usr/bin:/bin:/grid5000/code/bin:/home/kakos
How should I fix this?