msimonin / vagrant-g5k

Hacking around vagrant and g5k
MIT License

Error when starting VM's on lyon #10

Closed Brandonage closed 7 years ago

Brandonage commented 7 years ago

Hi:

I get the following error for only some of the jobs I'm trying to submit with vagrant-g5k. I updated the plugin recently and this is the first time I've seen this error, so maybe it's connected. It seems like KVM is not accessible on some machines?

==> test-2: + SMP=3
==> test-2: + echo 'SMP = 3'
==> test-2: + shift
==> test-2: + KEEP_SYSTEM_MEM=1
==> test-2: + '[' 10240 == -1 ']'
==> test-2: + VM_MEM=10240
==> test-2: + echo 'VM_MEM = 10240'
==> test-2: + shift
==> test-2: + net=
==> test-2: + '[' BRIDGE == BRIDGE ']'
==> test-2: + shift
==> test-2: ++ net_bridge .vagrant/test-vagrant-g5k/subnet -drive file=/home/abrandon/public/centos_7.2_dcos.qcow2,if=virtio -snapshot
==> test-2: ++ SUBNET_FILE=.vagrant/test-vagrant-g5k/subnet
==> test-2: ++ ipnumber=71
==> test-2: +++ cat .vagrant/test-vagrant-g5k/subnet
==> test-2: +++ tail -n 1
==> test-2: +++ head -n 72
==> test-2: ++ IP_MAC='10.140.72.72 00:16:3E:8C:48:48'
==> test-2: +++ echo 10.140.72.72 00:16:3E:8C:48:48
==> test-2: +++ awk '{print $1}'
==> test-2: ++ IP_ADDR=10.140.72.72
==> test-2: +++ echo 10.140.72.72 00:16:3E:8C:48:48
==> test-2: +++ awk '{print $2}'
==> test-2: ++ MAC_ADDR=00:16:3E:8C:48:48
==> test-2: +++ sudo create_tap
==> test-2: ++ TAP=tap0
==> test-2: ++ echo '-net nic,model=virtio,macaddr=00:16:3E:8C:48:48 -net tap,ifname=tap0,script=no'
==> test-2: + net='-net nic,model=virtio,macaddr=00:16:3E:8C:48:48 -net tap,ifname=tap0,script=no'
==> test-2: ++ hostname
==> test-2: + echo sagittaire-48.lyon.grid5000.fr
==> test-2: + echo -net nic,model=virtio,macaddr=00:16:3E:8C:48:48 -net tap,ifname=tap0,script=no
==> test-2: + shift
==> test-2: + export TMPDIR=/tmp
==> test-2: + TMPDIR=/tmp
==> test-2: + trap clean_shutdown 12
==> test-2: + wait
==> test-2: + kvm -m 10240 -smp cores=3,threads=1,sockets=1 -fsdev local,security_model=none,id=fsdev0,path=/home/abrandon -device virtio-9p-pci,id=fs0,fsdev=fsdev0,mount_tag=hostshare -nographic -monitor unix:/tmp/vagrant-g5k.875925.mon,server,nowait -localtime -enable-kvm -net nic,model=virtio,macaddr=00:16:3E:8C:48:48 -net tap,ifname=tap0,script=no -drive file=/home/abrandon/public/centos_7.2_dcos.qcow2,if=virtio -snapshot
==> test-2: Could not access KVM kernel module: No such file or directory
==> test-2: failed to initialize KVM: No such file or directory
==> test-2: OAR.test-2.875925.stderr:  + '[' 3 == -1 ']'
==> test-2: + SMP=3
==> test-2: + echo 'SMP = 3'
==> test-2: + shift
==> test-2: + KEEP_SYSTEM_MEM=1
==> test-2: + '[' 10240 == -1 ']'
==> test-2: + VM_MEM=10240
==> test-2: + echo 'VM_MEM = 10240'
==> test-2: + shift
==> test-2: + net=
==> test-2: + '[' BRIDGE == BRIDGE ']'
==> test-2: + shift
==> test-2: ++ net_bridge .vagrant/test-vagrant-g5k/subnet -drive file=/home/abrandon/public/centos_7.2_dcos.qcow2,if=virtio -snapshot
==> test-2: ++ SUBNET_FILE=.vagrant/test-vagrant-g5k/subnet
==> test-2: ++ ipnumber=71
==> test-2: +++ cat .vagrant/test-vagrant-g5k/subnet
==> test-2: +++ tail -n 1
==> test-2: +++ head -n 72
==> test-2: ++ IP_MAC='10.140.72.72 00:16:3E:8C:48:48'
==> test-2: +++ echo 10.140.72.72 00:16:3E:8C:48:48
==> test-2: +++ awk '{print $1}'
==> test-2: ++ IP_ADDR=10.140.72.72
==> test-2: +++ echo 10.140.72.72 00:16:3E:8C:48:48
==> test-2: +++ awk '{print $2}'
==> test-2: ++ MAC_ADDR=00:16:3E:8C:48:48
==> test-2: +++ sudo create_tap
==> test-2: ++ TAP=tap0
==> test-2: ++ echo '-net nic,model=virtio,macaddr=00:16:3E:8C:48:48 -net tap,ifname=tap0,script=no'
==> test-2: + net='-net nic,model=virtio,macaddr=00:16:3E:8C:48:48 -net tap,ifname=tap0,script=no'
==> test-2: ++ hostname
==> test-2: + echo sagittaire-48.lyon.grid5000.fr
==> test-2: + echo -net nic,model=virtio,macaddr=00:16:3E:8C:48:48 -net tap,ifname=tap0,script=no
==> test-2: + shift
==> test-2: + export TMPDIR=/tmp
==> test-2: + TMPDIR=/tmp
==> test-2: + trap clean_shutdown 12
==> test-2: + wait
==> test-2: + kvm -m 10240 -smp cores=3,threads=1,sockets=1 -fsdev local,security_model=none,id=fsdev0,path=/home/abrandon -device virtio-9p-pci,id=fs0,fsdev=fsdev0,mount_tag=hostshare -nographic -monitor unix:/tmp/vagrant-g5k.875925.mon,server,nowait -localtime -enable-kvm -net nic,model=virtio,macaddr=00:16:3E:8C:48:48 -net tap,ifname=tap0,script=no -drive file=/home/abrandon/public/centos_7.2_dcos.qcow2,if=virtio -snapshot
==> test-2: Could not access KVM kernel module: No such file or directory
==> test-2: failed to initialize KVM: No such file or directory
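
The last two lines of the trace suggest that the KVM device node is missing on the host (sagittaire-48 here). As a minimal sketch, assuming the usual `/dev/kvm` device path, a check like the following could be run on a reserved node before starting a VM. The helper name `kvm_usable?` is hypothetical and is not part of vagrant-g5k.

```ruby
# Hypothetical helper (not part of vagrant-g5k): report whether the KVM
# device node exists and is readable and writable by the current user.
# The default path /dev/kvm is an assumption about the node's setup.
def kvm_usable?(device = "/dev/kvm")
  File.exist?(device) && File.readable?(device) && File.writable?(device)
end

puts(kvm_usable? ? "KVM looks usable" : "KVM device missing or not accessible")
```

This only inspects the device file; it does not verify that hardware virtualization is enabled in the node's firmware.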

msimonin commented 7 years ago

Yes, exactly: not all nodes in Lyon have KVM enabled. Adding the following to the g5k provider options will prevent vagrant-g5k from reserving those nodes:

g5k.oar = "virtual != 'none'"
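
For context, here is a sketch of where that line might sit in a Vagrantfile. Only the `g5k.oar` line comes from this thread. The surrounding block uses the standard Vagrant provider syntax, the provider name `"g5k"` is assumed, and every other provider setting (site, image, and so on) is left out.

```ruby
# Sketch of a Vagrantfile fragment. Only the g5k.oar line is taken from
# this thread; the provider name "g5k" is an assumption, and all other
# required provider settings are intentionally omitted.
Vagrant.configure(2) do |config|
  config.vm.provider "g5k" do |g5k|
    # OAR property filter: skip nodes without virtualization support.
    g5k.oar = "virtual != 'none'"
  end
end
```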

Brandonage commented 7 years ago

That's the weird thing. I double-checked that I had that line in my Vagrantfile, but it still tries to reserve these kinds of nodes. I reserved some nodes through a container job last night, and that's when it started doing this.

msimonin commented 7 years ago

Have you put the same constraint for the container job (virtual != 'none')?

Brandonage commented 7 years ago

Sorry, I didn't explain myself clearly.

For the container job, I made the reservation with (virtual != 'none'). That worked fine for that job.

The day after, I wanted to make a normal reservation without a container job, and I made sure the g5k.oar = "virtual != 'none'" line was in the Vagrantfile. That is when I started to get the errors.

Brandonage commented 7 years ago

I also have some problems in Rennes. It won't give me any resources for the jobs, even though plenty are available.

==> test-13: Waiting for the job to be running
==> test-5: Waiting for the job to be running
==> test-3: Waiting for the job to be running
==> test-8: Waiting for the job to be running
==> test-10: Waiting for the job to be running

Same situation: I launched a container job on this frontend yesterday.

Brandonage commented 7 years ago

I'm closing this one since it was probably related to #11.