goerz / gc3pie

Automatically exported from code.google.com/p/gc3pie

gkill -A not killing all VM's #406

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. gkill -s SESSION_NAME -A

What is the expected output? What do you see instead?
Expecting all the jobs to be killed and VMs to be stopped. What
happened instead is that a certain number of jobs and VMs were killed and
the others kept going until we ran the gkill command again.

What version of the product are you using? On what operating system?
latest as of 22/7/13; Ubuntu 13.04.

Please provide any additional information below.

Original issue reported on code.google.com by joelfiddes on 23 Jul 2013 at 1:24

GoogleCodeExporter commented 9 years ago
Hi Joel,

I've run some tests, and I would like to narrow down the scope of this
bug.  To my understanding, the current situation with `gkill` is this
one:

* jobs in state NEW remain in state NEW (hence, they will be submitted
  if you run `gtsub_control` on the same session once again)

* jobs in any other state are correctly terminated.

Does this describe exactly the problem you're seeing?

Thanks,
Riccardo

Original comment by riccardo.murri@gmail.com on 25 Jul 2013 at 1:27

GoogleCodeExporter commented 9 years ago
Hi Riccardo,

not exactly - we saw that running gkill -s -A would not shut down VMs in
RUNNING state. I have just run gkill (10.45am) on 10 running VMs (3 in
finished state) and it seems to be shutting them all down. As of 12.02pm I have
4 VMs still running (but I guess they will also shut down). Is the behaviour we
observed just a slow shutdown process then? We did not test the
behaviour on VMs in NEW state.

Maybe Tyanko can correct me if I describe this incorrectly.

Cheers,

Joel

currentSim.txt   gst200/          sd200/           src_master/
currentSim.txt~  gst200_8411/     spatial/         tair200/
(gc3pie)joel@joel-ThinkPad-E520:~$ gkill -s ~/sim/gst200_8411/ -A
Sent request to cancel job 'GTSubControllApplication.410'.
Sent request to cancel job 'GTSubControllApplication.400'.
Sent request to cancel job 'GTSubControllApplication.405'.
Sent request to cancel job 'GTSubControllApplication.401'.
Sent request to cancel job 'GTSubControllApplication.404'.
Sent request to cancel job 'GTSubControllApplication.403'.
Failed canceling job 'GTSubControllApplication.411': Job 'GTSubControllApplication.411' is already in terminal state
Failed canceling job 'GTSubControllApplication.409': Job 'GTSubControllApplication.409' is already in terminal state
Failed canceling job 'GTSubControllApplication.407': Job 'GTSubControllApplication.407' is already in terminal state
Sent request to cancel job 'GTSubControllApplication.408'.
Sent request to cancel job 'GTSubControllApplication.399'.
Sent request to cancel job 'GTSubControllApplication.402'.
Sent request to cancel job 'GTSubControllApplication.406'.
(gc3pie)joel@joel-ThinkPad-E520:~$ gloud list
gloud: command not found
(gc3pie)joel@joel-ThinkPad-E520:~$ gcloud list

====================================
VMs running on EC2 resource `hobbes`
====================================

+------------+---------+--------------+-------------+--------------+--------------+---------+
|     id     |  state  |  public ip   | Nr. of jobs | Nr. of cores |   image id   | keypair |
+------------+---------+--------------+-------------+--------------+--------------+---------+
| i-000055a8 | running | 130.60.24.73 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a5 | running | 130.60.24.43 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a7 | running | 130.60.24.66 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a6 | running | 130.60.24.44 |      0      |      8       | ami-00000085 |   joel  |
| i-000055aa | running | 130.60.24.75 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ac | running | 130.60.24.79 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ab | running | 130.60.24.78 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ae | running | 130.60.24.81 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ad | running | 130.60.24.80 |      0      |      8       | ami-00000085 |   joel  |
| i-000055af | running | 130.60.24.82 |      0      |      8       | ami-00000085 |   joel  |
+------------+---------+--------------+-------------+--------------+--------------+---------+
(gc3pie)joel@joel-ThinkPad-E520:~$ gcloud list

====================================
VMs running on EC2 resource `hobbes`
====================================

+------------+---------+--------------+-------------+--------------+--------------+---------+
|     id     |  state  |  public ip   | Nr. of jobs | Nr. of cores |   image id   | keypair |
+------------+---------+--------------+-------------+--------------+--------------+---------+
| i-000055a8 | running | 130.60.24.73 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a5 | running | 130.60.24.43 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a7 | running | 130.60.24.66 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a6 | running | 130.60.24.44 |      0      |      8       | ami-00000085 |   joel  |
| i-000055aa | running | 130.60.24.75 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ac | running | 130.60.24.79 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ab | running | 130.60.24.78 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ae | running | 130.60.24.81 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ad | running | 130.60.24.80 |      0      |      8       | ami-00000085 |   joel  |
| i-000055af | running | 130.60.24.82 |      0      |      8       | ami-00000085 |   joel  |
+------------+---------+--------------+-------------+--------------+--------------+---------+
(gc3pie)joel@joel-ThinkPad-E520:~$ gcloud list

====================================
VMs running on EC2 resource `hobbes`
====================================

+------------+---------+--------------+-------------+--------------+--------------+---------+
|     id     |  state  |  public ip   | Nr. of jobs | Nr. of cores |   image id   | keypair |
+------------+---------+--------------+-------------+--------------+--------------+---------+
| i-000055a8 | running | 130.60.24.73 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a5 | running | 130.60.24.43 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a7 | running | 130.60.24.66 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a6 | running | 130.60.24.44 |      0      |      8       | ami-00000085 |   joel  |
| i-000055aa | running | 130.60.24.75 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ac | running | 130.60.24.79 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ae | running | 130.60.24.81 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ad | running | 130.60.24.80 |      0      |      8       | ami-00000085 |   joel  |
| i-000055af | running | 130.60.24.82 |      0      |      8       | ami-00000085 |   joel  |
+------------+---------+--------------+-------------+--------------+--------------+---------+
(gc3pie)joel@joel-ThinkPad-E520:~$ gcloud list

====================================
VMs running on EC2 resource `hobbes`
====================================

+------------+---------+--------------+-------------+--------------+--------------+---------+
|     id     |  state  |  public ip   | Nr. of jobs | Nr. of cores |   image id   | keypair |
+------------+---------+--------------+-------------+--------------+--------------+---------+
| i-000055a5 | running | 130.60.24.43 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ae | running | 130.60.24.81 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ad | running | 130.60.24.80 |      0      |      8       | ami-00000085 |   joel  |
| i-000055af | running | 130.60.24.82 |      0      |      8       | ami-00000085 |   joel  |
+------------+---------+--------------+-------------+--------------+--------------+---------+
(gc3pie)joel@joel-ThinkPad-E520:~$

Original comment by joelfiddes on 26 Jul 2013 at 10:08

GoogleCodeExporter commented 9 years ago
Hi Joel,

| not exactly - we saw that running gkill -s -A would not shut down VMs in
| RUNNING state. I have just run gkill (10.45am) on 10 running VMs (3 in
| finished state) and it seems to be shutting them all down. As of 12.02pm I have
| 4 VMs still running (but I guess they will also shut down). Is the behaviour we
| observed just a slow shutdown process then? We did not test the
| behaviour on VMs in NEW state.

Wait, wait :-)

`gkill` kills *jobs*; it does not touch VMs.  Indeed, after running
`gkill` you can see that none of your VMs is running a job ("Nr. of jobs"
is 0 for all VMs according to `gcloud list`).

VMs do not shut down automatically, because they could be "recycled"
to run another job. To shut down VMs, you need to:

- either terminate them with the `gcloud terminate` command, e.g.,
  `gcloud terminate i-00001234`;

- or use the `gcloud cleanup` command, which terminates all VMs that
  are not currently running any job (you need to upgrade to the latest
  version of GC3Pie, as I implemented it yesterday);

- or you let the GTSub script decide which VMs to terminate: it will
  do that automatically when there are no more jobs to be run.
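
For the first option, it can help to see which VMs are idle before
terminating anything. The following is only a rough sketch, not part of
GC3Pie: it assumes the `gcloud list` table layout pasted earlier in this
thread (instance id in the first column, "Nr. of jobs" in the fourth), and
the function name `idle_vm_ids` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `gcloud list` output on stdin and print the ids
# of VMs whose "Nr. of jobs" column is 0.
# Assumes the table layout shown in this thread; adjust if yours differs.
idle_vm_ids() {
    awk -F'|' '
        # Data rows carry an instance id (i-...) in the first table column;
        # border and header rows do not match and are skipped.
        $2 ~ /^ *i-[0-9a-f]+ *$/ {
            id = $2; jobs = $5
            gsub(/ /, "", id)      # strip the cell padding
            gsub(/ /, "", jobs)
            if (jobs == "0") print id
        }'
}

# Intended use (review the list, then terminate each VM explicitly):
#   gcloud list | idle_vm_ids
#   gcloud terminate i-00001234
```

The sketch only prints ids and leaves the actual `gcloud terminate` call to
you, so that nothing is shut down by accident.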

Now, is your issue with jobs or with VMs? Or both?

Original comment by riccardo.murri@gmail.com on 26 Jul 2013 at 10:18

GoogleCodeExporter commented 9 years ago
Hi Riccardo,

Your explanation clears it up - we can close this ticket as user 
misunderstanding. Unless Tyanko has an objection.

Cheers,

Joel

Original comment by joelfiddes on 26 Jul 2013 at 11:57

GoogleCodeExporter commented 9 years ago
No objections from my side.

Ciao,
Tyanko

Original comment by tyanko.a...@gmail.com on 26 Jul 2013 at 12:13

GoogleCodeExporter commented 9 years ago
Agreed that this is a non-issue, then :-)

Original comment by riccardo.murri@gmail.com on 13 Aug 2013 at 8:59