Closed GoogleCodeExporter closed 9 years ago
Hi Joel,
I've run some tests, and I would like to narrow down the scope of this
bug. To my understanding, the current situation with `gkill` is this
one:
* jobs in state NEW remain in state NEW (hence, they will be submitted
if you run `gtsub_control` on the same session once again)
* jobs in any other state are correctly terminated.
Does this describe exactly the problem you're seeing?
Thanks,
Riccardo
Original comment by riccardo.murri@gmail.com
on 25 Jul 2013 at 1:27
Hi Riccardo,
not exactly - we saw that running `gkill -s -A` would not shut down VMs in
RUNNING state. I have just run gkill (10.45am) on 10 running VMs (3 in
finished state) and it seems to be shutting them all down. As of 12.02pm I have
4 VMs still running (but I guess they will also shut down). Is the behaviour we
observed just a slow shutdown process then? We did not test the
behaviour on VMs in NEW state.
Maybe Tyanko can correct me if I describe this incorrectly.
Cheers,
Joel
currentSim.txt gst200/ sd200/ src_master/
currentSim.txt~ gst200_8411/ spatial/ tair200/
(gc3pie)joel@joel-ThinkPad-E520:~$ gkill -s ~/sim/gst200_8411/ -A
Sent request to cancel job 'GTSubControllApplication.410'.
Sent request to cancel job 'GTSubControllApplication.400'.
Sent request to cancel job 'GTSubControllApplication.405'.
Sent request to cancel job 'GTSubControllApplication.401'.
Sent request to cancel job 'GTSubControllApplication.404'.
Sent request to cancel job 'GTSubControllApplication.403'.
Failed canceling job 'GTSubControllApplication.411': Job 'GTSubControllApplication.411' is already in terminal state
Failed canceling job 'GTSubControllApplication.409': Job 'GTSubControllApplication.409' is already in terminal state
Failed canceling job 'GTSubControllApplication.407': Job 'GTSubControllApplication.407' is already in terminal state
Sent request to cancel job 'GTSubControllApplication.408'.
Sent request to cancel job 'GTSubControllApplication.399'.
Sent request to cancel job 'GTSubControllApplication.402'.
Sent request to cancel job 'GTSubControllApplication.406'.
(gc3pie)joel@joel-ThinkPad-E520:~$ gloud list
gloud: command not found
(gc3pie)joel@joel-ThinkPad-E520:~$ gcloud list
====================================
VMs running on EC2 resource `hobbes`
====================================
+------------+---------+--------------+-------------+--------------+--------------+---------+
|     id     |  state  |  public ip   | Nr. of jobs | Nr. of cores |   image id   | keypair |
+------------+---------+--------------+-------------+--------------+--------------+---------+
| i-000055a8 | running | 130.60.24.73 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a5 | running | 130.60.24.43 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a7 | running | 130.60.24.66 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a6 | running | 130.60.24.44 |      0      |      8       | ami-00000085 |   joel  |
| i-000055aa | running | 130.60.24.75 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ac | running | 130.60.24.79 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ab | running | 130.60.24.78 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ae | running | 130.60.24.81 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ad | running | 130.60.24.80 |      0      |      8       | ami-00000085 |   joel  |
| i-000055af | running | 130.60.24.82 |      0      |      8       | ami-00000085 |   joel  |
+------------+---------+--------------+-------------+--------------+--------------+---------+
(gc3pie)joel@joel-ThinkPad-E520:~$ gcloud list
====================================
VMs running on EC2 resource `hobbes`
====================================
+------------+---------+--------------+-------------+--------------+--------------+---------+
|     id     |  state  |  public ip   | Nr. of jobs | Nr. of cores |   image id   | keypair |
+------------+---------+--------------+-------------+--------------+--------------+---------+
| i-000055a8 | running | 130.60.24.73 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a5 | running | 130.60.24.43 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a7 | running | 130.60.24.66 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a6 | running | 130.60.24.44 |      0      |      8       | ami-00000085 |   joel  |
| i-000055aa | running | 130.60.24.75 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ac | running | 130.60.24.79 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ab | running | 130.60.24.78 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ae | running | 130.60.24.81 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ad | running | 130.60.24.80 |      0      |      8       | ami-00000085 |   joel  |
| i-000055af | running | 130.60.24.82 |      0      |      8       | ami-00000085 |   joel  |
+------------+---------+--------------+-------------+--------------+--------------+---------+
(gc3pie)joel@joel-ThinkPad-E520:~$ gcloud list
====================================
VMs running on EC2 resource `hobbes`
====================================
+------------+---------+--------------+-------------+--------------+--------------+---------+
|     id     |  state  |  public ip   | Nr. of jobs | Nr. of cores |   image id   | keypair |
+------------+---------+--------------+-------------+--------------+--------------+---------+
| i-000055a8 | running | 130.60.24.73 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a5 | running | 130.60.24.43 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a7 | running | 130.60.24.66 |      0      |      8       | ami-00000085 |   joel  |
| i-000055a6 | running | 130.60.24.44 |      0      |      8       | ami-00000085 |   joel  |
| i-000055aa | running | 130.60.24.75 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ac | running | 130.60.24.79 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ae | running | 130.60.24.81 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ad | running | 130.60.24.80 |      0      |      8       | ami-00000085 |   joel  |
| i-000055af | running | 130.60.24.82 |      0      |      8       | ami-00000085 |   joel  |
+------------+---------+--------------+-------------+--------------+--------------+---------+
(gc3pie)joel@joel-ThinkPad-E520:~$ gcloud list
====================================
VMs running on EC2 resource `hobbes`
====================================
+------------+---------+--------------+-------------+--------------+--------------+---------+
|     id     |  state  |  public ip   | Nr. of jobs | Nr. of cores |   image id   | keypair |
+------------+---------+--------------+-------------+--------------+--------------+---------+
| i-000055a5 | running | 130.60.24.43 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ae | running | 130.60.24.81 |      0      |      8       | ami-00000085 |   joel  |
| i-000055ad | running | 130.60.24.80 |      0      |      8       | ami-00000085 |   joel  |
| i-000055af | running | 130.60.24.82 |      0      |      8       | ami-00000085 |   joel  |
+------------+---------+--------------+-------------+--------------+--------------+---------+
(gc3pie)joel@joel-ThinkPad-E520:~$
Original comment by joelfiddes
on 26 Jul 2013 at 10:08
Hi Joel,
| not exactly - we saw that running `gkill -s -A` would not shut down VMs in
| RUNNING state. I have just run gkill (10.45am) on 10 running VMs (3 in
| finished state) and it seems to be shutting them all down. As of 12.02pm I have
| 4 VMs still running (but I guess they will also shut down). Is the behaviour we
| observed just a slow shutdown process then? We did not test the
| behaviour on VMs in NEW state.
Wait, wait :-)
`gkill` kills *jobs*; it does not touch VMs. Indeed, after running
`gkill` you see that none of your VMs is running a job (the `Nr. of jobs`
column is 0 for all VMs according to `gcloud list`).
VMs do not shut down automatically, because they could be "recycled"
to run another job. To shut down VMs, you need to:
- either terminate them with the `gcloud terminate` command, e.g.,
`gcloud terminate i-00001234`;
- or use the `gcloud cleanup` command, which terminates all VMs that
are not currently running any job (you need to upgrade to the latest
version of GC3Pie, as I implemented it yesterday);
- or let the GTSub script decide which VMs to terminate: it will
do that automatically when there are no more jobs to be run.
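To make the "VMs that are not currently running any job" criterion concrete, here is a minimal Python sketch that reads a saved `gcloud list` table and prints one `gcloud terminate` command per idle VM. It is only an illustration, not how `gcloud cleanup` is implemented in GC3Pie. The function name `idle_vm_ids` and the `SAMPLE` variable are made up for this example, and the sketch assumes each table row sits on a single line (not wrapped by the terminal or mail client).

```python
from typing import List


def idle_vm_ids(listing: str) -> List[str]:
    """Return ids of running VMs whose 'Nr. of jobs' column is 0.

    `listing` is the text printed by `gcloud list`; rows are assumed
    to be unwrapped, i.e. one table row per line.
    """
    header = None
    idle = []
    for line in listing.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, titles, and shell prompts
        cells = [c.strip() for c in line.strip("|").split("|")]
        if header is None:
            header = cells  # first '|' row holds the column names
            continue
        row = dict(zip(header, cells))
        if row.get("state") == "running" and row.get("Nr. of jobs") == "0":
            idle.append(row["id"])
    return idle


# Two rows copied from the last listing in this thread.
SAMPLE = """\
| id | state | public ip | Nr. of jobs | Nr. of cores | image id | keypair |
| i-000055a5 | running | 130.60.24.43 | 0 | 8 | ami-00000085 | joel |
| i-000055ae | running | 130.60.24.81 | 0 | 8 | ami-00000085 | joel |
"""

for vm_id in idle_vm_ids(SAMPLE):
    # Print the command instead of running it, so it can be reviewed first.
    print("gcloud terminate", vm_id)
```

Printing the commands rather than executing them keeps the sketch side-effect free; a VM with a non-zero job count is simply left out of the list.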
Now, is your issue with jobs or with VMs? Or both?
Original comment by riccardo.murri@gmail.com
on 26 Jul 2013 at 10:18
Hi Riccardo,
Your explanation clears it up - we can close this ticket as a user
misunderstanding, unless Tyanko has an objection.
Cheers,
Joel
Original comment by joelfiddes
on 26 Jul 2013 at 11:57
No objections from my side.
Ciao,
Tyanko
Original comment by tyanko.a...@gmail.com
on 26 Jul 2013 at 12:13
Agreed that this is a non-issue, then :-)
Original comment by riccardo.murri@gmail.com
on 13 Aug 2013 at 8:59
Original issue reported on code.google.com by joelfiddes on 23 Jul 2013 at 1:24