abhilekhsingh / gc3pie

Automatically exported from code.google.com/p/gc3pie

shellcmd resource is not disabled when GNU time is not available (GC3Pie 2.3) #494

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I'm running into the traceback below, which turns out to be caused by GNU time 
not being available (I'm running GC3Pie on OS X). Here's what I found in the 
GC3Pie log:

CRITICAL Unable to find GNU `time` installed on your system. Please, install GNU time and set the `time_cmd` configuration option in gc3pie.conf.
Opening wrapper file /var/folders/8s/_frgh9sj6m744mxt5w5lyztr0000gn/T/eb-7WgOPn/eb-Lgk2RR/gc3libs.ENUh17/.gc3pie_shellcmd/resource_usage.txt raised an exception: Could not open file '/var/folders/8s/_frgh9sj6m744mxt5w5lyztr0000gn/T/eb-7WgOPn/eb-Lgk2RR/gc3libs.ENUh17/.gc3pie_shellcmd/resource_usage.txt' on host localhost: IOError: [Errno 2] No such file or directory: '/var/folders/8s/_frgh9sj6m744mxt5w5lyztr0000gn/T/eb-7WgOPn/eb-Lgk2RR/gc3libs.ENUh17/.gc3pie_shellcmd/resource_usage.txt'
The log output above probably shows the cause.

This is the traceback I ran into because of it:

    self._engine.progress()
  File "build/bdist.macosx-10.10-intel/egg/gc3libs/core.py", line 1272, in progress
    self._core.update_job_state(task)
  File "build/bdist.macosx-10.10-intel/egg/gc3libs/core.py", line 415, in update_job_state
    **extra_args)
  File "build/bdist.macosx-10.10-intel/egg/gc3libs/core.py", line 451, in __update_application
    raise ex
InvalidArgument: Job 'Application@108f23c90' refers to process wrapper 79786 which ended unexpectedly

Original issue reported on code.google.com by kenneth....@gmail.com on 2 Jul 2015 at 12:33

GoogleCodeExporter commented 9 years ago
I think this is now fixed in SVN r4278, can you please check?

Original comment by riccardo.murri@gmail.com on 3 Jul 2015 at 4:15

GoogleCodeExporter commented 9 years ago
Hmmmm.... no, there's still a problem when only one resource is available: that 
activates a "shortcut path" through the code and skips the checks...

Original comment by riccardo.murri@gmail.com on 3 Jul 2015 at 4:20

GoogleCodeExporter commented 9 years ago
Now it should really be fixed in SVN r4280

Original comment by riccardo.murri@gmail.com on 3 Jul 2015 at 5:05

GoogleCodeExporter commented 9 years ago
I'm not seeing the InvalidArgument problem anymore, but I can still select the 
resource via Engine.select_resource; is that what's supposed to happen?

Basically, what I have now is that the task that is submitted to the shellcmd 
backend never starts running, and so GC3Pie just keeps looping checking the 
output status of jobs...

Is there any way to register a timeout in GC3Pie by which all tasks should be 
done, so that it gives up and exits if they are not?
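
For reference, the kind of timeout asked about here can be wrapped around any polling loop from the outside. The following is only a minimal sketch in Python 3, not something GC3Pie provides as far as this thread shows: `run_until_done`, `ProgressTimeout`, and the `progress` / `all_done` callables are hypothetical names made up for illustration.

```python
import time


class ProgressTimeout(Exception):
    """Raised when the work is still unfinished at the deadline (hypothetical)."""


def run_until_done(progress, all_done, timeout, poll_interval=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Call progress() until all_done() returns True or `timeout` seconds pass.

    `progress` and `all_done` are caller-supplied zero-argument callables;
    nothing here is tied to a particular library.
    """
    deadline = clock() + timeout
    while True:
        progress()                      # one polling step
        if all_done():
            return
        if clock() >= deadline:         # give up instead of looping forever
            raise ProgressTimeout(
                "tasks still unfinished after %s seconds" % timeout)
        sleep(poll_interval)
```

In the setup described above, `progress` would presumably be the `self._engine.progress` call from the traceback, and `all_done` a check on the state of the submitted tasks; both mappings are assumptions.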

Original comment by kenneth....@gmail.com on 3 Jul 2015 at 7:18

GoogleCodeExporter commented 9 years ago
Can you please test SVN r4282?

Original comment by riccardo.murri@gmail.com on 3 Jul 2015 at 8:55

GoogleCodeExporter commented 9 years ago
I'm now setting this:

  File "easybuild/tools/parallelbuild.py", line 112, in build_easyconfigs_in_parallel
    live_job_backend.complete()
  File "easybuild/tools/job/gc3pie.py", line 258, in complete
    self._engine.progress()
  File "build/bdist.macosx-10.10-intel/egg/gc3libs/core.py", line 1281, in progress
    "No resources available for running jobs.")
NoResources: No resources available for running jobs.

So, the shellcmd is now correctly disabled as a resource, but the 
'select_resource' still works?

Original comment by kenneth....@gmail.com on 3 Jul 2015 at 9:11

GoogleCodeExporter commented 9 years ago
s/setting/getting/ :)

Original comment by kenneth....@gmail.com on 3 Jul 2015 at 9:12

GoogleCodeExporter commented 9 years ago
By the time `select_resource` is called, the Shellcmd backend does not yet know 
that GNU time is not available.

I think that the `NoResources` exception in `Engine.progress` is the best that 
can be done now: checking for resource sanity on initialization would prevent 
`gservers` from displaying useful output when a resource is not reachable or 
badly configured.
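
The behaviour described above (selection succeeds, the check happens later, and the error only surfaces in `progress`) can be illustrated with a small Python 3 sketch. This is not GC3Pie's code: the `Resource`, `Engine`, and `NoResources` classes below are simplified stand-ins that borrow names from this thread, and `has_gnu_time` is a hypothetical heuristic that assumes GNU time identifies itself when run with `--version`.

```python
import subprocess


class NoResources(Exception):
    """Stand-in for the exception named in the thread; not GC3Pie's class."""


def has_gnu_time(time_cmd):
    """Heuristic (an assumption): GNU time mentions 'GNU' in --version output."""
    try:
        proc = subprocess.run([time_cmd, "--version"],
                              stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                              universal_newlines=True)
    except OSError:                      # command missing or not executable
        return False
    return proc.returncode == 0 and "gnu" in (proc.stdout + proc.stderr).lower()


class Resource:
    def __init__(self, name, check):
        self.name = name
        self.enabled = True
        self._check = check              # zero-argument callable, True if usable

    def validate(self):
        """Run the deferred sanity check; disable the resource if it fails."""
        if self.enabled and not self._check():
            self.enabled = False
        return self.enabled


class Engine:
    def __init__(self, resources):
        self.resources = list(resources)

    def select_resource(self, name):
        # Selection only filters by name; no sanity check has run yet.
        self.resources = [r for r in self.resources if r.name == name]

    def progress(self):
        # Checks run on first use, so a broken resource is noticed only here.
        usable = [r for r in self.resources if r.validate()]
        if not usable:
            raise NoResources("No resources available for running jobs.")
        return usable
```

With `Resource("localhost", check=lambda: has_gnu_time("time"))` on a machine without GNU time, `select_resource("localhost")` still succeeds and the first `progress()` call raises `NoResources`, which mirrors what was reported above.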

Original comment by riccardo.murri@gmail.com on 3 Jul 2015 at 10:04

GoogleCodeExporter commented 9 years ago
good enough for me, the error message is quite clear

Original comment by kenneth....@gmail.com on 3 Jul 2015 at 10:19

GoogleCodeExporter commented 9 years ago

Original comment by riccardo.murri@gmail.com on 3 Jul 2015 at 8:52