AcademySoftwareFoundation / OpenCue

A render management system you can deploy for visual effects and animation productions.
https://www.opencue.io
Apache License 2.0

Dependency issues #192

Closed donalm closed 5 years ago

donalm commented 5 years ago

Is there a way to submit jobs with dependencies between layers currently? I've been unable to submit a job with dependencies between layers, either via the cuesubmit GUI or using the outline Python API.

Here's one attempt:

import outline

ol = outline.Outline("Ariana", user="donalm", frame_range="1-10", shot="TestShot", show="testing")

command=["/bin/true", "-F", "{frameToken}"]
layer_0 = outline.Layer("Birgit", env={}, service="shell", chunk=1, command=command, frame_range="1-10")
ol.add_layer(layer_0)

layer_1 = outline.Layer("Christie", env={}, service="shell", chunk=1, command=command, frame_range="1-10")
layer_1.depend_on(layer_0, outline.depend.DependType.FrameByFrame)
ol.add_layer(layer_1)

launched = outline.cuerun.launch(ol, use_pycuerun=False)

I've experimented with other DependType values, but with similar results.

On the server side I get this exception:

2019-02-08 13:42:08,402 INFO grpc-default-executor-185 com.imageworks.spcue.service.JobSpec - primary service: shell birgit
2019-02-08 13:42:08,403 INFO grpc-default-executor-185 com.imageworks.spcue.service.JobSpec - primary service: shell christie
2019-02-08 13:42:08,403 ERROR grpc-default-executor-185 io.grpc.internal.SerializingExecutor - Exception while executing runnable io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed@16d15e5c
java.lang.IllegalArgumentException: No enum constant com.imageworks.spcue.grpc.depend.DependType.FRAMEBYFRAME
    at java.lang.Enum.valueOf(Enum.java:238)
    at com.imageworks.spcue.grpc.depend.DependType.valueOf(DependType.java:15)
    at com.imageworks.spcue.service.JobSpec.handleDependTag(JobSpec.java:714)
    at com.imageworks.spcue.service.JobSpec.handleDependsTags(JobSpec.java:241)
    at com.imageworks.spcue.service.JobSpec.parse(JobSpec.java:817)
    at com.imageworks.spcue.service.JobLauncher.parse(JobLauncher.java:71)
    at com.imageworks.spcue.servant.ManageJob.launchSpecAndWait(ManageJob.java:248)
    at com.imageworks.spcue.grpc.job.JobInterfaceGrpc$MethodHandlers.invoke(JobInterfaceGrpc.java:2694)
    at io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
    at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
    at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
    at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

My Python script dies noisily:

Traceback (most recent call last):
  File "/mnt/software/dev/dmcmullan/OpenCue/venv/lib/python2.7/site-packages/opencue/util.py", line 38, in _decorator
    return grpcFunc(*args, **kwargs)
  File "/mnt/software/dev/dmcmullan/OpenCue/venv/lib/python2.7/site-packages/opencue/api.py", line 320, in launchSpecAndWait
    job_pb2.JobLaunchSpecAndWaitRequest(spec=spec), timeout=Cuebot.Timeout).jobs
  File "/mnt/software/dev/dmcmullan/OpenCue/venv/lib/python2.7/site-packages/grpc/_channel.py", line 533, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/mnt/software/dev/dmcmullan/OpenCue/venv/lib/python2.7/site-packages/grpc/_channel.py", line 467, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
_Rendezvous: <_Rendezvous of RPC that terminated with:
    status = StatusCode.UNKNOWN
    details = ""
    debug_error_string = "{"created":"@1549633328.404694678","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1017,"grpc_message":"","grpc_status":2}"
>

Traceback (most recent call last):
  File "opencue_test.py", line 21, in <module>
    launched = outline.cuerun.launch(ol, use_pycuerun=False)
  File "/mnt/software/dev/dmcmullan/OpenCue/pyoutline-0.1.66-all/build/lib/outline/cuerun.py", line 94, in launch
    return launcher.launch(use_pycuerun)
  File "/mnt/software/dev/dmcmullan/OpenCue/pyoutline-0.1.66-all/build/lib/outline/cuerun.py", line 209, in launch
    return self.__get_backend_module().launch(self, use_pycuerun=use_pycuerun)
  File "/mnt/software/dev/dmcmullan/OpenCue/pyoutline-0.1.66-all/build/lib/outline/backend/cue.py", line 105, in launch
    jobs = opencue.api.launchSpecAndWait(launcher.serialize(use_pycuerun=use_pycuerun))
  File "/mnt/software/dev/dmcmullan/OpenCue/venv/lib/python2.7/site-packages/opencue/util.py", line 56, in _decorator
    .format(code=code, details=details))
opencue.exception.CueException: Encountered a server error. StatusCode.UNKNOWN : No details found. Check server logs.

Thanks

DJM

gregdenton commented 5 years ago

Thanks for bringing this to our attention, and thanks for the detailed issue report! Your outline script looks correct, and I'm able to reproduce the issue here as well. It looks like the XML output from PyOutline does not use the correct names for the dependency enums: "FRAMEBYFRAME" should be "FRAME_BY_FRAME".
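To illustrate the mismatch described above, here is a minimal sketch of how a CamelCase name such as FrameByFrame can be turned into the underscore-separated constant the server expects. The helper name `to_enum_constant` is hypothetical and this is not PyOutline's actual code or the fix that was pushed; it only shows why plain upper-casing produces the rejected value.

```python
import re


def to_enum_constant(name):
    """Convert a CamelCase name such as 'FrameByFrame' to 'FRAME_BY_FRAME'.

    Hypothetical helper for illustration only, not part of PyOutline.
    """
    # Insert an underscore at every boundary between a lowercase letter
    # (or digit) and an uppercase letter, then upper-case the result.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).upper()


# Plain upper-casing yields the value the server rejected in the log.
print("FrameByFrame".upper())

# Splitting on word boundaries first yields the expected constant name.
print(to_enum_constant("FrameByFrame"))
```

The first call prints the string seen in the IllegalArgumentException, and the second prints the name with underscores that Java's Enum.valueOf would need to match.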

I'm working on a fix for this now and will get back to you shortly.

Thanks, Greg


bcipriano commented 5 years ago

Also sounds like an example of https://github.com/imageworks/OpenCue/issues/59 - gRPC should bubble up that Cuebot error rather than die with a generic UNKNOWN.

donalm commented 5 years ago

Thank you both for looking into this.

gregdenton commented 5 years ago

@bcipriano I've added a generic catch around the submission service calls that will at least pass the server error messages to the client as an INTERNAL error. This should help, but we should still try to implement a more generic catch-all as part of #59.
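The idea of a generic catch that reports failures as INTERNAL with the original message can be sketched in plain Python. Everything here is an assumption for illustration: `RpcError`, `report_internal_errors` and `launch_spec` are hypothetical names, and the real change lives in Cuebot's Java service code, which is not shown in this thread.

```python
import functools


class RpcError(Exception):
    """Hypothetical stand-in for an RPC failure with a status and details."""

    def __init__(self, status, details):
        super().__init__("{}: {}".format(status, details))
        self.status = status
        self.details = details


def report_internal_errors(handler):
    """Wrap a handler so unexpected exceptions reach the caller as INTERNAL."""

    @functools.wraps(handler)
    def wrapper(*args, **kwargs):
        try:
            return handler(*args, **kwargs)
        except RpcError:
            # Already carries a status; pass it through unchanged.
            raise
        except Exception as exc:
            # Keep the original message so the client sees the real cause.
            raise RpcError(
                "INTERNAL", "Failed to launch and add job: {}".format(exc)
            ) from exc

    return wrapper


@report_internal_errors
def launch_spec(depend_type):
    """Hypothetical handler that rejects unknown dependency type names."""
    known = {"FRAME_BY_FRAME"}
    if depend_type not in known:
        raise ValueError("No enum constant DependType.{}".format(depend_type))
    return "launched"


try:
    launch_spec("FRAMEBYFRAME")
except RpcError as err:
    # The client now receives a status plus the underlying message.
    print(err.status, err.details)
```

With a wrapper like this, the caller gets the underlying message instead of an empty UNKNOWN status, which is the behavior the comment above describes for the submission calls.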

gregdenton commented 5 years ago

Fixes have been pushed to the master branch. Updated components include:

donalm commented 5 years ago

This works for me now - thanks for turning this around so quickly.

lgeertsen commented 5 years ago

Hi, I'm trying to launch the script from @donalm , but I get the following error:

Traceback (most recent call last):
  File "opencue-scripts/test_donalm.py", line 13, in <module>
    launched = outline.cuerun.launch(ol, use_pycuerun=False)
  File "/datas/geerstenl/venv/lib/python2.7/site-packages/pyoutline-0.2.31-py2.7.egg/outline/cuerun.py", line 99, in launch
    return launcher.launch(use_pycuerun)
  File "/datas/geerstenl/venv/lib/python2.7/site-packages/pyoutline-0.2.31-py2.7.egg/outline/cuerun.py", line 214, in launch
    return self.__get_backend_module().launch(self, use_pycuerun=use_pycuerun)
  File "/datas/geerstenl/venv/lib/python2.7/site-packages/pyoutline-0.2.31-py2.7.egg/outline/backend/cue.py", line 116, in launch
    jobs = opencue.api.launchSpecAndWait(launcher.serialize(use_pycuerun=use_pycuerun))
  File "/datas/geerstenl/venv/lib/python2.7/site-packages/pycue-0.2.31-py2.7.egg/opencue/util.py", line 59, in _decorator
    "Server caught an internal exception. {}".format(details)))
  File "/datas/geerstenl/venv/lib/python2.7/site-packages/pycue-0.2.31-py2.7.egg/opencue/util.py", line 44, in _decorator
    return grpcFunc(*args, **kwargs)
  File "/datas/geerstenl/venv/lib/python2.7/site-packages/pycue-0.2.31-py2.7.egg/opencue/api.py", line 333, in launchSpecAndWait
    job_pb2.JobLaunchSpecAndWaitRequest(spec=spec), timeout=Cuebot.Timeout).jobs
  File "/datas/geerstenl/venv/lib/python2.7/site-packages/grpc/_channel.py", line 533, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/datas/geerstenl/venv/lib/python2.7/site-packages/grpc/_channel.py", line 467, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
opencue.exception.CueInternalErrorException: Server caught an internal exception. Failed to launch and add job: Failed to parse job spec XML, org.jdom.input.JDOMParseException: Error on line 1: Attribute value "b'False'" of type NMTOKEN must be a name token.

Any idea what the problem could be?

donalm commented 5 years ago

Hey @lgeertsen - could you be seeing this other bug?

lgeertsen commented 5 years ago

@donalm Yes, that was the error. I used the release package, which is why I still had the error. I've now built from the latest git version and it works :smile:

donalm commented 5 years ago

Nice! Thanks for following up.

bcipriano commented 5 years ago

Thanks all. We'll make sure to cut a new release soon.