GafferHQ / gaffer

Gaffer is a node-based application for lookdev, lighting and automation
http://www.gafferhq.org
BSD 3-Clause "New" or "Revised" License

Render Control UI has issues starting in freshly opened scene #3814

Open andrewkaufman opened 4 years ago

andrewkaufman commented 4 years ago

Version: Gaffer 0.57.5.0-linux
Third-party tools: Arnold
Third-party modules: Many, including a Box that wraps InteractiveRender and Catalogue

Description

I've just started using the Render Control UI to start/stop renders, and normally it is working fine. But I have at least one gfr where it fails to start.

Steps to reproduce

  1. Open my gfr file (I can send a path privately, but hopefully it's reproducible without)
  2. Pin the Viewer to the IPR/Catalogue Box
  3. Push play on the Render Control UI

Nothing happens in the UI, but in the terminal you will find display driver server errors (see Debug log).

If I then go to the NodeEditor, I'll need to stop the render and start again. From here on out the Render Control UI works as expected.

Debug log


```
ERROR : ieOutputDriver:driverOpen : Could not connect to remote display driver server : Connection refused
ERROR : ieOutputDriver:driverOpen : Could not connect to remote display driver server : Connection refused
ERROR : ieOutputDriver:driverOpen : Could not connect to remote display driver server : Connection refused
ERROR : ieOutputDriver:driverOpen : Could not connect to remote display driver server : Connection refused
ERROR : ieOutputDriver:driverOpen : Could not connect to remote display driver server : Connection refused
```

themissingcow commented 4 years ago

Thanks @andrewkaufman - will try to repro

andrewkaufman commented 4 years ago

Perhaps one important detail is the Catalogue has a few images in it already (all stopped IPRs) from a previous time opening the file.

themissingcow commented 4 years ago

> Perhaps one important detail is the Catalogue has a few images in it already (all stopped IPRs) from a previous time opening the file.

I've not been able to recreate this at this end with a very similar setup. If you could share a script, that'd be great. The "connection refused" seems to be the key to me. Do you have multiple Gaffers open? Also, can you check, in that scene, what ${image:catalogue:port} expands to compared to the Catalogue's port, e.g. root['Catalogue'].displayDriverServer().portNumber()?
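The comparison requested above can be sketched as a small helper. This is only an illustration: `compare_ports` is a hypothetical name, not part of Gaffer, and it takes the already-expanded variable and the server port as plain values so that it runs without Gaffer.

```python
def compare_ports(expanded_value, server_port):
    """Compare an expanded ${image:catalogue:port} string with the
    port reported by the display driver server.

    Hypothetical helper: returns a (matches, message) tuple.
    """
    try:
        script_port = int(expanded_value)
    except (TypeError, ValueError):
        # An undefined or empty variable will not parse as an integer.
        return False, "variable did not expand to a port: %r" % (expanded_value,)

    if script_port == server_port:
        return True, "ports match: %d" % server_port

    return False, "script port %d differs from server port %d" % (
        script_port, server_port
    )


if __name__ == "__main__":
    # Example values only; real values would come from the open script.
    print(compare_ports("1559", 1559))
    print(compare_ports("1559", 40123))
```

In a Gaffer Python editor the two inputs would presumably come from expanding the variable in the script's context and from the portNumber() call quoted above; that wiring is an assumption and is left out here. A mismatch would be consistent with the "Connection refused" errors in the debug log.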

andrewkaufman commented 4 years ago

I think you're on the right track with the catalogue port. Our Box has a custom UI with custom start/stop buttons. Those buttons call a private method that synchronizes the ports... So I guess I need to hook into the Render Control UI somehow. Do I have access to the button clickedSignals currently? Or perhaps a more specific Render Control API?

themissingcow commented 4 years ago

> I think you're on the right track with the catalogue port. Our Box has a custom UI with custom start/stop buttons. Those buttons call a private method that synchronizes the ports... So I guess I need to hook into the Render Control UI somehow. Do I have access to the button clickedSignals currently? Or perhaps a more specific Render Control API?

The Render Control API is the state plug on the InteractiveRender node. All the UI in the viewer does is show the PlugValueWidget for that plug, with a little logic for when to disable/enable it. Are you able to share the code you're running? I can compare to CS then. Seems like it's something that we should just support at the Gaffer end if it's required to make things work?

andrewkaufman commented 4 years ago

It's a bit complicated to share the full code without getting on VPN, so I'll explain it a bit first. We wrap the IPR+Catalogue and several other ArnoldRender tools into a single Box, with a mode for IPR, Netrender, Local, and Batch Renders. IPR, Netrender, and Local all drive a common displayPort plug on our Box so that they all target the same Catalogue. Here it is being constructed:

        # Store the port to use. We serialize that plug so that the port is
        # always brought in from a file. These ports are invalid (the Catalogue
        # uses a different one), if we're in a gui session and not currently
        # executing a .gfr file to render an image. In the latter case, the
        # port that was brought in is correct because it allows us to connect
        # back to the instance that kicked off the rendering. For gui sessions,
        # though, we need to use the port set up by the Catalogue. We fix the
        # port when dispatching a rendering in the respective methods.
        self['parameters']['_displayPort'] = Gaffer.IntPlug()
        self['parameters']['_displayPort'].setValue( self['Catalogue'].displayDriverServer().portNumber() )

The comment above makes reference to re-synchronizing the port in GUI sessions, which we do by connecting to the play button's clickedSignal and running a bit of Python:

        # Make sure that the host and port that the rendering will be sent to are set up.
        # We need to do this here because different data might have been set before and
        # sending the pixels to that machine/port does not make sense here. It would confuse artists
        # who had to switch machines or pick up a script from somebody else. Also, the new Catalogue 
        # determines a new port whenever it's initialized so we need to update it here accordingly.
        self['parameters']['_displayHostname'].setValue( socket.gethostname() )
        self['parameters']['_displayPort'].setValue( self['Catalogue'].displayDriverServer().portNumber() )

To make things a bit more difficult, the "play button" is not the widget of the IPR state plug, it's just a button on our Box. I think this is just because people preferred "Go" and "Stop" to the ⏯️ ⏹️ icons... It's possible current users would be willing to sacrifice that if it makes fixing this issue easier...

johnhaddon commented 4 years ago

I suspect the problem here is the custom port wrangling code, not the render control UI. Can you not just use the standard ${image:catalogue:port} expression that is used in the default outputs?

andrewkaufman commented 4 years ago

I believe you're right, the trouble is with the custom ui code and not the render control ui itself. I'm just wondering how best to hook into the button clicks of the render control ui in order to run the same synchronization steps we do on our custom ui.

The history of our code is a bit confusing, so the best I was able to trace this back to was a PR where Matti tried using the context variable, and you advised against it saying:

> I came up with the '${image:catalogue:port}' config mechanism because in Gaffer I needed a way to distribute that value into all Outputs nodes, however they were made. But here we're in total control, so if you wanted to you could omit this plug entirely, and just call Catalogue.displayDriverServer().portNumber() directly, putting the result in the displayPort parameter plug using setValue().

I don't know if there were actually issues experienced with the context variable approach or if we changed it purely as a code review thing... At this stage I'd be happiest to not worry about those details and just find the appropriate UI hooks to call our port sync method, as it feels like that should be fairly trivial and definitively fix a known issue, whereas switching approach would need more thorough testing.

johnhaddon commented 4 years ago

> I'm just wondering how best to hook into the button clicks of the render control ui in order to run the same synchronization steps we do on our custom ui.

I don't think that is the right place to deal with the problem. I think the problem is that we're trying to deal with this in the UI code at all. The graph is the shared model that all UIs provide a view onto. All synchronisation should take place via the graph and graph changed signals, not by tying different UIs together directly.

I fear I may have led Matti down the wrong path in the first place, and would definitely recommend trying the context variable approach as a first step. The closer you stay to Gaffer's standard mechanisms, the less chance you have of running into problems.
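The principle described above, reacting to changes in the shared model rather than to clicks in one particular UI, can be illustrated with a minimal pure-Python sketch. `Plug` and `RenderBox` here are hypothetical stand-ins and not Gaffer classes; in Gaffer itself the equivalent hook would presumably be a graph signal such as a node's plugSetSignal(), and whether the port update lands before the render actually starts would need checking.

```python
STOPPED, RUNNING = 0, 1


class Plug:
    """Hypothetical stand-in for a graph plug that notifies on change."""

    def __init__(self, value):
        self._value = value
        self._callbacks = []

    def connect(self, callback):
        self._callbacks.append(callback)

    def getValue(self):
        return self._value

    def setValue(self, value):
        if value == self._value:
            return
        self._value = value
        # Notify every observer, whichever UI (or script) made the change.
        for callback in list(self._callbacks):
            callback(self)


class RenderBox:
    """Hypothetical stand-in for a Box wrapping a render and a catalogue."""

    def __init__(self, current_port):
        # current_port: callable returning the live display server port.
        self._current_port = current_port
        self.state = Plug(STOPPED)
        self.display_port = Plug(0)
        # Synchronise from the model, not from a button's click handler.
        self.state.connect(self._state_changed)

    def _state_changed(self, plug):
        if plug.getValue() == RUNNING:
            self.display_port.setValue(self._current_port())


if __name__ == "__main__":
    box = RenderBox(lambda: 40123)   # 40123 is an example port only
    box.display_port.setValue(1559)  # stale port loaded from a file
    box.state.setValue(RUNNING)      # any UI could have set this
    print(box.display_port.getValue())
```

Because the synchronisation hangs off the state value, a custom "Go" button, the Render Control UI, and a script calling setValue() would all trigger it in the same way. The context variable approach recommended above avoids the need for such a hook altogether.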

themissingcow commented 4 years ago

@andrewkaufman Managed to reproduce this with vanilla Gaffer. I think we're not always stopping renders when closing scripts (or not closing scripts sometimes). To repro:

  1. Take some scene and start a render.
  2. Do something to dirty the script.
  3. Choose File > Revert to Saved
  4. The RenderControl start button will be available; pressing it greys it out, with a "connection refused" message in the terminal.