getavalon / docker

Avalon in a Dockerfile
MIT License

CGWire #15

Closed tokejepsen closed 6 years ago

tokejepsen commented 6 years ago

This is a PR to include CGWire into Avalon's distribution.

tokejepsen commented 6 years ago

There are currently two issues to tackle:


mottosso commented 6 years ago

Whop! There's a lot to condense here. Off the top of my head:

tokejepsen commented 6 years ago

Mongo can be run as a daemon with --fork. This requires logging activity to a file, which is currently /avalon/mongo.log.
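As a rough sketch of that invocation, assuming mongod is on the PATH: --fork and --logpath are real mongod options, while the helper name below is made up for illustration.

```python
def mongod_daemon_command(logpath="/avalon/mongo.log"):
    """Build the argument list for starting mongod in the background.

    --fork detaches the server from the terminal. mongod will not fork
    unless it is also told where to log, hence --logpath.
    """
    return ["mongod", "--fork", "--logpath", logpath]


# Print the command line instead of running it, so the sketch is safe
# to execute on a machine without MongoDB installed.
print(" ".join(mongod_daemon_command()))
```

Passing the returned list to subprocess.check_call would start the server for real.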

mottosso commented 6 years ago

I whipped up a script to sync CGWire -> Avalon.

import os
import gazu
from avalon import io as avalon

# Note: global..

# Note: plain-text..
gazu.log_in("", "default")

print("Logged in..")
projects = []
objects = []

for project in gazu.project.all_projects():
    assets = gazu.asset.all_assets_for_project(project)
    shots = gazu.shot.all_shots_for_project(project)

    # Assets and shots both become Avalon assets, told apart by silo
    for entities, silo in ((assets, "assets"), (shots, "shots")):
        for entity in entities:
            objects.append({
                "schema": "avalon-core:asset-2.0",
                "name": entity["name"].replace(" ", ""),  # remove spaces
                "silo": silo,
                "data": {},
                "type": "asset",
                "parent": project["name"],
            })

    projects.append({
        "schema": "avalon-core:project-2.0",
        "type": "project",
        "name": project["name"],
        "data": {},
        "parent": None,
        "config": {
            "schema": "avalon-core:config-1.0",
            "apps": [],
            "tasks": [
                {"name": task["name"]}
                for task in gazu.task.all_task_types()
            ],
            # Note: the template entries did not survive in this post
            "template": {},
        },
    })

print("%d projects" % len(projects))
print("%d assets" % len(objects))

os.environ["AVALON_PROJECTS"] = r""
os.environ["AVALON_PROJECT"] = "temp"
os.environ["AVALON_ASSET"] = "bruce"
os.environ["AVALON_SILO"] = "assets"
os.environ["AVALON_CONFIG"] = "polly"
os.environ["AVALON_MONGO"] = "mongodb://"

# Assumed: connect using the environment above; the call itself
# did not survive in this post
avalon.install()

existing_projects = {}
existing_assets = {}
installed_projects = []

print("Fetching Avalon data..")
for project in avalon.projects():
    existing_projects[project["name"]] = project

for asset in avalon.find({"type": "asset"}):
    existing_assets[asset["name"]] = asset

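for project in projects:
    if project["name"] in existing_projects:
        continue  # already in Avalon

    print("Installing project: %s" % project["name"])
    os.environ["AVALON_PROJECT"] = project["name"]

    # Assumed: reconnect so the new AVALON_PROJECT takes effect,
    # then write the project document
    avalon.uninstall()
    avalon.install()
    avalon.insert_one(project)

for asset in objects:
    if asset["name"] in existing_assets:
        continue  # already in Avalon

    asset["parent"] = avalon.locate([asset["parent"]])
    print("Installing asset: %s" % asset["name"])
    avalon.insert_one(asset)  # assumed, as for projects above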

Here's what I'm thinking.

  1. Down the line, I expect Avalon to be listening for events happening in CGWire, and synchronise as it happens.
  2. Synchronisation would happen both ways, from changes happening in Avalon -> CGWire and vice versa.
  3. Until then, a simple polling synchronisation should suffice, whereby Avalon polls CGWire for changes at a fixed interval, such as every 10 seconds. The above script is what I expect can run every 10 seconds.

This should help us prove the concept, before going much further.
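The polling idea in point 3 could look roughly like this. It is a minimal sketch: the function name, the retry-on-error behaviour and the max_iterations escape hatch are assumptions for illustration, not part of the script above.

```python
import time


def poll(sync, interval=10.0, max_iterations=None, sleep=time.sleep):
    """Call `sync` repeatedly, pausing `interval` seconds between runs.

    max_iterations=None loops forever; a number is handy for testing.
    A failing run is reported and retried on the next tick, so a
    temporary outage does not kill the loop.
    """
    count = 0
    while max_iterations is None or count < max_iterations:
        try:
            sync()
        except Exception as exc:
            print("Sync failed, retrying in %.0fs: %s" % (interval, exc))
        count += 1
        sleep(interval)
    return count
```

With the script wrapped in a function, say a hypothetical sync_cgwire_to_avalon(), running it every 10 seconds is then poll(sync_cgwire_to_avalon).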

Writing this, it became rather clear that our is ill devised. It's already showing legacy tendencies from the days before Launcher was made, such as expecting the environment to be fully qualified with project and all before being used. For a rainy day, we could have a look at breaking that dependency and making it into a more generic database browsing utility which it was ultimately made to be.

tokejepsen commented 6 years ago

Very cool.

Would be good to have a centralized api or similar that needs implementing for other project managers, so it's easier to know what needs doing. Would the be the place for this?

mottosso commented 6 years ago

Yes, is the gateway to the database. builds on top of that, and is effectively a higher level version of, with an understanding of assets and things.

We should pop up a separate "refactoring PR" about it, but in a nutshell, one of the higher level goals of was to avoid having to explicitly reference the server and project whenever accessing the database from within a host.

That is, I wanted:

from avalon import api
for asset in
  print(asset)  # list assets from current project

As opposed to..

from avalon import api

client = api.Client("mongodb://")
db = client["avalon"][api.current_project()]
for asset in

The cost of the convenience however is more "under the hood" stuff, like having a project and db address set in the environment, upfront, which in retrospect complicates other aspects like what we're trying to do right now.
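To make the trade-off concrete, here is a toy comparison that uses a plain dict as a stand-in for the database. Everything in it (FAKE_SERVER, both function names, the "hulk" and "cube" entries) is hypothetical and only mirrors the shape of the two styles.

```python
import os

# Stand-in for a database server: {project name: [documents]}
FAKE_SERVER = {
    "temp": [{"type": "asset", "name": "bruce"}],
    "hulk": [{"type": "asset", "name": "cube"}],
}


def find_implicit(doc_type):
    """Convenient: the project is taken from the environment."""
    project = os.environ["AVALON_PROJECT"]  # KeyError unless set upfront
    return [d for d in FAKE_SERVER[project] if d["type"] == doc_type]


def find_explicit(project, doc_type):
    """Verbose: the caller names the project on every call."""
    return [d for d in FAKE_SERVER[project] if d["type"] == doc_type]


# A tool that walks every project has to keep rewriting the environment
# in the implicit style...
for name in sorted(FAKE_SERVER):
    os.environ["AVALON_PROJECT"] = name
    print(name, find_implicit("asset"))

# ...whereas the explicit style needs no global state at all.
for name in sorted(FAKE_SERVER):
    print(name, find_explicit(name, "asset"))
```

The first loop is the same dance the sync script does when it sets AVALON_PROJECT before each write.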

mottosso commented 6 years ago

One more thing about the synchronisation.

Initially I was thinking that maybe it'd be worth switching Avalon to using the CGWire database entirely; skip the synchronisation step. But having interacted with it, on the surface there are a few problems with that.

  1. There are a ton of assumptions made to our disfavour, primarily the fixed structure of project, episode, sequence and shot. It is fixed all the way into the individual function calls, which include the words themselves; very hard to refactor.
  2. Reading/writing (to Postgres) is surprisingly slow; I counted 40 calls/sec on getting 3 users from it. Compare this to getting 3 assets from Mongo at 45,000 requests/sec (see below). This is important, because we've been building GUIs that rely on this speed to avoid things like caching, progress bars and hourglasses, and because fast queries mean we can afford much more complicated ones where necessary.
  3. Finally, switching to CGWire would still involve a synchronisation step with other frameworks like Shotgun and ftrack, so we aren't gaining much anyway.

import timeit
from avalon import io as avalon  # same handle as in the sync script

num = 100
dur = timeit.timeit(lambda: avalon.find({"type": "asset"}), number=num)
print("%.3f/sec" % (num / dur))

TLDR; we should stick to Mongo internally.
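One thing worth keeping in mind when comparing numbers like these: pymongo's find() hands back a lazy cursor and only talks to the server once the cursor is iterated. Assuming avalon.find passes that cursor straight through, wrapping the call in list() would time the full round trip. The sketch below shows the effect with stand-in functions; lazy_query and consumed_query are hypothetical and do not touch a database.

```python
import timeit


def calls_per_second(func, number=100):
    """Return how many times per second `func` can be called."""
    duration = timeit.timeit(func, number=number)
    return number / duration


def lazy_query():
    # Stand-in for a call that returns a lazy cursor: building the
    # generator does none of the actual work.
    return (i * i for i in range(10000))


def consumed_query():
    # Draining the result forces the work to happen.
    return list(lazy_query())


print("lazy:     %.0f/sec" % calls_per_second(lazy_query))
print("consumed: %.0f/sec" % calls_per_second(consumed_query))
```

For the Mongo measurement that would mean timing lambda: list(avalon.find({"type": "asset"})) instead.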

tokejepsen commented 6 years ago

Currently can't get a working version of this, not even to the point of getting the Kitsu website up. Running into this issue:

Step 30/34 : RUN echo Initialising Zou... &&     /opt/zou/
 ---> Running in f9452a43292f
/bin/sh: 1: /opt/zou/ not found

But the file /opt/zou/ clearly is present and copied just a couple of lines earlier.

tokejepsen commented 6 years ago

Interestingly I had to switch the line-endings on to Unix style from Windows style endings.

I assume this is because I'm developing on Windows, and when cloning the repository Git assumes Windows style endings.
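This would also explain the "not found" error on a file that clearly exists: with Windows endings the first line of a shell script ends in a carriage return, so the system looks for an interpreter whose name includes that extra character. A small sketch of the conversion, with hypothetical helper names:

```python
def to_unix_line_endings(data):
    """Convert CRLF (Windows) line endings in `data` (bytes) to LF."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path):
    # Read and write in binary mode so Python does not translate
    # newlines on its own.
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(to_unix_line_endings(data))


print(to_unix_line_endings(b"#!/bin/sh\r\necho hi\r\n"))
```

To stop the problem at the source, a .gitattributes entry such as "*.sh text eol=lf" tells Git to check those files out with Unix endings on every platform.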

tokejepsen commented 6 years ago

First working dockerfile. Woop woop!


Apart from the line endings problem, there is also an issue when creating the default admin user with zou. The email is being checked for validity here, so it can't be the example Don't know how CGWire did this, with that email.

tokejepsen commented 6 years ago

I have managed to remove all but two external files. I didn't remove nginx.conf and supervisord.conf because they are quite long, and I could not figure out a way of creating the files in the Dockerfile.

I'm also not entirely happy about the syntax for creating the files, because it's a lot of RUN commands with echo. It can probably be improved for readability.

Lastly I have temporarily disabled the creation of the user because of this.

tokejepsen commented 6 years ago

For starters we should either get CGWire out of supervisord, which it is using for running services in the background, or we get Mongo and Samba into supervisord.

It'll probably be very easy to get Mongo and Samba in supervisord. Do you mean we'll use supervisor to reduce the entrypoint to supervisord -c /etc/supervisord.conf ?

mottosso commented 6 years ago

It'll probably be very easy to get Mongo and Samba in supervisord. Do you mean we'll use supervisor to reduce the entrypoint to supervisord -c /etc/supervisord.conf ?

Yeah, seems reasonable I think.
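For reference, a minimal sketch of what such a supervisord.conf might contain, generated from Python so the structure is easy to see. The program names and command lines are assumptions, not the ones used in this PR. Note that supervisor expects its programs to stay in the foreground, so mongod is started without --fork here.

```python
import configparser
import io


def supervisord_conf(programs):
    """Render a minimal supervisord.conf from {name: command}."""
    config = configparser.ConfigParser(interpolation=None)
    # nodaemon keeps supervisord itself in the foreground, which a
    # container needs from its main process.
    config["supervisord"] = {"nodaemon": "true"}
    for name, command in programs.items():
        config["program:%s" % name] = {
            "command": command,
            "autorestart": "true",
        }
    out = io.StringIO()
    config.write(out)
    return out.getvalue()


print(supervisord_conf({
    "mongod": "mongod --logpath /avalon/mongo.log",
    "smbd": "smbd --foreground --no-process-group",
}))
```

The entrypoint then stays at supervisord -c /etc/supervisord.conf, and adding a service is one more [program:x] section.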

tokejepsen commented 6 years ago

That is Mongo and Samba running in supervisor. supervisor is actually quite a neat framework.

Dunno what to do about the existing external files and the amount of RUN commands.

Should this PR be with the CGWire sync script as well? Imagining this to be run through supervisor.

mottosso commented 6 years ago

Should this PR be with the CGWire sync script as well?

I think we can make that an independent PR. Think we have a few things to work out that don't involve getting CGWire up and running.

Imagining this to be run through supervisor.

Yeah, I think a plain Python process running on an infinite while loop should suffice, running some function and sleeping for 10 seconds.

Then we can open up a dialog with Frank about what we need from CGWire in terms of callbacks to get rid of it.

mottosso commented 6 years ago

Dunno what to do about the existing external files and the amount of RUN commands.

I'll do a pass over it now, see what I can do.

tokejepsen commented 6 years ago

This is looking good to me. Think we still need to figure out the email issue before we can merge.

mottosso commented 6 years ago

Ok, works!

$ docker run \
  --name avalon \
  -e AVALON_USERNAME=avalon \
  -e \
  -e AVALON_PASSWORD=default \
  -v avalon-db:/data/db \
  --rm -ti \
  -p 445:445 \
  -p 27017:27017 \
  -p 80:80 avalon/docker:0.4

Where the -e flags are optional.

tokejepsen commented 6 years ago

Nicely done! I'm pretty happy with this. Merge?

mottosso commented 6 years ago

Done! Would you like to give the multi-container approach a try next?

tokejepsen commented 6 years ago

Sure, lets

mottosso commented 6 years ago

Oh no, I just realised something. The email is currently hardcoded into the image, as it's being installed on build, not on run.

Something to keep in mind for the multi approach.

tokejepsen commented 6 years ago

That is a good point. Maybe we should have it as part of the entry points, that it'll make an admin account if an email is passed in?
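A sketch of how that could look in an entry point, assuming the email arrives through an environment variable. The variable name AVALON_EMAIL and the create_admin callback are hypothetical; the callback would wrap whatever command Zou provides for creating the account.

```python
def maybe_create_admin(environ, create_admin):
    """Create the admin account at start-up if an email was passed in.

    Returns True if `create_admin` was called, False otherwise.
    """
    email = environ.get("AVALON_EMAIL", "").strip()
    if not email:
        return False  # nothing passed in, leave accounts untouched
    create_admin(email)
    return True


# Demo with a fake environment and a callback that only records the call
created = []
print(maybe_create_admin({"AVALON_EMAIL": "someone@studio.test"}, created.append))
print(maybe_create_admin({}, created.append))
print(created)
```

In the container, environ would be os.environ, which moves the decision from build time to run time.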

Even more of a reason for splitting the tracking container from the rest. Going to be very specific behaviour for Ftrack and friends.

tokejepsen commented 6 years ago

The email is currently hardcoded into the image, as it's being installed on build, not on run.

Think the approach will be that will be the default admin user, and from that account users will create other admin accounts with more secure login, then delete the account.