dblanque / flamenco-compositor-script

Scriptset to enable compositing in the Blender Flamenco Network Renderer
GNU General Public License v3.0

Errors in worker threads, can't find paths #5

Open captainstarfish opened 5 months ago

captainstarfish commented 5 months ago

Cloned latest version, installed per directions.

Blender 4.1.1 installed, latest flamenco via separate docker containers for the manager and the worker. These work fine (other than ignoring the compositing nodes) without these plugins.

I've put the startup_script.py in the jobs directory of the shared drive, and the js file is in the scripts directory of the manager container. It appears in the Blender add-on when I hit "fetch job types" so I'm assuming it's in the right place.

flamenco-manager.yaml looks like this. Apologies for the formatting, the code tags absolutely munt the data.

_meta:
  version: 3
manager_name: Flamenco Manager
database: flamenco-manager.sqlite
listen: :8080
autodiscoverable: true
local_manager_storage_path: ./flamenco-manager-storage
shared_storage_path: /share
shaman:
  enabled: false
  garbageCollect:
    period: 24h0m0s
    maxAge: 744h0m0s
    extraCheckoutPaths: []
task_timeout: 10m0s
worker_timeout: 1m0s
blocklist_threshold: 3
task_fail_after_softfail_count: 3
variables:
  blender:
    values:

Yes, /share is a terrible place to put things but it's what I've inherited and one battle at a time...

But it almost looks like the job is tacking the clientStoragePath path onto the current working directory rather than starting from the root.

blender -b -y -P '{clientStoragePath}/{jobSubPath}/startup_script.py' '{clientStoragePath}/{jobSubPath}/2024-05-06-164636.275479-DryrClean/DryrClean2.flamenco.blend' -noaudio --render-output '{clientStoragePath}/{renderSubPath}/2024-05-06-164636.275479-DryrClean/######' --render-format PNG --render-frame 1 --python-expr 'import bpy; bcs = bpy.context.scene; bcs.render.use_compositing = True; bcs.use_nodes = True' -- --custom-script --device-type '{deviceType}'

pid=82 > Blender 4.1.1 (hash e1743a0317bc built 2024-04-15 23:47:45)
pid=82 > Read prefs: "/root/.config/blender/4.1/config/userpref.blend"
pid=82 > OSError: Python file "/code/flamenco/{clientStoragePath}/{jobSubPath}/startup_script.py" could not be opened: No such file or directory
pid=82 > Error: Cannot read file "/code/flamenco/{clientStoragePath}/{jobSubPath}/2024-05-06-164636.275479-DryrClean/DryrClean2.flamenco.blend": No such file or directory
pid=82 >
pid=82 > Blender quit
Failed: command exited abnormally with code 1
2024-05-06T08:46:38Z Task failed by 1 worker, Manager will mark it as soft failure. 2 more failures will cause hard failure.
2024-05-06T08:46:38Z task changed status active -> soft-failed

dblanque commented 5 months ago

Hey man! Are you using it with or without SHAMAN?

dblanque commented 5 months ago

Could you also show me the output of the tree command on the shared directory path, just to see the directory structure? (You can install it with apt-get install tree.)

dblanque commented 5 months ago

The startup_script.py needs to be in your Shared Storage root, here's an example. image


captainstarfish commented 5 months ago

Hey - thanks for the quick response!

Using without SHAMAN. In the yaml: shaman: enabled: false

I started with the script in the root level of the share, but for some reason the logs made me think it was looking for it in the jobs subdirectory.

I moved it back and tried again. Render job failed with log:

blender -b -y -P '{clientStoragePath}/{jobSubPath}/startup_script.py' '{clientStoragePath}/{jobSubPath}/2024-05-09-080451.569866-testcompnodes/untitled.flamenco.blend' -noaudio --render-output '{clientStoragePath}/{renderSubPath}/2024-05-09-080451.569866-testcompnodes/######' --render-format PNG --render-frame 1 --python-expr 'import bpy; bcs = bpy.context.scene; bcs.render.use_compositing = True; bcs.use_nodes = True' -- --custom-script --device-type '{deviceType}'

pid=666 > Blender 4.1.1 (hash e1743a0317bc built 2024-04-15 23:47:45)
pid=666 > Read prefs: "/root/.config/blender/4.1/config/userpref.blend"
pid=666 > OSError: Python file "/code/flamenco/{clientStoragePath}/{jobSubPath}/startup_script.py" could not be opened: No such file or directory
pid=666 > Error: Cannot read file "/code/flamenco/{clientStoragePath}/{jobSubPath}/2024-05-09-080451.569866-testcompnodes/untitled.flamenco.blend": No such file or directory
pid=666 >
pid=666 > Blender quit
Failed: command exited abnormally with code 1
2024-05-09T00:04:54Z Task failed by worker cl4p-trp (d692f23c-5221-4e99-aeda-a7a65c1a0897), Manager will fail the entire job as there are no more workers left for tasks of type "blender".
2024-05-09T00:04:54Z task changed status active -> failed

Tree in the share looks like: image

Thanks for looking into it!

dblanque commented 5 months ago

From what I can see it's also unable to find the blender file itself, notice this line:

pid=666 > Error: Cannot read file "/code/flamenco/{clientStoragePath}/{jobSubPath}/2024-05-09-080451.569866-testcompnodes/untitled.flamenco.blend": No such file or directory

Ah, I think the jobs folder is named differently when not using shaman, try changing the jobSubPath to job-storage instead of the default jobs in your flamenco-manager.yaml, and the render subpath to renders.

You might have to re-fetch and re-submit the job every time you do those kinds of changes.

Sorry if these might seem like silly questions but... Do you have the correct mountpoint for linux clients setup in the flamenco-manager.yaml? Also, have you checked that they can effectively read the shared storage files?

It... seems like Flamenco is not replacing the path variables in your Blender command.

dblanque commented 5 months ago

You don't have the correct value for jobSubPath set in your flamenco-manager.yaml. It's currently set at job_storage instead of job-storage.

captainstarfish commented 5 months ago

> Ah, I think the jobs folder is named differently when not using shaman, try changing the jobSubPath to job-storage instead of the default jobs in your flamenco-manager.yaml, and the render subpath to renders.

> You don't have the correct value for jobSubPath set in your flamenco-manager.yaml.

I'm so sorry to have wasted your time with such a stupid mistake. Fixed, unfortunately issue remains. image

> Sorry if these might seem like silly questions but...

As you can see from my amateur hour derp, silly questions are probably appropriate!

Regarding mountpoints:

Flamenco manager:
docker exec -it flamenco-manager /bin/bash
root@93e9e4ed06dc:/code/flamenco# ls /share
Service flamenco-addon.zip job-storage renders startup_script.py

Flamenco worker:
docker exec -it flamenco-worker /bin/bash
root@b337607ae6a8:/code/flamenco# ls /share
Service flamenco-addon.zip job-storage renders startup_script.py

And they're visible the same from the windows shares. Also the job file is readable. So I think yes, mounts are good.

The braces around the variable names in the paths certainly make it look like substitutions aren't happening. Any thoughts?
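
For anyone following along, the symptom can be sketched in a few lines. The `substituteVariables` helper below is hypothetical and is not Flamenco's actual implementation; it only assumes that a placeholder with no matching variable is left untouched, which is what the logs above appear to show. The values `/share` and `job-storage` are the ones mentioned in this thread.

```javascript
// Hypothetical helper for illustration only, NOT Flamenco's real code.
// Replaces each {name} with its value; unknown names are left as-is.
function substituteVariables(text, variables) {
    return text.replace(/\{(\w+)\}/g, (placeholder, name) =>
        Object.prototype.hasOwnProperty.call(variables, name)
            ? variables[name]
            : placeholder
    );
}

const template = "{clientStoragePath}/{jobSubPath}/startup_script.py";

// Both variables defined: the result is an absolute path on the share.
const allDefined = substituteVariables(template, {
    clientStoragePath: "/share",
    jobSubPath: "job-storage",
});

// No variables defined: the braces survive, as in the failing command.
const noneDefined = substituteVariables(template, {});

console.log(allDefined);  // /share/job-storage/startup_script.py
console.log(noneDefined); // {clientStoragePath}/{jobSubPath}/startup_script.py
```

If the worker receives the second form, the braces in the logged Blender command are exactly what you would expect to see.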

dblanque commented 5 months ago

Well I tested it with version 4.0 of Blender, but I don't think the Flamenco Manager version changed since I tested it, so I'm thinking it's either something to do with having Shaman on/off, using it in a Docker container, or I don't know what else.

The strange thing is that the paths use the same variable substitution as the Blender executable and its arguments (feel free to check the JavaScript code). So if one doesn't work, the other one shouldn't either.


for (let chunk of chunks) {
    const task = author.Task(`render-${chunk}`, "blender");
    const command = author.Command("blender-render", {
        exe: "{blender}",
        exeArgs: "{blenderArgs}",
        argsBefore: [
            "-P", path.join("{clientStoragePath}", "{jobSubPath}", "startup_script.py"),
        ],
        blendfile: path.join("{clientStoragePath}", "{jobSubPath}", new_job_name, blendfile_name),
        args: [
            "-noaudio",
            "--render-output", path.join("{clientStoragePath}", "{renderSubPath}", new_job_name, path.basename(renderOutput)),
            // ▼ Original Render Output Argument ▼
            // "--render-output", path.join(renderDir, path.basename(renderOutput)),
            "--render-format", settings.format,
            "--render-frame", chunk.replace("-", ".."), // Convert to Blender frame range notation.
            "--python-expr", "import bpy; bcs = bpy.context.scene; bcs.render.use_compositing = True; bcs.use_nodes = True",
            "--", // ◄ Blender ignores every argument after this line
            "--custom-script",
            "--device-type", "{deviceType}",
        ],
    });
    task.addCommand(command);
    renderTasks.push(task);
}
return renderTasks;
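
To see what string the snippet above hands to the worker before any substitution happens, here is a small sketch. It uses Node's POSIX `path` module as a stand-in, on the assumption that it joins segments the same way as the `path` object available inside Flamenco job scripts; `newJobName` and the blend file name are copied from the log earlier in this thread.

```javascript
// Stand-in for illustration: Node's POSIX path module, assumed to behave
// like the `path` object that Flamenco job compiler scripts receive.
const path = require("path").posix;

// Values copied from the failing job's log in this thread.
const newJobName = "2024-05-09-080451.569866-testcompnodes";
const blendfileName = "untitled.flamenco.blend";

// Same joins as in the job script above; the placeholders are plain strings here.
const scriptPath = path.join("{clientStoragePath}", "{jobSubPath}", "startup_script.py");
const blendPath = path.join("{clientStoragePath}", "{jobSubPath}", newJobName, blendfileName);

console.log(scriptPath); // {clientStoragePath}/{jobSubPath}/startup_script.py
console.log(blendPath);

// Until the placeholders are replaced, neither string is an absolute path.
console.log(path.isAbsolute(scriptPath)); // false
```

The joined strings match the arguments in the logged Blender command, so the job script itself seems to be building them as intended.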

Also beware of the variables in your path, is your shared folder effectively in /code/flamenco/share/job-storage/?

I find it strange that you have the variables defined and they're not being replaced. Have you checked that the Flamenco Manager is effectively using the correct YAML config file?

captainstarfish commented 5 months ago

> Have you checked that the Flamenco Manager is effectively using the correct YAML config file?

I switched to CPU in the flamenco-manager.yaml and watched the next job absolutely choke the CPU and go super slow. Back to CUDA (turns out that card doesn't support Optix) and it's running fast again. So I'm thinking it's picking up the config file just fine.

> Also beware of the variables in your path, is your shared folder effectively in /code/flamenco/share/job-storage/?

No, we're mounting the shared volume to /share so the path would be /share/job-storage/

Does it need to be under /code/flamenco?


dblanque commented 5 months ago

> I switched to CPU in the flamenco-manager.yaml and watched the next job absolutely choke the CPU and go super slow. Back to CUDA (turns out that card doesn't support Optix) and it's running fast again. So I'm thinking it's picking up the config file just fine.

Wait so you have worker nodes where this works and others where it doesn't?

> Does it need to be under /code/flamenco?

Well, I mentioned it because your Flamenco Workers are picking up the path: /code/flamenco/{clientStoragePath}/{jobSubPath}/2024-05-09-080451.569866-testcompnodes/untitled.flamenco.blend from somewhere.

It should not have /code/flamenco/ there. This feels like a Flamenco problem with path/variable replacement when not using Shaman but I wouldn't want to annoy Sybren (the creator of Flamenco) if it was unrelated...
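
A likely source of the /code/flamenco/ prefix, sketched under assumptions: the unsubstituted argument does not start with a slash, so it is a relative path, and relative paths get resolved against the process's working directory. The container prompts earlier in the thread show /code/flamenco as that directory. Node's `path.posix.resolve` is used here only to demonstrate the resolution rule; it is not a claim about how Blender or the Flamenco Worker resolve paths internally.

```javascript
// Illustration of relative vs. absolute path resolution using Node's
// POSIX path module. This is a stand-in, not Blender's or Flamenco's code.
const path = require("path").posix;

// Working directory seen in the container prompts in this thread.
const workerCwd = "/code/flamenco";

// Argument as logged (placeholders never replaced) vs. the intended value.
const unsubstituted = "{clientStoragePath}/{jobSubPath}/startup_script.py";
const substituted = "/share/job-storage/startup_script.py";

// A relative path is appended to the working directory...
const resolvedBroken = path.resolve(workerCwd, unsubstituted);
// ...while an absolute path ignores the working directory entirely.
const resolvedOk = path.resolve(workerCwd, substituted);

console.log(resolvedBroken); // /code/flamenco/{clientStoragePath}/{jobSubPath}/startup_script.py
console.log(resolvedOk);     // /share/job-storage/startup_script.py
```

Read this way, the prefix would be a side effect of the missing substitution rather than a separate path bug.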

captainstarfish commented 5 months ago

No, only one worker node. Sorry, wasn’t clear. The config works with the default simple render task type that ships with flamenco. The optix/cuda/cpu thing was just making sure the manager was picking up that yaml.

With the path it’s like it’s ignoring the leading slash in front of the path and just smooshing it onto the current working directory.

Don’t waste any more time on this, Dylan. If we have time at work in a couple weeks I’ll have another poke at it but in the meantime we’re getting acceptable renders from the default task so I’m not in a hurry. What I’ll probably do is switch on shaman and see if things start working.

Cheers!

dblanque commented 5 months ago

No worries, I'll see if I can test an instance without Shaman when I have time and get back to you on it. I'm pretty sure that the path being added there is what's causing the problem with the variable substitutions.

dblanque commented 4 months ago

Alright, so I've been testing a bit today from my Windows machine onto a Linux Samba NAS and it definitely breaks with Shaman off. image

If you need compositing I'd recommend you try enabling Shaman and wide links (check the Flamenco documentation) on your NAS and test it out that way. As for why it's not working without Shaman, I still have to test out and debug, I haven't checked out if it's properly replacing the path variables on Linux without Shaman either, but it probably isn't.

I might have to report to Sybren on this.

captainstarfish commented 4 months ago

Blasted Windows strikes again...

Thanks Dylan, Shaman was on my to-do list for today!
