LogicReinc / LogicReinc.BlendFarm

A stand-alone Blender Network Renderer
GNU General Public License v3.0

Cannot start render on a non local node #49

Closed sprogiman closed 1 year ago

sprogiman commented 1 year ago

I have two computers. Blender work is done locally on PC1: Windows 11 21H2, Ryzen 9 5900X, Nvidia GTX 1080, 32 GB DDR4.

I want to add my other computer as a node to help out with rendering. PC2: Windows 11 22H2, Ryzen 5 3600X, Nvidia RTX 3080, 16 GB DDR4.

Note: client and server are both on the latest version, 1.1.3.

I download and run the Client on PC1 and select Blender 3.0.1, since the work is done in 3.0.1 and I want maximum compatibility. I then select a .blend file, which loads up fine.

Then I start the Server on PC2. It seems to start fine, with no errors.

On PC1 I add the PC2 node, and PC2 downloads Blender 3.0.1 without problems. On PC1 I then hit "Sync all", which also works: in the "BlenderFiles" folder on PC2 I can see that the .blend file has synced over.

PC1 reports as Localhost:15000 (on the LAN as 192.168.1.161).
PC2 reports as 192.168.1.217:15000.

Checkboxes are set for the default "Use Automatic Performance" and "Workaround".

When I hit render, PC1 starts the render with no problems, while PC2 loses connection multiple times and then stops with an error.

On the PC1 client I can see PC2 with these two errors:

1) Render Fail: Failed to recover too many times, connection too unstable
2) Cannot access a disposed object. Object name: "System.Net.Sockets.TcpClient".

On PC2 server console window:

IP Addresses of this Server:
Host Address #1: 192.168.1.217
Port: 15000
Cleaning up old sessions..
Server Started
Received checkProtocol [58] from 192.168.1.161:62125
Received computerInfo [42] from 192.168.1.161:62125
Received prepare [60] from 192.168.1.161:62125
Downloading blender-3.0.1...
Extracting blender-3.0.1...
blender-3.0.1 ready
Received sync [95] from 192.168.1.161:62125
Received syncUpload [3149912] from 192.168.1.161:62125
Received syncComplete [83] from 192.168.1.161:62125
Received checkSync [91] from 192.168.1.161:62125
Received checkSync [91] from 192.168.1.161:62125
Received isBusy [42] from 192.168.1.161:62125
Received render [308] from 192.168.1.161:62125
TCP listening exception: Failed to parse RenderRequest due to:Decimal constructor requires an array or span of four valid decimal bytes.

I have tried multiple versions of the software, switched the computers around, and toggled the checkboxes. Nothing helps; the result is always the same. I noticed that both PCs report the same port, so on the PC1 client I changed the port in the "ServerSettings" file to 15001, but that did not help either.

During initial testing both computers had their firewalls disabled. They are connected by wired Ethernet to a switch, which is wired to our router, giving a stable 1 Gbit connection on the LAN.
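For anyone following along who wants to rule out the network before looking further, a basic reachability check can be sketched in Python. This is only an illustration, not part of BlendFarm: `can_connect` is a hypothetical helper name, and the address and port in the usage note are the ones PC2 reports above.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it only proves that the port is
    reachable, not that the BlendFarm protocol on top of it is working.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Run from PC1, `can_connect("192.168.1.217", 15000)` would be expected to return `True` while the server on PC2 is running. A `True` result with the render still failing would point away from the network and toward the application layer.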

PS: I am not trying to render anything special, just the well-known BMW car benchmark scene.

Any ideas?

Thanks.

LogicReinc commented 1 year ago

Hey @sprogiman, thanks for the detailed report, that saves a lot of time figuring out what's wrong. The exception makes it pretty clear that something goes wrong in serialization (not an unstable connection), specifically a decimal value being expected somewhere but not found. I'll see if I can test it this weekend. If I fail to reproduce it, I may ask for some more information.

So far I suspect it has nothing to do with your setup (hopefully) and should be a relatively easy fix if I can reproduce it.
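For context on the server's error message: its wording appears to match the .NET `decimal` constructor that takes four 32-bit integers (the 96-bit value in three parts, plus a flags word holding the sign and a scale of 0 to 28). The Python sketch below illustrates that layout and the kind of validation involved. It is an illustration under that assumption, not BlendFarm or .NET source, and the function names are made up for the example.

```python
from decimal import Decimal

SIGN_MASK = 0x80000000   # bit 31 of the flags word: sign
SCALE_MASK = 0x00FF0000  # bits 16-23 of the flags word: power-of-ten scale
MAX_SCALE = 28


def check_decimal_bits(bits):
    """Return None if `bits` looks like a valid four-part decimal, else a reason."""
    if len(bits) != 4:
        return f"expected 4 integers, got {len(bits)}"
    flags = bits[3] & 0xFFFFFFFF  # treat a signed 32-bit value as unsigned
    if flags & ~(SIGN_MASK | SCALE_MASK):
        return "reserved flag bits are set"
    scale = (flags & SCALE_MASK) >> 16
    if scale > MAX_SCALE:
        return f"scale {scale} exceeds {MAX_SCALE}"
    return None


def decode_decimal_bits(bits):
    """Decode [low, mid, high, flags] into a Python Decimal, or raise ValueError."""
    reason = check_decimal_bits(bits)
    if reason is not None:
        raise ValueError(reason)
    low, mid, high, flags = (b & 0xFFFFFFFF for b in bits)
    magnitude = low | (mid << 32) | (high << 64)
    scale = (flags & SCALE_MASK) >> 16
    sign = 1 if flags & SIGN_MASK else 0
    digits = tuple(int(d) for d in str(magnitude))
    # The tuple constructor is exact and ignores the context precision.
    return Decimal((sign, digits, -scale))
```

A receiver that gets fewer than four integers, or a flags word written by a build with a different message layout, would fail this kind of check before any rendering starts, which fits a serialization problem rather than a network one.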

If you do want to provide more information ahead of time, you could give me screenshots of your render settings (the settings found in the tabs in the bottom right), or type them out. Then I know for sure what configuration you use.

A possible workaround that comes to mind is trying a different render strategy. I assume SplitHorizontal is used (and it is the recommended one), but SplitChunked might resolve it temporarily, as it uses a different method for the render request.

sprogiman commented 1 year ago

Sure, here are some photos of the settings (sorry for the phone camera on PC1) and the attached client logs: one from PC1 running "SplitHorizontal", and one I just made for PC2 running "SplitChunked".

PC2 SplitChunked Client Log.txt
[PC1 SplitHorizontal client log.txt](https://github.com/LogicReinc/LogicReinc.BlendFarm/files/10238756/PC1.SplitHorizontal.client.log.txt)

Screenshots: PC2, PC2 2, PC1, PC1 2

sprogiman commented 1 year ago

I hope GitHub supports it, but here is a screen recording of PC2 in action with all default settings: https://user-images.githubusercontent.com/101104424/207914496-07108bd0-c4dc-4a2d-aa29-d2fbed3a2bf9.mp4

LogicReinc commented 1 year ago

@sprogiman I was unable to spend time on it this weekend. I will likely have more time next week, as my workload slows down during Christmas. Hopefully I can make some time for this then.

sprogiman commented 1 year ago

@LogicReinc Hey, no problem. I'm not in a rush here, especially during the holiday season. I'll wait for updates.

LogicReinc commented 1 year ago

@sprogiman I'm currently working on a new build for BlendFarm, and I was also able to reproduce your issue. It appears your server (the non-local machine) is running a version of the software that is incompatible with the host version. Perhaps this means that release was broken. Either way, after updating the server software it appears to work fine. So if you update both your server and client to the new version when it releases, it should work as expected.

Normally you should never end up in that scenario, as the server and client check whether their versions match, but maybe I forgot to increment the version.
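To illustrate why a forgotten version bump slips through such a check, here is a minimal Python sketch. It does not reflect BlendFarm's actual `checkProtocol` implementation; the function names and the integer protocol numbers are assumptions made for the example.

```python
def versions_compatible(client_protocol, server_protocol):
    """Hypothetical rule: builds may talk only if protocol numbers are equal."""
    return client_protocol == server_protocol


def handshake(client_protocol, server_protocol):
    """Refuse to continue when the two sides report different protocols."""
    if not versions_compatible(client_protocol, server_protocol):
        raise RuntimeError(
            f"Protocol mismatch: client {client_protocol}, server {server_protocol}"
        )
    return "ok"
```

If the message format changes between two builds but both still report the same number, `handshake` returns "ok" and the mismatch only surfaces later, for instance as a parse failure on the first render request.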

I'll close this issue when I release the next version. If it still occurs after that, feel free to re-open.

LogicReinc commented 1 year ago

Should be fixed after updating to 1.1.4. Otherwise, reopen.