Hi Timur,
We are currently trying to make our farm more efficient by running two or more tasks per node in parallel. The main issue we are facing right now is how Afanasy checks whether the render node has enough RAM left to run a task. In our case the memory needed by render jobs always increases over time, so checking only at the start of a new task is not safe. We have implemented an out-of-memory check in the parser that kills tasks once they exceed their memory limit (and if there is no more RAM left on the system).
I would suggest checking whether the total system memory minus the memory required by all tasks already running on the render node is bigger than the memory needed by the next task that should be assigned. Does this make sense?
I think the easiest way would be to change this line https://github.com/CGRU/cgru/blob/69eb55beaaedaf35996face23bb09373c88f5181/afanasy/src/server/block.cpp#L230 to something like this pseudo code:
    totalMemNeededForRunningTasks = 0
    for task in render->allTasks:
        totalMemNeededForRunningTasks += task->getNeedMemory()
    if (m_data->getNeedMemory() > render->getHostRes().mem_free_mb - totalMemNeededForRunningTasks)
Can you turn that pseudo code into real C++ for me and post it here? I could then test it at RISE and see how well it works.
This would turn the neededMemory property for blocks into a "max. memory", which of course needs to be monitored by afrender (or by the parser, as we already do at RISE).
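To make the intended logic concrete, here is a minimal standalone sketch of the check. It is not Afanasy code: `RunningTask`, `hasMemoryForTask` and the `availableMb` parameter are hypothetical names made up for illustration, and how the server actually exposes the tasks running on a render is something I am only assuming.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a task that is already running on a render node.
struct RunningTask {
    std::int64_t needMemoryMb; // the block's "need memory", read here as a per-task maximum
};

// Returns true if a task that needs nextTaskNeedMb still fits on the node.
// availableMb is whatever the node can offer: total system memory in the
// description above, or mem_free_mb as in the pseudo code.
bool hasMemoryForTask(std::int64_t availableMb,
                      const std::vector<RunningTask>& running,
                      std::int64_t nextTaskNeedMb)
{
    // Sum up what the running tasks are allowed to grow to.
    std::int64_t reservedMb = 0;
    for (const RunningTask& task : running)
        reservedMb += task.needMemoryMb;

    // Same comparison as the pseudo code: the task is refused when its need
    // is greater than what is left after all reservations.
    return nextTaskNeedMb <= availableMb - reservedMb;
}
```

With these assumptions, a node offering 32000 MB that already runs tasks declared at 8000 MB and 12000 MB would accept a next task of up to 12000 MB and refuse anything larger. Passing mem_free_mb, as in the pseudo code, is the more conservative choice, because memory the running tasks already hold is then subtracted a second time through their reservations. Passing the total system memory matches the description in words.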