GPUOpen-LibrariesAndSDKs / RadeonProRender-Baikal

MIT License

Be able to share compiled OpenCL binary caches among identical GPUs #187

Open harokyang opened 6 years ago

harokyang commented 6 years ago

This change solves the following issues:

  1. When rendering with multiple GPUs/CPUs, the inputmap.cl files generated by the worker threads differ from each other. Each worker thread uses its own instance of default_material in ClwSceneController, and each instance has its own SceneObject id. A different SceneObject id means a different inputmap id, which leads to a different inputmap.cl.

Changing default_material from a member variable to a static variable solves this issue.
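To illustrate the id mismatch described above, here is a minimal sketch. It assumes ids come from a process-wide counter that is incremented on construction; the class names, the `InputMapName` helper, and the `"inputmap_"` naming are hypothetical stand-ins, not Baikal's actual code. The shared variant uses a function-local static for simplicity, which is not necessarily how the PR declares it.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for a scene object: every instance takes the next
// value of a process-wide counter as its id.
class SceneObject {
public:
    SceneObject() : id_(next_id_++) {}
    std::uint32_t id() const { return id_; }

private:
    static inline std::uint32_t next_id_ = 0;  // C++17 inline static member
    std::uint32_t id_;
};

// Before: each controller owns its own default material, so each controller
// sees a different id and derives a different input map name.
struct PerInstanceController {
    SceneObject default_material;

    std::string InputMapName() const {
        return "inputmap_" + std::to_string(default_material.id());
    }
};

// After: a single default material is shared by all controllers, so every
// controller derives the same input map name.
struct SharedController {
    static const SceneObject& DefaultMaterial() {
        static SceneObject material;  // constructed once, on first use
        return material;
    }

    std::string InputMapName() const {
        return "inputmap_" + std::to_string(DefaultMaterial().id());
    }
};
```

Two `PerInstanceController` objects produce different names, while any number of `SharedController` objects produce the same one, which is what allows identical devices to reuse a single cached binary.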

  2. When rendering with multiple identical GPUs, they should share the same kernel binary caches (once the first issue is solved). But if no cache exists yet, all the GPUs will try to compile and generate the very same binary cache at the same time. This may cause a huge spike in memory consumption and a lot of wasted work.

I added a lock mechanism that allows only one worker to compile and generate the binary cache, while the others simply wait for it. Workers with a different GPU/CPU model are not affected and can still generate their own version of the kernel binaries.
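A rough sketch of such a lock, assuming an in-memory cache keyed by a device/model string, could look like the following. `KernelCache`, `GetOrCompile`, and the key format are hypothetical names for illustration; the actual change works with Baikal's on-disk binary cache and may be structured differently.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

using Binary = std::vector<unsigned char>;

class KernelCache {
public:
    // Returns the binary for `key`, running `compile` at most once per key.
    // Workers asking for the same key block until the first one finishes;
    // workers asking for a different key use a different lock and proceed.
    Binary GetOrCompile(const std::string& key,
                        const std::function<Binary()>& compile) {
        std::shared_ptr<Entry> entry;
        {
            // Short critical section: only find or create the per-key entry.
            std::lock_guard<std::mutex> guard(map_mutex_);
            auto& slot = entries_[key];
            if (!slot) slot = std::make_shared<Entry>();
            entry = slot;
        }

        // Per-key lock: held for the duration of the compilation.
        std::lock_guard<std::mutex> guard(entry->mutex);
        if (!entry->ready) {
            entry->binary = compile();
            entry->ready = true;
        }
        return entry->binary;
    }

private:
    struct Entry {
        std::mutex mutex;
        bool ready = false;
        Binary binary;
    };

    std::mutex map_mutex_;
    std::map<std::string, std::shared_ptr<Entry>> entries_;
};
```

The global mutex is held only while looking up the entry, so a slow compilation for one device model does not block workers that target another model. If `compile` throws, `ready` stays false and the next waiting worker retries.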

harokyang commented 6 years ago

Can somebody give me a hint about the Jenkins build report? I have no access to it.

AvKhokhlov commented 6 years ago

Hello, according to the build logs, the test 'AovTest.Aov_ObjectId' has failed. You should be able to reproduce this issue on Ubuntu/Windows machines; it looks like a permanent failure on all of them. Try running the Baikal unit tests.

harokyang commented 6 years ago

Thanks, I will look into it

AlexanderVeselov commented 6 years ago

The AovTest.Aov_ObjectId test has failed because ClwSceneController::m_default_material has become a static field. As a result, the construction order of scene objects has changed. IDs are assigned to objects sequentially as they are constructed, so the test objects now have different IDs and therefore a different AOV color.

harokyang commented 6 years ago

Yes, it seems the only way to solve the issue is to update the unit test reference image. Also, I made another change: m_default_material is now part of the scene properties. A material should stay with the data object instead of the controller. The data required to construct an inputmap should come only from the scene, not the controller.

With m_default_material inside the scene object, static is no longer required. All controllers can generate the same inputmap this way.
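As a sketch of this design, assuming the input map name depends only on the material id: the scene owns the default material and each controller reads it from the scene it is given. `Material`, `Scene`, `Controller`, and `InputMapName` here are simplified, hypothetical types, not the real Baikal classes.

```cpp
#include <cstdint>
#include <string>

// Hypothetical, simplified material: only the id matters for this sketch.
struct Material {
    std::uint32_t id;
};

// The scene owns the default material, so it is plain scene data.
struct Scene {
    explicit Scene(std::uint32_t default_material_id)
        : default_material{default_material_id} {}

    Material default_material;
};

// Controllers hold no material of their own; everything needed to name the
// input map is read from the scene they are asked to compile.
class Controller {
public:
    std::string InputMapName(const Scene& scene) const {
        return "inputmap_" + std::to_string(scene.default_material.id);
    }
};
```

Because no controller state enters the result, any number of controllers compiling the same scene derive the same input map, and no static member is needed.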