Closed darksylinc closed 5 months ago
@iche033 Quick question. Camera::Update
says:
/// \brief Renders a new frame.
/// This is a convenience function for single-camera scenes. It wraps the
/// pre-render, render, and post-render into a single
/// function. This should be used in applications with multiple cameras
/// or multiple consumers of a single camera's images.
public: virtual void Update() = 0;
I believe it should say "This shouldn't be used in applications with multiple cameras" ?
Because this function should be used for apps with a single camera, or multiple consumers of a single camera images. But it shouldn't be used in applications with multiple cameras.
Figured out a more user friendly solution to efficient loops of cameras i.e. the while( cameras.empty() )
thing:
All Cameras should call FlushSomething() (name pending) inside Camera::Render
for every camera. This happens inside Gazebo so the user doesn't have to change its update loops.
However the Flush function will only flush if it gets called every N times.
The user would still have to call Scene::PostRenderGpuFlush
at the end of everything, but aside from that, the user can call:
Scene::SetNumCameraPassesPerFlush( uint value ) where value must be >= 0
A value of 0 means we flush every camera (old behavior) and no need to call Scene::PostRenderGpuFlush
at the end of everything.
A value > 1 is new behavior, and user must call Scene::PostRenderGpuFlush
at the end of everything.
With these changes, a user updating like this:
scene->PreRender();
for every camera
camera->Render();
for every camera
camera->PostRender();
scene->PostRenderGpuFlush();
should achieve optimum performance
I believe it should say "This shouldn't be used in applications with multiple cameras" ?
yes :)
Figured out a more user friendly solution to efficient loops of cameras i.e. the while( cameras.empty() ) thing:
sounds good! I'm assuming the idea is to keep the value at 0 as default behavior and let advanced users call SetNumCameraPassesPerFlush
with value > 0?
sounds good! I'm assuming the idea is to keep the value at 0 as default behavior and let advanced users call SetNumCameraPassesPerFlush with value > 0?
The idea is that default is 0 yes. Though also my idea is that most samples show this value not being set to 0 to show to how to use, and new users (rather than just advanced once) set this to non-0; since it's not really that advanced.
This value being 0 is to ease porting of existing users, where failing to call PostRenderGpuFlush results in leaks or other weird behavior (and the complexity of existing codebases is unknown but presumably large).
With my recent changes in my branch, GpuRays/GpuRaysTest.RaysParticles
test started failing so I took a deeper look to fix it.
Turns out this change will fix a subtle bug caused by Ogre2GpuRays; therefore this fix is no longer merely a "performance improvement" and now it is a bugfix too (only present when using ogre2):
Technically the problem is global e.g. rendering to a two regular cameras and then to all 6 faces of a Ogre2GpuRays means the particles can be in up to 8 different time-states instead of 1.
I have a question for you @iche033 about the default value of SetNumCameraPassesPerGpuFlush
I made a couple changes to the proposed ones, which is less intrusive than what was originally proposed.
The current changes are like the following:
Scene::PostRender
function.Scene::PreRender
is called twice in a row without calling Scene::PostRender
, or PostRender gets called twice in a row without PreRender, Ignition will assert, which immediately notifies the user there's something wrong that needs their attention.Given that we can warn the user what's wrong (i.e. they didn't update their code to call Scene::PostRender), it may be better to set a default value for SetNumCameraPassesPerGpuFlush somewhere between [1; 6] rather than defaulting to legacy value of 0
What do you think?
The changes sound good to me. We should probably target these changes to main
since it's a behavior / API usage change, and also update the Migration guide.
If Scene::PreRender is called twice in a row without calling Scene::PostRender, or PostRender gets called twice in a row without PreRender, Ignition will assert,
I am thinking of a way to ease transition for users. Instead of an assert, we could print out an error msg like you're already doing, and in addition, we either call PostRender
for them (if PreRender is called twice) or ignore the duplicate PostRender
call (if PostRender is called twice)? For the Camera::Update
function, we could just call PostRender
for them - I think you're already doing that in your branch.
Given that we can warn the user what's wrong (i.e. they didn't update their code to call Scene::PostRender), it may be better to set a default value for SetNumCameraPassesPerGpuFlush somewhere between [1; 6] rather than defaulting to legacy value of 0
yeah I think that's fine. How about a default value to 1? For users wanting to use more cameras, they'll then need to change this value. We'll likely need a small tutorial teaching users how to use these APIs later
Some other minor comments on APIs:
SetNumCameraPassesPerGpuFlush
-> SetCameraPassCountPerGpuFlush
. To be consistent with Ignition naming convention, we use Count
in place of Num
, e.g. hereGetLegacyAutoGpuFlush
-> LegacyAutoGpuFlush
. Ignition prefers leaving out the Get
prefix in function nameI am thinking of a way to ease transition for users. Instead of an assert, we could print out an error msg like you're already doing, and in addition, we either call
PostRender
for them (if PreRender is called twice) or ignore the duplicatePostRender
call (if PostRender is called twice)? For theCamera::Update
function, we could just callPostRender
for them - I think you're already doing that in your branch.
We can do that, but issuing some Ogre commands before PostRender (e.g. creating/destroying a light between PreRender/PostRender is a no-go*) might cause trouble, hence the message must put emphasis in the importance of doing it.
yeah I think that's fine. How about a default value to 1? For users wanting to use more cameras, they'll then need to change this value. We'll likely need a small tutorial teaching users how to use these APIs later
I should clarify this better in the docs: If you set a value of e.g. 6, but you have a single camera and call PostRender, this will force a flush and hence having a value of 1 or 6 won't matter as the result will be exactly the same (in every term: performance, memory consumption).
A value of 6 is like an upper bound. We may queue up to 6 render passes or less; but never more.
But that does not mean a value of 6 unnecessarily wastes 6x more RAM than a value of 1 if you only have one regular camera.
Some other minor comments on APIs:
* `SetNumCameraPassesPerGpuFlush` -> `SetCameraPassCountPerGpuFlush`. To be consistent with Ignition naming convention, we use `Count` in place of `Num`, e.g. [here](https://github.com/ignitionrobotics/ign-rendering/blob/ign-rendering5/include/ignition/rendering/Scene.hh#L188) * `GetLegacyAutoGpuFlush` -> `LegacyAutoGpuFlush`. Ignition prefers leaving out the `Get` prefix in function name
Got it.
I should clarify this better in the docs: If you set a value of e.g. 6, but you have a single camera and call PostRender, this will force a flush and hence having a value of 1 or 6 won't matter as the result will be exactly the same (in every term: performance, memory consumption).
I see, thanks for the clarification, that makes sense.
OK!
So I did extensive testing(*) and it looks stable and working. Which surprised me.
It looks ready or almost ready to start merging.
But I do have a couple of questions on the merge:
Thus probably gazebo_scene_viewer demo is the only one I should be able to get working to see if it works, and then the code should be merged to main (repos to be merged are ign-rendering, -gazebo, -sensors)
(*)which of course means another user will encounter an obvious-unmissable problem the second they try it out
yes you can target main
branch and then we don't need to worry about ABI changes. Let me merge ign-rendering5
forward to main
first. Then you should be able to just merge your branch with main
and create the PR. That way, there won't be able unrelated changes in your PR. For other repos, I can also do the same but it can take some time depending on the changes that need to be merged forward. Alternatively, you can rebase your commits on top of main
and the create the PR so you don't have to wait for the merge forward.
ok lets give SetCameraPassCountPerGpuFlush = 6 a try. i. yes sounds about right. That makes me think that once the changes are merged in, we should probably have a small sample demo showing this new API usage. ii. and iii. Ok I'll take a look at the code when the PR is up, and test it with ign-gazebo.
for the samples:
i. yes, ogre2 doesn't work yet
ii. there is an issue with the gazebo_scene_viewer
sample on ign-rendering5: https://github.com/ignitionrobotics/ign-rendering/issues/290. I think that needs to be fixed first before we can test your changes. I would hold off on it for now. Maybe comment on the ticket and add a reminder to port the changes to the sample once its fixed.
iii. yes, ogre2 does not work yet
iv. yes, ogre2 does not work yet
Great!
I probably will have then to merge with SetCameraPassCountPerGpuFlush = 0 as default at least until ign-gazebo and ign-sensors get updated so that these modules are PostFrame-aware
Let me merge ign-rendering5 forward to main first
@iche033 is this fixed?
yes we can consider this as complete.
Current behavior
Currently Gazebo is doing, for every camera (of any type):
This is inefficient. Because:
Desired behavior
The code can be replaced with the following:
While
Ogre::SceneManager::updateSceneGraph
gets moved toScene::PreRender
and a GPU flush is moved toScene::PostRenderGpuFlush
(PostRenderGpuFlush currently doesn't exist)Exceptions
Scene::PostRenderGpuFlush
Flushes all buffered frame data and starts 'a new frame'.CPUs queue up a bunch of rendering commands and then waits for the GPU to finish up. If we flush too often, the CPU will often have to wait for the GPU doing nothing. If we flush too infrequently, RAM consumption will rise due to queueing up unsubmitted work.
This function should be called after updating all cameras / once per frame, but it may be called more often if you're updating too many cameras and running out of memory.
Example:
108 is way too much, causing out of memory situations; so flushing once per probe may make better sense in this case.
Note however that Gazebo is currently flushing once per cubemap face (i.e. flushing 6x3 = 18 times). While with my proposal (once per cubemap) we would flush 3 times.
Reason
This is merely a performance improvement.
This paves the way for an additional improvement. For example currently Depth Sensors are doing the following:
This negates our optimization proposed in this ticket because PostRender will force the CPU to wait for the GPU. The render loop can be improved to the following:
This allows the CPU to do more work while the GPU is rendering the previous cameras in parallel
N is an arbitrary number, where it depends on available memory and scene type (large values of N queue up too much, low values cause too much stall).
Impact
This is an ABI and API breaking change.
Other components need to change so they also call PostRenderGpuFlush (usually looking for calls to Scene::PreRender is a good start).
e.g. ign-sensors needs to add a call to PostRenderGpuFlush
Users must call PostRenderGpuFlush at some point or else errors or out of memory conditions may occur.
Because of this, a boolean will be added for migrating users. This boolean will control whether the old behavior or the new behavior is used.
In the old behavior, Gazebo will issue one updateSceneGraph and one PostRenderGpuFlush per Camera (default) which is what Gazebo is currently doing. In the new behavior, Gazebo assumes the user properly calls PreRender and PostRenderGpuFlush when appropriate Samples need to be updated to reflect this change
Risks
Implementation suggestion
I'm the one implementing this feature. This ticket is for tracking progress.
Progress
Progress can be tracked at:
Blocking problems:
With cameraPassCountPerGpuFlush = 6
With cameraPassCountPerGpuFlush = 0