baldurk / renderdoc

RenderDoc is a stand-alone graphics debugging tool.
https://renderdoc.org
MIT License

Event Browser - Duration seems unreliable #65

Open srousseau1980 opened 10 years ago

srousseau1980 commented 10 years ago

While taking a capture, I wanted to see the general timing of a few specific draw calls, but the timing is always 0.0 us for the batch I'm interested in profiling. It's not a display issue, since I display floating point values with up to 5 digits.

I'm profiling a fullscreen quad (only 4 vertices), but it is rasterized at 1280 x 720, so it covers enough pixels that I should see a cost higher than 0.00000 us.

The pixel shader executes 108 instructions, so it's not simple enough to explain the zero timing. Within those 108 instructions I perform 10 texture2D reads and 1 cubemap read.

It would also be nice to have more visibility into the actual bottleneck (VS fetch cost, VS shader, pixel shader, texture fetch, output merger cost, etc.).

Thanks

Sebastien

baldurk commented 10 years ago

Sadly all I can say is - yes, it is unreliable :(.

While I don't think I'm doing anything wrong with my use of timestamp queries to get the drawcall durations, the actual data I get seems to be pretty unreliable - more so at the individual drawcall level than at a scene or pass level. It's something I absolutely want to fix, but at the moment I don't really have any good ideas how, as I'm not sure what I'm doing wrong.
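For readers unfamiliar with timestamp queries, here is a rough sketch of the arithmetic involved. It is not RenderDoc's code. It assumes a D3D11-style API, where each timestamp is a tick count and a separate disjoint query reports the clock frequency and whether the clock was disturbed during the measurement. The function name `ticks_to_us` and the sample values are made up for illustration.

```python
def ticks_to_us(start_ticks, end_ticks, frequency_hz, disjoint=False):
    """Convert a pair of GPU timestamps (in ticks) to microseconds.

    Returns None when the measurement cannot be trusted: the clock was
    flagged as disjoint, the frequency is invalid, or the end timestamp
    is earlier than the start timestamp.
    """
    if disjoint or frequency_hz <= 0 or end_ticks < start_ticks:
        return None
    return (end_ticks - start_ticks) * 1_000_000 / frequency_hz


# Hypothetical values: a 25 MHz timestamp clock and two tick readings.
example_us = ticks_to_us(1_000, 3_500, 25_000_000)
rejected = ticks_to_us(1_000, 3_500, 25_000_000, disjoint=True)
print(example_us, rejected)
```

In this arithmetic, a duration of exactly zero means both timestamps resolved to the same tick.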

Once the basic timing is accurate, I plan to integrate the different IHV hardware counter libraries to get a breakdown on what's actually slow like you say.

srousseau1980 commented 10 years ago

If you ever get an ETA for both steps (global timing + per-pipeline-stage timing), let me know. This tool is awesome anyway :)


baldurk commented 10 years ago

Getting reliable drawcall timings is definitely high on my priorities, but I'm kind of blocked right now by not really having a good idea of what to do next!

The profiling breakdown is lower on the priority list, as I'm focused on getting the debugger behaviour up to scratch first (it's still lacking in many ways) before looking at in-depth profiling functionality. Of course, patches are always welcome :).

weaseltron-ahirst commented 9 years ago

Really like the tool. Thanks!

The issue for me is that re-running the timing on the same grab gives inconsistent results. Also, allowing the user to re-time the grab without reopening it would be great.

I understand other tools run each command multiple times and take an average, removing samples that seem excessively large (or small).
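A minimal sketch of that averaging idea, in Python for brevity. This is not how RenderDoc or any particular tool does it. The function name `robust_duration_us`, the median-based rejection rule, and the sample timings are all assumptions chosen for illustration.

```python
from statistics import mean, median


def robust_duration_us(samples, max_deviation=0.5):
    """Average repeated timings of one draw call after dropping outliers.

    A sample is kept only if it lies within max_deviation (as a fraction
    of the median) of the median of all samples.
    """
    if not samples:
        raise ValueError("need at least one sample")
    mid = median(samples)
    kept = [s for s in samples if abs(s - mid) <= max_deviation * mid]
    # If every sample was rejected, fall back to the median itself.
    return mean(kept) if kept else mid


# Hypothetical timings (microseconds) for one draw call over six replays,
# including one suspicious zero and one excessively large sample.
timings_us = [41.8, 42.1, 0.0, 42.4, 97.3, 41.9]
print(f"{robust_duration_us(timings_us):.2f} us")
```

The median is used as the reference point rather than the mean so that a single extreme sample does not drag the acceptance window along with it.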

The GPU pipelines/queues multiple commands at the same time such that they overlap - so profiling a batch of calls will give a time closer to that in-game. Doing a start/stop timing for multiple draw calls one after the other (which isn't a bad idea) will take longer if it kills the pipelining.
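To put numbers on that difference, here is a deliberately simplified toy model, not a description of any real GPU. It assumes every draw pays a fixed pipeline fill/drain latency when timed in isolation, while back-to-back draws overlap so the batch pays that latency only once. The function names and figures are hypothetical.

```python
def isolated_total_us(work_us, latency_us):
    """Sum of end-to-end timings: each draw pays the full latency."""
    return sum(work + latency_us for work in work_us)


def pipelined_total_us(work_us, latency_us):
    """Batch timing: draws overlap, so the latency is paid only once."""
    return sum(work_us) + latency_us if work_us else 0.0


# Hypothetical per-draw work (microseconds) and pipeline latency.
draws_us = [10.0, 20.0, 5.0]
latency = 8.0

print("isolated :", isolated_total_us(draws_us, latency))
print("pipelined:", pipelined_total_us(draws_us, latency))
```

Under these assumptions the per-draw timings add up to more than the batch actually costs in-game, which is why the two kinds of measurement answer different questions.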

There's also the performance impact of caching to think about, which can only really be measured by profiling the whole frame.

I like this idea: https://github.com/baldurk/renderdoc/issues/30 in conjunction with the current per-draw call timings.

baldurk commented 9 years ago

Glad you like the tool, thanks :smile:

You can hack this locally if you want to experiment: if you look at Core.cs line 594 you can see the early-out that prevents re-timing. I did experiment with this (I think there are still some remnants in the code, set up to loop internally, that just loop once), but I found that you'd get 'bubbles' in timing that wouldn't shift even over multiple tries, or you'd get the times slowly shifting upwards in general.

I really do want to improve the profiling support - I was of two minds about whether to even include the current timing since it's so unreliable, but I think it's probably better than not including any timing functionality. I want to do it properly though rather than having some half-hearted hacks, and it's a big task. I won't be able to look at it until some time next year; currently it's higher priority to get API support improved, like OpenGL.

Doing it properly, though, would include the things you mention, like timing pipelined vs. end-to-end times for drawcalls (as those are different values and tell you different things).

weaseltron-ahirst commented 9 years ago

Cheers for the reply.

Changing that line does the job. I'm not seeing the times slowly shift upwards or anything I'd particularly describe as 'bubbles' but I do usually see that the first timing run is about half the time of subsequent ones.

Pipelined draw timings are particularly applicable for perf event groups (i.e. for https://github.com/baldurk/renderdoc/issues/30), and 'end-to-end' timing for individual calls.

I see how API support is higher priority than timings on current APIs. If I had the time ... thanks for yours.

baldurk commented 9 years ago

That may be what I saw; my memory is a bit hazy, as it was some time last year that I experimented with this. I'll add an option in the settings window to override that behaviour - I only put it in to stop accidental clicks, since it didn't seem like re-timing was helping at all.

MWFIAE commented 3 days ago

@baldurk Since the duration seems unreliable in terms of absolute numbers, but reliable in terms of relation to each other, maybe it makes sense to change it to percentages (or alternatively to permillage/permyriad) instead?

Still not perfect of course, but maybe it helps some. (Best of course if it is a setting, so everybody gets to use what they like best, or just display both :D )
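A quick sketch of what I mean, in Python just to show the arithmetic. The function name `relative_durations`, the event names, and the numbers are made up. The `scale` parameter switches between percent (100), permille (1000), and permyriad (10000).

```python
def relative_durations(durations_us, scale=100.0):
    """Express each event's duration as a share of the total.

    durations_us maps an event name to its measured duration in
    microseconds. With a total of zero, every share is reported as 0.0.
    """
    total = sum(durations_us.values())
    if total == 0:
        return {name: 0.0 for name in durations_us}
    return {name: scale * d / total for name, d in durations_us.items()}


# Hypothetical per-event durations (microseconds) from one capture.
frame_us = {"shadow pass": 250.0, "fullscreen quad": 500.0, "post": 250.0}

percent = relative_durations(frame_us)
permille = relative_durations(frame_us, scale=1000.0)
for name, share in percent.items():
    print(f"{name}: {share:.1f} %")
```

The shares only stay meaningful if the relative ordering of the timings is stable between runs, which is the assumption behind this suggestion.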