chili-epfl / chilitags

Robust Fiducial Markers for Augmented Reality And Robotics
http://chili.epfl.ch/software

Simple cross-platform profiling #59

Closed ayberkozgur closed 9 years ago

ayberkozgur commented 9 years ago

I think this is something that would make our lives much easier when trying to find bottlenecks. It works on desktop and Android. Use it like this:

BEGIN_PROFILING
some_function()
END_PROFILING("some_function()")

It prints how long some_function() took in a readable format. You can put any amount of code between the macros, and you can use the pair multiple times one after the other.
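For readers who want a concrete picture, here is a minimal sketch of how such a macro pair could be written with std::chrono. Only the macro names come from the post; the bodies, the output format, and the helper functions some_function() and profile_twice() are assumptions for illustration, not the code from this pull request.

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical macro bodies: BEGIN_PROFILING opens a scope and records a
// start time; END_PROFILING prints the elapsed time and closes the scope.
#define BEGIN_PROFILING \
    { \
        const auto profiling_start_ = std::chrono::steady_clock::now();

#define END_PROFILING(label) \
        const std::chrono::duration<double, std::milli> profiling_elapsed_ = \
            std::chrono::steady_clock::now() - profiling_start_; \
        std::printf("%s took %.3f ms\n", (label), profiling_elapsed_.count()); \
    }

// Hypothetical workload standing in for the code being measured.
long long some_function() {
    long long sum = 0;
    for (long long i = 0; i < 1000000; ++i) sum += i;
    return sum;
}

// Uses the macro pair twice in a row; the private scope opened by each
// BEGIN_PROFILING keeps the timer variables from clashing.
long long profile_twice() {
    long long result = 0;

    BEGIN_PROFILING
    result = some_function();
    END_PROFILING("some_function()")

    BEGIN_PROFILING
    result += some_function();
    END_PROFILING("some_function() again")

    return result;
}
```

One consequence of this design is that variables declared between the two macros are not visible after END_PROFILING, so results that are needed later have to be declared before BEGIN_PROFILING.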

ayberkozgur commented 9 years ago

I'm thinking of building a bit more on top of this (like an option for averaging over time for periodic stuff, and maybe nesting) and releasing it as a standalone lib. What do you think?

severin-lemaignan commented 9 years ago

I think there are probably thousands of clever macros for this kind of thing (aren't there?), so a separate library may be a bit overkill :-)

ayberkozgur commented 9 years ago

For Android there's only this thing: https://code.google.com/p/android-ndk-profiler/ and it doesn't work in multilib scenarios like ours.

The thing is that people always want full profiling capabilities: enable profiling and view the cache dump afterwards. That scenario offers no real-time profiling, requires debug symbols, and slows down the original code. On top of that, valgrind doesn't fully work on Android.

I also thought there should be macro-based simple profiling tools out there but I was unable to find any.

ayberkozgur commented 9 years ago

Well, there's this: https://github.com/CedricGuillemet/libProfiler and it has multithread support but I think I can do a bit better and think about multithreadedness later.

qbonnard commented 9 years ago

I thought I read somewhere that profiling this way was wrong somehow... I was expecting our favourite troll to come out of his cavern for this reason ;) How about a global CMake variable to disable all profiling?
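On the C++ side, a global switch like this is usually a preprocessor definition that the build system sets. The sketch below assumes a hypothetical definition named ENABLE_PROFILING, which a CMake option could add through add_definitions or target_compile_definitions. The macro bodies and the sum_to() helper are illustrative assumptions, not the code from this pull request.

```cpp
#include <chrono>
#include <cstdio>

// ENABLE_PROFILING is a hypothetical name; the build system would define it
// (for example with -DENABLE_PROFILING) when profiling is wanted.
#ifdef ENABLE_PROFILING
    #define BEGIN_PROFILING \
        { \
            const auto profiling_start_ = std::chrono::steady_clock::now();
    #define END_PROFILING(label) \
            const std::chrono::duration<double, std::milli> profiling_elapsed_ = \
                std::chrono::steady_clock::now() - profiling_start_; \
            std::printf("%s took %.3f ms\n", (label), profiling_elapsed_.count()); \
        }
#else
    // Disabled: the macros reduce to a plain scope, so the profiled code
    // compiles identically but no timing or printing code is emitted.
    #define BEGIN_PROFILING {
    #define END_PROFILING(label) }
#endif

// Hypothetical profiled function; it behaves the same in both modes.
long long sum_to(long long n) {
    long long sum = 0;
    BEGIN_PROFILING
    for (long long i = 1; i <= n; ++i) sum += i;
    END_PROFILING("sum_to loop")
    return sum;
}
```

Keeping the braces in the disabled branch means the scoping rules do not change when profiling is switched off, so code that compiles in one mode also compiles in the other.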

ayberkozgur commented 9 years ago

@qbonnard You're right, this doesn't give 100% accurate results. Off the top of my head, the following reasons would cause this inaccuracy:

  1. Overhead of actually calling clock() and other stuff that's not part of the profiled code: We probably wouldn't care about this in most scenarios. But it's true that this might disturb the caches and affect program speed.
  2. Thread switching between clock() calls: This has the potential to really disturb the measurements.
  3. Inaccuracy of clock() itself: According to http://www.guyrutenberg.com/2007/09/10/resolution-problems-in-clock/ clock() is really not safe to use. Though I was able to measure even 0.01 ms with clock(), I'll take his advice and switch to clock_gettime(). This would also have the potential to solve (2) with CLOCK_THREAD_CPUTIME_ID. However, http://stackoverflow.com/questions/11210063/clock-gettime-can-not-update-instantly states that it might cause problems on Android. I'll test tomorrow on the various devices we have, as well as on emulators.
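As a rough illustration of point (3), the following sketch reads per-thread CPU time with the POSIX call clock_gettime() and CLOCK_THREAD_CPUTIME_ID. The helper names thread_cpu_time_ms() and measure_busy_loop_ms() are hypothetical, and the sketch assumes a POSIX platform; whether it behaves well on a given Android device is exactly the open question raised above.

```cpp
#include <time.h>  // POSIX clock_gettime, timespec, CLOCK_THREAD_CPUTIME_ID

// Returns the CPU time consumed so far by the calling thread, in
// milliseconds, or -1.0 if the clock cannot be read.
double thread_cpu_time_ms() {
    timespec ts;
    if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0) {
        return -1.0;
    }
    return ts.tv_sec * 1e3 + ts.tv_nsec / 1e6;
}

// Hypothetical measurement: times a busy loop using thread CPU time, so
// time spent while the thread is descheduled is not counted.
double measure_busy_loop_ms(long long iterations) {
    const double start = thread_cpu_time_ms();
    volatile double sink = 0.0;  // volatile keeps the loop from being optimized away
    for (long long i = 0; i < iterations; ++i) {
        sink = sink + i * 0.5;
    }
    const double end = thread_cpu_time_ms();
    return end - start;
}
```

Because this clock only advances while the calling thread is running, it addresses the thread-switching concern in (2), but it also hides time spent blocked on I/O or locks, which a wall-clock measurement would include.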

But in the end, the way I see it, profiling boils down to using one of two methods: either you hook each and every instruction and count, or you trust CPU timers (or other hardware mechanisms). I think in many scenarios we wouldn't care about blocks of code that take less than about 0.1 ms, so the second method would be satisfactory as long as it gives us that much accuracy.
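One quick sanity check for the 0.1 ms target is to ask the system what resolution it reports for a clock, using the POSIX call clock_getres(). The helper names below are hypothetical, and the reported resolution is only a lower bound on real-world accuracy, so this is a first filter rather than proof that the timer is good enough.

```cpp
#include <time.h>  // POSIX clock_getres, timespec, clockid_t

// Returns the resolution reported for the given clock, in milliseconds,
// or -1.0 if the query fails.
double clock_resolution_ms(clockid_t clock_id) {
    timespec res;
    if (clock_getres(clock_id, &res) != 0) {
        return -1.0;
    }
    return res.tv_sec * 1e3 + res.tv_nsec / 1e6;
}

// True if the clock claims a resolution of 0.1 ms or finer, the threshold
// mentioned in the comment above.
bool resolves_tenth_of_ms(clockid_t clock_id) {
    const double res = clock_resolution_ms(clock_id);
    return res >= 0.0 && res <= 0.1;
}
```

A call such as resolves_tenth_of_ms(CLOCK_MONOTONIC) could be run once at startup on each target device before trusting the measurements.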

ayberkozgur commented 9 years ago

I implemented the said "profiler"; it can be found at https://github.com/chili-epfl/easy-performance-analyzer. On @severin-lemaignan's advice, I changed the name from profiler to performance analyzer.