Measure: function call overhead

arvindm95 / unladen-swallow

Automatically exported from code.google.com/p/unladen-swallow

Measure: function call overhead #46

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago

We need to have more accurate measurements for the total time taken from
when we start a CALL_FUNCTION opcode to the time when the body of the
function starts executing. This should work regardless of whether we're
dispatching to an interpreted function or an LLVM function; C functions can
be fudged a bit (right before calling the function pointer?)

- Data should be stored in a vector and stats printed out at Python-shutdown.
- This should include whether the execution is in machine code or the
interpreter.
- Should be a special build (controlled by #ifdef's).
- Use TSCs?
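
A rough sketch of how the list above could fit together, not the code that was eventually committed. Every name here (`CALL_TIMING`, `CallSample`, `record_call`, `print_call_stats`, `now_ns`) is hypothetical. `std::chrono::steady_clock` stands in for a TSC read so the sketch stays portable, and `std::atexit` stands in for whatever hook runs at Python shutdown.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Hypothetical build flag. A real special build would pass -DCALL_TIMING=1;
// it defaults to on here only so the sketch compiles standalone.
#ifndef CALL_TIMING
#define CALL_TIMING 1
#endif

#if CALL_TIMING

struct CallSample {
    std::uint64_t start_ns;  // taken when the CALL_FUNCTION opcode starts
    std::uint64_t end_ns;    // taken when the callee's body starts executing
    bool machine_code;       // true for an LLVM function, false for the interpreter
};

// Heap-allocated and never freed, so it is still valid inside the exit handler.
inline std::vector<CallSample>& call_samples() {
    static std::vector<CallSample>* samples = new std::vector<CallSample>();
    return *samples;
}

// Portable stand-in for a TSC read (assumption: nanosecond resolution is enough).
inline std::uint64_t now_ns() {
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count());
}

// Prints the call count and mean overhead per execution mode.
inline void print_call_stats() {
    std::uint64_t total[2] = {0, 0};
    std::size_t count[2] = {0, 0};
    for (const CallSample& s : call_samples()) {
        const int mode = s.machine_code ? 1 : 0;
        total[mode] += s.end_ns - s.start_ns;
        ++count[mode];
    }
    const char* names[2] = {"interpreter", "machine code"};
    for (int mode = 0; mode < 2; ++mode) {
        if (count[mode] == 0) continue;  // avoid dividing by zero
        std::fprintf(stderr, "%s: %zu calls, mean %.1f ns\n", names[mode],
                     count[mode], static_cast<double>(total[mode]) / count[mode]);
    }
}

// Stores one sample; registers the shutdown report on first use.
inline void record_call(std::uint64_t start_ns, std::uint64_t end_ns,
                        bool machine_code) {
    static bool registered = false;
    if (!registered) {
        std::atexit(print_call_stats);
        registered = true;
    }
    call_samples().push_back(CallSample{start_ns, end_ns, machine_code});
}

#else

// In a normal build the hook compiles away to nothing.
inline void record_call(std::uint64_t, std::uint64_t, bool) {}

#endif
```

In this sketch the eval loop would call `now_ns()` on entering CALL_FUNCTION, the callee would call it again as its body starts, and the pair would be passed to `record_call`. For C functions the second reading could be taken right before calling the function pointer.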

Original issue reported on code.google.com by collinw on 30 May 2009 at 1:46

GoogleCodeExporter commented 8 years ago

This should make a decent starter project for you, Reid.

Original comment by collinw on 1 Jun 2009 at 9:40

GoogleCodeExporter commented 8 years ago

Are we sure we wouldn't rather use clock_gettime?  Wikipedia has some pretty bad
things to say about the time stamp counter:
http://en.wikipedia.org/wiki/Time_Stamp_Counter

Some things that impact TSC:
- On AMD processors and earlier Intel processors, the TSC rate depends on the CPU frequency, which can change dynamically in order to save power.  However, you can configure your computer to run at a fixed frequency.
- The TSCs on different cores are not synchronized, so if the process migrates between cores, the readings may not be monotonic.  The only way around this is to "lock your process to a single core".
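
For comparison, a minimal sketch of the clock_gettime alternative, assuming a POSIX system that provides `CLOCK_MONOTONIC`. The helper name `monotonic_ns` is hypothetical, and returning 0 on failure is a simplification for illustration.

```cpp
#include <cstdint>
#include <time.h>

// Reads the monotonic clock and returns nanoseconds, or 0 if the call fails.
inline std::uint64_t monotonic_ns() {
    timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) {
        return 0;
    }
    return static_cast<std::uint64_t>(ts.tv_sec) * 1000000000ULL +
           static_cast<std::uint64_t>(ts.tv_nsec);
}
```

Two readings taken around the call path give the elapsed time by subtraction. Unlike raw TSC values, the result does not depend on which core the process happens to be running on.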

Original comment by reid.kle...@gmail.com on 4 Jun 2009 at 10:14

GoogleCodeExporter commented 8 years ago

clock_gettime isn't available on OSX, so we can't use it unconditionally. TSC jitter may be an issue, but (1) we're only trying to get relative timings between the eval loop and LLVM, not trying to get absolute timings we can share across machines, and (2) changing processors should cause large TSC jumps (among other things, it's a 1+us context switch), which the analysis script can filter out.
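
One way the filtering step could look, as a sketch only: the actual analysis script is not shown in this thread. The function name `drop_large_jumps` and the rule of discarding samples above a multiple of the median are both assumptions.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Returns the samples that are at most `factor` times the median,
// preserving their original order. Larger values are treated as
// core migrations or context switches and dropped.
inline std::vector<std::uint64_t> drop_large_jumps(
        std::vector<std::uint64_t> deltas, double factor = 10.0) {
    if (deltas.empty()) {
        return deltas;
    }
    // Find the median on a copy so the input order is left alone.
    std::vector<std::uint64_t> sorted = deltas;
    auto mid = sorted.begin() + sorted.size() / 2;
    std::nth_element(sorted.begin(), mid, sorted.end());
    const double cutoff = static_cast<double>(*mid) * factor;

    deltas.erase(std::remove_if(deltas.begin(), deltas.end(),
                                [cutoff](std::uint64_t d) {
                                    return static_cast<double>(d) > cutoff;
                                }),
                 deltas.end());
    return deltas;
}
```

A median-based cutoff is used here because a handful of huge jumps would skew a mean-based one. The factor of 10 is arbitrary and would need tuning against real measurements.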

Original comment by jyass...@gmail.com on 4 Jun 2009 at 11:45

GoogleCodeExporter commented 8 years ago

Fixed in r611.

Original comment by collinw on 5 Jun 2009 at 7:23

Changed state: Fixed