Closed arporter closed 8 years ago
I've extended dl_microbench to include benchmarks for MIN and MAX. Running these on an E5-1620 v2 with the Intel compiler shows that they do not count as a FLOP. Assuming that they essentially compile to a CMP instruction, I've assumed that on Intel Ivy Bridge they execute on port 5 and cost 1 cycle. (Agner says CMP can execute on port 0, 1 or 5, but ports 0 and 1 will generally be busy in the floating-point-heavy codes we're looking at.)
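For illustration, the cost assumptions above could be recorded in a small lookup table along the following lines. This is only a sketch: the names IntrinsicCost, INTRINSIC_COSTS and lookup_cost are hypothetical and are not taken from config_ivy_bridge.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntrinsicCost:
    """Assumed cost of one Fortran intrinsic on Ivy Bridge (hypothetical type)."""
    flops: int   # number of FLOPs the intrinsic is counted as
    cycles: int  # assumed cost in cycles
    port: int    # execution port the operation is assumed to use


# Values from the discussion above: MIN and MAX are not counted as FLOPs
# and are assumed to execute on port 5 at a cost of 1 cycle.
INTRINSIC_COSTS = {
    "MIN": IntrinsicCost(flops=0, cycles=1, port=5),
    "MAX": IntrinsicCost(flops=0, cycles=1, port=5),
}


def lookup_cost(name):
    """Return the assumed cost for an intrinsic; Fortran names are case-insensitive."""
    return INTRINSIC_COSTS[name.upper()]
```

Looking names up in upper case reflects Fortran's case insensitivity, so `lookup_cost("max")` and `lookup_cost("MAX")` return the same entry.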
Created branch support_min_max for work on this issue.
Both MIN and MAX can take an arbitrary number of arguments (>= 2).
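One way the variable argument count might feed into a cost estimate is sketched below. It assumes, and this is not stated in the issue, that an n-argument MIN or MAX reduces to n - 1 two-operand comparisons; the function name pairwise_comparisons is hypothetical.

```python
def pairwise_comparisons(num_args):
    """Number of two-operand comparisons for an n-argument MIN or MAX.

    ASSUMPTION: the call reduces to a chain of n - 1 pairwise comparisons.
    This is an illustrative model, not something confirmed by the benchmarks.
    """
    if num_args < 2:
        # Fortran requires MIN and MAX to have at least two arguments.
        raise ValueError("MIN/MAX require at least two arguments")
    return num_args - 1
```

Under this assumption, `MAX(a, b)` would count as one comparison and `MAX(a, b, c, d)` as three.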
The only modified source files are dag.py and config_ivy_bridge.py. Coverage of the latter is 100% and of the former is 97%. The missed lines relate to FMA, latency and division operands; none are due to this work.
All changed files are pep8 and pylint clean.
Branch merged to master. All tests pass. Closing issue.
We don't currently recognise MAX and MIN as Fortran intrinsics. We need to add them (including an estimation of their cost in FLOPs and cycles).