van-smith commented 13 years ago

I created this issue to discuss molecule structure and scoring methodology. Please discuss and I will revise as needed.

Scoring summary

An Official Run will consist of three OPBM runs. An OPBM run is currently defined at the Scenario (crystal) level, but we need to support one level higher to give us aggregation flexibility.

An Official Run score will be the arithmetic mean (average) of three OPBM crystal runs.

Here is a description of the various aggregation level scores:

Atom: The atomic score will be the geometric mean of internal operations.

Molecule: The molecular score will be the geometric mean of internal Atoms.

Scenario (crystal): The crystalline score will be the geometric mean of the internal Molecules.

Optional Suite (stone): The suite score will be the geometric mean of the internal crystals.
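The aggregation levels above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the OPBM implementation: the nested-dict layout and all function names are assumptions made for the example.

```python
from statistics import geometric_mean


def atom_score(operation_scores):
    """Atom: geometric mean of its internal operation scores."""
    return geometric_mean(operation_scores)


def molecule_score(atoms):
    """Molecule: geometric mean of its internal Atom scores.

    `atoms` is a hypothetical layout: {atom name: [operation scores]}.
    """
    return geometric_mean([atom_score(ops) for ops in atoms.values()])


def scenario_score(molecules):
    """Scenario (crystal): geometric mean of its internal Molecule scores.

    `molecules` is a hypothetical layout: {molecule name: atoms dict}.
    """
    return geometric_mean([molecule_score(a) for a in molecules.values()])


def official_run_score(crystal_run_scores):
    """Official Run: arithmetic mean of the three OPBM crystal runs."""
    return sum(crystal_run_scores) / len(crystal_run_scores)


# Made-up scores, used only to show the call pattern.
example_scenario = {
    "Molecule1": {"Atom1": [100, 100], "Atom2": [50, 200]},
    "Molecule2": {"Atom1": [400]},
}
crystal = scenario_score(example_scenario)
```

A Suite (stone) score would follow the same pattern one level up, taking the geometric mean of its crystal scores.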

When drilling down from a multi-iteration run like an Official Run, display a composite set of molecules. In other words:

Official Run score
-- Molecule 1 composite score
   -- Atom 1 composite score
   -- Atom 2 composite score
   ...
   -- Atom n composite score
-- Molecule 2 composite score
   -- Atom 1 composite score
   -- Atom 2 composite score
   ...
   -- Atom n composite score
...
-- Molecule n composite score
   -- Atom 1 composite score
   -- Atom 2 composite score
   ...
   -- Atom n composite score

To simplify our efforts for the time being, drilling down into the Official Run score should yield a table of results like this:

My Official Run of OPBM

test         run 1   run 2   run 3   cv      Official Run (average)
OPBM         93      95      94      0.93%   94
Molecule1    97      100     97      1.48%   98
  Atom1      105     106     104     0.95%   105
  Atom2      104     106     102     1.92%   104
  Atom3      99      104     100     2.62%   101
Molecule2    83      85      85      1.29%   85
  Atom1      87      90      90      1.95%   89
  Atom2      66      70      68      2.94%   68
  Atom3      84      84      85      0.68%   84
Molecule3    100     101     100     0.25%   100
  Atom1      120     119     120     0.61%   120
  Atom2      77      78      77      0.75%   77
  Atom3      109     110     109     0.53%   109
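For reference, here is a minimal sketch of how the last two columns of one row could be derived from the three run scores. The table does not say which standard deviation the cv uses, so the sample standard deviation below is an assumption, and `summarize_row` is a hypothetical helper name. Recomputing from the rounded scores shown will not necessarily reproduce every printed cv value.

```python
from statistics import mean, stdev


def summarize_row(run_scores):
    """Return (average, cv in percent) for one table row.

    Assumption: cv = sample standard deviation / arithmetic mean.
    """
    average = mean(run_scores)
    cv_percent = 100 * stdev(run_scores) / average
    return average, cv_percent


# Example call using the three run scores of one Atom row.
average, cv_percent = summarize_row([105, 106, 104])
print(f"{average:.0f} {cv_percent:.2f}%")
```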

There is no need to provide a tree view right now.

Organization of molecules (this is a dynamic list)

Example

Preferred scoring aggregations (might not be possible given time constraints)

Molecule -- Atom1 -- Atom2 ... -- AtomN

Install / uninstall applications -- Chrome -- Opera -- Firefox -- Safari -- Adobe Reader -- Adobe Flash -- 7zip

File operations -- Create files -- Copy files -- 7zip: zip files -- 7zip: unzip files

Word processing -- Alice -- Patrick Henry speech (typing)

Spreadsheet -- Excel Heat -- Excel Disaster: Sort, Filter, Chart

Database -- Access Gold, Silver, Oil, Dollar, Stock values

Presentation -- PowerPoint ARM versus x86

Publisher -- Two-page HEDGE flyer

Open/Close Office applications -- Open Word -- Open Word2 -- Open Excel -- Open Excel2 -- Open Access -- Open PowerPoint -- Open Publisher -- Close Word -- Close Excel -- Close Access -- Close PowerPoint -- Close Publisher

Open/Close Applications -- Open Firefox -- Open Chrome -- Open Opera -- Open Internet Explorer9 -- Open 7zip -- Close Firefox -- Close Chrome -- Close Opera -- Close IE9 -- Close 7zip

Multitasking -- Play video while unzipping files -- Play video while media encoding? -- Other?

Browser/JavaScript -- IE9: GoogleV8, SunSpider, Kraken -- Chrome: GoogleV8, SunSpider, Kraken -- Opera: GoogleV8, SunSpider, Kraken -- Firefox: GoogleV8, SunSpider, Kraken -- Safari: GoogleV8, SunSpider, Kraken

Adobe Flash -- Adobe Flash benchmark

Java (fully threaded) -- Sort -- SHA256 -- Zip files? (http://www.exampledepot.com/egs/java.util.zip/CreateZip.html)

HTML5 -- HTML5 benchmark

WebGL -- WebGL benchmark (http://www.khronos.org/webgl/wiki/Main_Page)

OpenCL -- SmallLuxGPU (http://www.luxrender.net/wiki/LuxMark)

Resource: http://openbenchmarking.org/tests/pts&s=r

van-smith commented 13 years ago

Functionality is adequate for 9.19 release.

For October, we need to add functionality (through tags) so that results such as application launch results can be aggregated outside of their execution molecules.
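One way to picture the tag idea is the sketch below. Everything in it is hypothetical: the record layout, the tag names, and the choice of geometric mean for a tag-level score are assumptions for illustration, not a description of how OPBM stores or will aggregate results.

```python
from collections import defaultdict
from statistics import geometric_mean


def scores_by_tag(results):
    """Group scores by tag, independent of the molecule they ran in.

    `results` is a hypothetical list of dicts with "score" and "tags" keys.
    Assumption: a tag's score is the geometric mean of its members.
    """
    grouped = defaultdict(list)
    for result in results:
        for tag in result["tags"]:
            grouped[tag].append(result["score"])
    return {tag: geometric_mean(scores) for tag, scores in grouped.items()}


# Made-up results from two different molecules sharing a "launch" tag.
example_results = [
    {"name": "Open Word", "molecule": "Open/Close Office applications",
     "score": 100, "tags": ["launch"]},
    {"name": "Open Firefox", "molecule": "Open/Close Applications",
     "score": 400, "tags": ["launch", "browser"]},
]
tag_scores = scores_by_tag(example_results)
```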

van-smith commented 13 years ago

We still need to add functionality (through tags) so that results such as application launch results can be aggregated outside of their execution molecules.

Bumping to 11.21.2011.

van-smith commented 13 years ago

Recent scoring discussions:

A score is generated for each operation (worklet) by dividing the time the operation took on the reference system (tref) by the time it took on the SUT (tsut) during the current run. This ratio is multiplied by 100 to produce a more human-friendly number, so the score expresses the SUT's performance as a percentage of the reference system's (e.g. a score of 100 means equal performance, 50 means half the performance and 200 means twice the performance). The formula for calculating the score of an operation is: Score = 100 * (tref / tsut)
Scores are generated for each iteration first and are then averaged (arithmetic mean) over all iterations.
In OPBM, we define the most basic operations leveraged by the harness as “atoms”. Like atoms in nature, they may or may not be able to function independently. Atoms typically contain one or more instrumented operations (worklets). An atom's score is the geometric mean of its constituent operations.
Atoms combine to make molecules, molecules combine to make scenarios and scenarios combine to make suites. Everything above atoms is supposed to be able to run independently.
When aggregating scores in a superior level (e.g. when aggregating atom scores for a molecule), the inferior level results are combined through geometric mean. For instance: ScoreMyMolecule = GeometricMean( ScoreAtom1, ScoreAtom2, ScoreAtom3,… ScoreAtomn)
Scores for each aggregation level are shown in the Result Viewer. The exported CSV file should format results as closely to the Result Viewer as possible. The CSV file should have a system discovery table and summary result tables (scenario/molecules, scenario/molecules/atoms, scenario/molecules/atoms/operations). A final table should contain timing data.
The Official Run score should be contained in the CSV results file. This could simply be the average in the scenario/molecules summary table.
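The per-operation formula and the "score first, then average" rule can be sketched as follows. The function names and timings are made up for illustration; only the formula Score = 100 * (tref / tsut) and the arithmetic mean over iterations come from the discussion above.

```python
def operation_score(t_ref, t_sut):
    """Score = 100 * (tref / tsut); 100 means parity with the reference."""
    if t_ref <= 0 or t_sut <= 0:
        raise ValueError("timings must be positive")
    return 100.0 * t_ref / t_sut


def averaged_operation_score(t_ref, sut_times):
    """Score each iteration first, then take the arithmetic mean."""
    scores = [operation_score(t_ref, t) for t in sut_times]
    return sum(scores) / len(scores)


# Made-up timings in seconds: reference took 10 s, SUT took 5 s and 20 s.
per_iteration = [operation_score(10, t) for t in (5, 20)]
averaged = averaged_operation_score(10, [5, 20])
```

The order matters: scoring each iteration and then averaging gives 125 here, whereas averaging the two timings first (12.5 s) and scoring once would give 80.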

van-smith / OPBM

Molecule structure, scoring methodology #34
