airair / graphlabapi

Automatically exported from code.google.com/p/graphlabapi

Metrics Reporting #2

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Metrics reporting is currently extremely scattered.
Update counts are read from the scheduler, runtimes are read from the
core/engine, distributed metrics are read from distributed_metrics, etc.

I suggest a centralized metrics storage class.

The "metrics" class should manage only a "human-readable" string->string
mapping that is easy to output to screen or disk.

For example:
"update-counts" -> "126431"
etc.
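
A minimal sketch of what such a centralized string->string store could look like, assuming C++11. The class name `metrics_store` and its methods are hypothetical and only illustrate the proposal; they are not an existing GraphLab class.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <sstream>
#include <string>

// Hypothetical sketch of the proposed store; not an existing GraphLab class.
class metrics_store {
 public:
  // Convert any streamable value to its human-readable form and store it.
  template <typename T>
  void set(const std::string& key, const T& value) {
    assert(!key.empty());
    std::ostringstream out;
    out << value;
    std::lock_guard<std::mutex> lock(mutex_);
    entries_[key] = out.str();
  }

  // Return the stored string, or an empty string if the key is unknown.
  std::string get(const std::string& key) const {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = entries_.find(key);
    return it == entries_.end() ? std::string() : it->second;
  }

  // Render every entry as "key -> value", one per line, sorted by key.
  std::string dump() const {
    std::lock_guard<std::mutex> lock(mutex_);
    std::ostringstream out;
    for (const auto& kv : entries_) {
      out << kv.first << " -> " << kv.second << "\n";
    }
    return out.str();
  }

 private:
  mutable std::mutex mutex_;
  std::map<std::string, std::string> entries_;
};
```

Because everything is stored as a string, the scheduler, the engine and the distributed code could all write into the same object, and a single dump() call would be enough to print or save the result.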

Original issue reported on code.google.com by yucheng...@gmail.com on 12 Oct 2010 at 5:52

GoogleCodeExporter commented 9 years ago
The distributed_metrics class does this, but only for numbers. We could enhance it.

Original comment by a...@sulake.com on 12 Oct 2010 at 7:46

GoogleCodeExporter commented 9 years ago
Yup. Enhance and generalize it, and use it everywhere, not just in the distributed code.

Original comment by yucheng...@gmail.com on 13 Oct 2010 at 4:08

GoogleCodeExporter commented 9 years ago

Original comment by yucheng...@gmail.com on 3 Nov 2010 at 8:50

GoogleCodeExporter commented 9 years ago

Original comment by akyrola...@gmail.com on 29 Nov 2010 at 9:33

GoogleCodeExporter commented 9 years ago
I have now implemented a metrics system that allows easy reporting
from apps and provides standard reporting from the engine. Reports can
be written to stdout, a file, or HTML. It is easy to write your own "reporters", for
example to store the results in a database. It also handles
creating many engines during a session: each engine gets its
own report (named engine 2, engine 3, ...).

Usage details follow.

===== GETTING METRICS =====

Command line parameter:
     --metrics [basic | file | html | none]   (default: none)
Try it!  "basic" prints to standard output, "file" creates a file named
graphlab_metrics.txt, and "html" creates graphlab_metrics.html.

===== RECORDING METRICS =====

1.  Create a metrics instance.

To collect metrics from your app or your component, you
create a metrics instance. For example, I wanted to collect some metrics
from pagerank, so I created an instance named "app::pagerank".
The engine uses the instance "engine", and the multiqueue fifo uses "multiqueue_fifo".
Example:

    graphlab::metrics& app_metrics = graphlab::metrics::create_metrics_instance("app::pagerank");

2.  You can record four types of metrics: integers, real values, timings and
strings.  For integers and real values you can record cumulative counts (for
example task counts) or single values (for example the number of threads).
Timings are cumulative.  For each cumulative metric, the sum, average, min and
max are recorded. Strings are stored as given.

Each metric entry is identified by a string key, for example "running_time".
Here are some examples from the pagerank app and the asynchronous engine class:

    app_metrics.start_time("load");
    // ---- do graph loading
    app_metrics.stop_time("load");

    engine_metrics.set("num_vertices", graph.num_vertices());

    // Record update counts for each thread. This is a cumulative metric, and it is useful
    // for seeing whether tasks were allocated evenly across the worker threads.
    for (size_t i = 0; i < update_counts.size(); ++i) {
        engine_metrics.add("updatecount", update_counts[i], INTEGER);
    }

    engine_metrics.set("termination_reason", exec_status_as_string(termination_reason));
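
To make "cumulative" concrete, here is a rough sketch of the bookkeeping behind one cumulative entry (sum, average, min, max), assuming C++11. The struct `cumulative_metric` and its field names are made up for illustration; the actual metrics class may store this differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical accumulator for one cumulative metric entry.
// Illustration only; not the real GraphLab data structure.
struct cumulative_metric {
  double sum = 0.0;
  double min_value = std::numeric_limits<double>::max();
  double max_value = std::numeric_limits<double>::lowest();
  std::size_t count = 0;

  // Fold one observation into the running summary.
  void add(double value) {
    sum += value;
    min_value = std::min(min_value, value);
    max_value = std::max(max_value, value);
    ++count;
  }

  // Average over all observations; requires at least one add().
  double average() const {
    assert(count > 0);
    return sum / static_cast<double>(count);
  }
};
```

With per-thread update counts of 10, 30 and 20, this would report sum 60, average 20, min 10 and max 30, which is the kind of summary that shows whether work was spread evenly across threads.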

3.  Get report

All metrics instances created are automatically included in the report
if the --metrics flag has been set.  The dump happens in the destructor
of core. You can also produce a report yourself:

    file_reporter freporter("graphlab_metrics.txt");
    metrics::report_all(freporter);

4. Notes

The metrics system is thread-safe, which also means it has some overhead. So do not
add a metric for every task; instead, accumulate the value yourself and
pass it to the metrics system after the application has finished.
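
The pattern recommended above can be sketched as follows, assuming C++11 threads. `locked_counter` is a hypothetical stand-in for a thread-safe metrics object, and `run_workers` is an invented helper; neither is part of GraphLab.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a thread-safe metrics object:
// every add() takes a lock, which is where the overhead comes from.
class locked_counter {
 public:
  void add(std::size_t n) {
    std::lock_guard<std::mutex> lock(mutex_);
    total_ += n;
  }
  std::size_t total() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return total_;
  }

 private:
  mutable std::mutex mutex_;
  std::size_t total_ = 0;
};

// Run num_threads workers that each execute tasks_per_thread tasks.
// Each worker counts in a local variable and reports once when it finishes,
// so the lock is taken num_threads times rather than once per task.
std::size_t run_workers(std::size_t num_threads, std::size_t tasks_per_thread) {
  assert(num_threads > 0);
  locked_counter updatecount;
  std::vector<std::thread> workers;
  for (std::size_t t = 0; t < num_threads; ++t) {
    workers.emplace_back([&updatecount, tasks_per_thread] {
      std::size_t local = 0;
      for (std::size_t i = 0; i < tasks_per_thread; ++i) {
        ++local;  // stands in for executing one task
      }
      updatecount.add(local);  // single locked call per worker
    });
  }
  for (auto& w : workers) w.join();
  return updatecount.total();
}
```

The total is the same as with per-task reporting, but the hot loop never touches the shared lock.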

===== STANDARD METRICS =====

Asynchronous engine records:
    runtime, updatecounts, num_vertices, num_edges, termination_reason.

Core records:
    all engine parameters
    compile options.

Multiqueue fifo records:
    prune counts

I encourage adding metrics to other schedulers as well.

===== CREATING YOUR OWN METRICS REPORTERS =====

Easy. See for example metrics/reporters/file_reporter.hpp.
But you need to produce the report yourself (unless you want to add a choice
to the --metrics flag):

    my_reporter reporter;
    metrics::report_all(reporter);
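
As a rough idea of the shape a custom reporter could take, here is a sketch assuming C++11. The interface `reporter_sketch` and its `report` signature are invented for this example; the real interface is whatever metrics/reporters/file_reporter.hpp implements, and it may well differ.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical reporter interface, invented for this sketch.
// The real one is defined in the GraphLab metrics headers and may differ.
class reporter_sketch {
 public:
  virtual ~reporter_sketch() {}
  // Called once per metrics instance with its name and its key/value entries.
  virtual void report(const std::string& instance,
                      const std::map<std::string, std::string>& entries) = 0;
};

// Example reporter that renders entries as CSV lines: instance,key,value.
class csv_reporter : public reporter_sketch {
 public:
  void report(const std::string& instance,
              const std::map<std::string, std::string>& entries) override {
    assert(!instance.empty());
    for (const auto& kv : entries) {
      out_ << instance << "," << kv.first << "," << kv.second << "\n";
    }
  }

  // Everything reported so far, as one CSV string.
  std::string str() const { return out_.str(); }

 private:
  std::ostringstream out_;
};
```

A database reporter would follow the same shape, with the loop body issuing an insert per entry instead of appending to a string.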

===== NOTES =====

I did not try to make the most beautiful framework, just an easy-to-use,
no-configuration component.  Hope this is fine! :)

Original comment by akyrola...@gmail.com on 29 Nov 2010 at 9:34