twitter / scalding

A Scala API for Cascading
http://twitter.com/scalding
Apache License 2.0
3.5k stars, 706 forks

ExecutionApp should print out custom stats like Job #1764

Open benpence opened 6 years ago

benpence commented 6 years ago

Job prints out custom stats in #run(...); ExecutionApp does not. It might be useful to get the stats across an entire Execution run, so I think we should print these out (and expose this API point a little better).

I'm assuming we didn't do this because of an oversight. Or was there a design decision behind it?

johnynek commented 6 years ago

You can access the stats on the Execution, so you can print them out yourself.

Calling getAndResetCounters or getCounters on the Execution, then map (or flatMap) and println, should be all you need.

You can do this in a function in your own repo:

// printCustom is a placeholder for whatever printing function you write
def withCustomCounters[T](ex: Execution[T]): Execution[T] =
  ex.getCounters.map { case (t, counters) =>
    printCustom(counters)
    t
  }

Something like that.

benpence commented 6 years ago

Yeah, I saw that when looking through the code. Something along the lines of

job
  .getCounters
  .map {
    case (_, counters) =>
      println("Dumping custom counters:")
      // One "counter<TAB>value" line per recorded counter
      counters.keys.foreach { key =>
        println(s"${key.counter}\t${counters(key)}")
      }
  }
  .waitFor(conf, mode).get

in ExecutionApp.main
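
For anyone who wants this today without changing ExecutionApp.main, here is a minimal sketch that combines the two snippets above into a reusable wrapper. It assumes that getCounters on an Execution yields the result paired with its counters (as in the snippets above), that each key exposes group and counter, and that ExecutionApp only requires a job: Execution[Unit] member. The names CounterDump, render, withCounterDump and CounterDumpApp are made up for illustration and are not part of Scalding.

```scala
import com.twitter.scalding.{Execution, ExecutionApp}

// Hypothetical helper object; nothing here is provided by Scalding itself.
object CounterDump {
  // Pure formatting step, kept separate so it can be checked without running a job.
  // Input: (group, counter) -> value. Output: one tab-separated line per counter,
  // sorted by group and then counter name.
  def render(values: Map[(String, String), Long]): Seq[String] =
    values.toSeq.sorted.map { case ((group, counter), value) =>
      s"$group\t$counter\t$value"
    }

  // Runs `ex`, prints every counter it recorded, and passes the result through unchanged.
  def withCounterDump[T](ex: Execution[T]): Execution[T] =
    ex.getCounters.map { case (result, counters) =>
      val values = counters.keys.map { key =>
        ((key.group, key.counter), counters(key))
      }.toMap
      println("Dumping custom counters:")
      render(values).foreach(println)
      result
    }
}

// Hypothetical application: only `job` is wrapped, so ExecutionApp.main stays untouched.
object CounterDumpApp extends ExecutionApp {
  // Execution.from(()) stands in for a real pipeline here.
  def job: Execution[Unit] = CounterDump.withCounterDump(Execution.from(()))
}
```

Unlike the snippet above, this prints the group as well, so every counter the Execution reports shows up rather than only the names. Filtering down to custom counters, as Job does, would be a small change inside withCounterDump.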