Closed by nathanielparke 7 years ago
Below is an example of the output that is currently printed:
```
+---------------------------------------+--------------+--------------+-------------+-------+-----------+-----------+-----------+
| Metric | Worker Total | Driver Total | Driver Only | Count | Mean | Min | Max |
+---------------------------------------+--------------+--------------+-------------+-------+-----------+-----------+-----------+
| └─ Convert VCF to Adam format | - | 1.7 secs | 1.7 secs | 1 | 1.7 secs | 1.7 secs | 1.7 secs |
| └─ Save File In ADAM Format | - | 1.52 secs | 1.52 secs | 1 | 1.52 secs | 1.52 secs | 1.52 secs |
| └─ Loading Parquet File to Data Frame | - | 111.19 ms | 111.19 ms | 1 | 111.19 ms | 111.19 ms | 111.19 ms |
| └─ MinD dataframe filter operation | - | 375.31 ms | 375.31 ms | 1 | 375.31 ms | 375.31 ms | 375.31 ms |
| └─ Geno dataframe filter operation | - | 1.53 secs | 1.53 secs | 1 | 1.53 secs | 1.53 secs | 1.53 secs |
| └─ Load Phenotype operation | - | 328.19 ms | 328.19 ms | 1 | 328.19 ms | 328.19 ms | 328.19 ms |
| └─ keyBy at SiteRegression.scala:40 | - | 25.62 ms | - | 1 | 25.62 ms | 25.62 ms | 25.62 ms |
| └─ function call | 475.02 µs | - | - | 50 | 9.5 µs | 6.29 µs | 60.6 µs |
+---------------------------------------+--------------+--------------+-------------+-------+-----------+-----------+-----------+
Spark Operations
+----------+----------------------------------+---------------+----------------+--------------+----------+
| Sequence | Operation | Is New Stage? | Stage Duration | Driver Total | Stage ID |
+----------+----------------------------------+---------------+----------------+--------------+----------+
| 1 | keyBy at SiteRegression.scala:40 | true | -185 ms | 25.62 ms | 13 |
+----------+----------------------------------+---------------+----------------+--------------+----------+```
Uses the Big Data Genomics Utils library to wrap functionality in timers. Currently prints results to standard out.
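
For readers unfamiliar with the pattern, here is a minimal sketch of wrapping a block of work in a named timer and printing a summary to standard out. It deliberately does not use the Big Data Genomics Utils API; `SimpleTimer`, `SimpleTimers`, `time`, and `report` are hypothetical stand-ins written for illustration, and the report layout only loosely resembles the table above.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a named timer. This is NOT the bdg-utils API.
final class SimpleTimer(val name: String) {
  private val samplesNs = mutable.ArrayBuffer.empty[Long]

  // Runs `block`, records its wall-clock duration, and returns its result.
  def time[A](block: => A): A = {
    val start = System.nanoTime()
    try {
      block
    } finally {
      samplesNs += (System.nanoTime() - start)
    }
  }

  def count: Int = samplesNs.size
  def totalNs: Long = samplesNs.sum
  def meanNs: Double = if (samplesNs.isEmpty) 0.0 else totalNs.toDouble / count
  def minNs: Long = if (samplesNs.isEmpty) 0L else samplesNs.min
  def maxNs: Long = if (samplesNs.isEmpty) 0L else samplesNs.max
}

// Hypothetical registry that hands out timers by name and builds a report.
object SimpleTimers {
  // LinkedHashMap keeps timers in the order they were first requested.
  private val registry = mutable.LinkedHashMap.empty[String, SimpleTimer]

  def timer(name: String): SimpleTimer =
    registry.getOrElseUpdate(name, new SimpleTimer(name))

  // One line per timer: name, count, then total/mean/min/max in milliseconds.
  def report(): String =
    registry.values.map { t =>
      f"${t.name}%-40s ${t.count}%5d ${t.totalNs / 1e6}%12.3f ${t.meanNs / 1e6}%12.3f ${t.minNs / 1e6}%12.3f ${t.maxNs / 1e6}%12.3f"
    }.mkString("\n")
}
```

A call site would then look like `SimpleTimers.timer("Load Phenotype operation").time { loadPhenotypes() }`, where `loadPhenotypes` is a placeholder for the wrapped work, followed by `println(SimpleTimers.report())` once the job finishes. Keeping the timer lookup by name means repeated calls accumulate into the same row, which is how a count such as 50 for a single metric could arise.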