tkdagdelen / gnocchi_old

Apache License 2.0

Gnocchi Timing #31

Closed nathanielparke closed 7 years ago

nathanielparke commented 8 years ago

Uses the Big Data Genomics utils library (bdg-utils) to wrap Gnocchi functionality in timers. Timing results are currently printed to standard out.
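For anyone unfamiliar with the pattern, here is a rough, self-contained sketch of what "wrapping functionality in timers" means. It is not the bdg-utils API and not the code in this PR: `SimpleTimer`, `TimingSketch`, and `report` are hypothetical names made up for illustration, and the sketch only covers single-threaded, driver-side timing. Each named timer runs a block, records how long it took, and can later report the kind of count, mean, min, and max figures that show up in the printed output.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a named timer; NOT the bdg-utils API.
// Assumes single-threaded, driver-side use only.
final class SimpleTimer(val name: String) {
  private val samplesNs = mutable.ArrayBuffer.empty[Long]

  // Runs `body`, records its wall-clock duration, and returns its result.
  // The duration is recorded even if `body` throws.
  def time[A](body: => A): A = {
    val start = System.nanoTime()
    try body
    finally samplesNs += (System.nanoTime() - start)
  }

  def count: Int = samplesNs.size
  def totalNs: Long = samplesNs.sum
  def meanNs: Double = if (samplesNs.isEmpty) 0.0 else totalNs.toDouble / count
  def minNs: Long = if (samplesNs.isEmpty) 0L else samplesNs.min
  def maxNs: Long = if (samplesNs.isEmpty) 0L else samplesNs.max
}

object TimingSketch {
  // The timer name mirrors one of the metric labels in the output;
  // the block it wraps in main() is placeholder work.
  val LoadPhenotypes = new SimpleTimer("Load Phenotype operation")

  // Formats one line per timer with its count and mean in milliseconds.
  def report(timers: Seq[SimpleTimer]): String =
    timers.map { t =>
      f"${t.name}%-30s count=${t.count}%d mean=${t.meanNs / 1e6}%.2f ms"
    }.mkString("\n")

  def main(args: Array[String]): Unit = {
    // Wrap the operation in the timer; the block's value is passed through.
    val phenotypes = LoadPhenotypes.time {
      Seq("sample1" -> 1.0, "sample2" -> 0.0)
    }
    println(s"loaded ${phenotypes.size} phenotypes")
    println(report(Seq(LoadPhenotypes)))
  }
}
```

Because `time` takes the block by name and returns its value, existing calls can be wrapped without changing their result types, which is the main reason this style of timer is easy to retrofit.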

nathanielparke commented 8 years ago

Below is an example of what currently gets printed.

```
+---------------------------------------+--------------+--------------+-------------+-------+-----------+-----------+-----------+
|                Metric                 | Worker Total | Driver Total | Driver Only | Count |   Mean    |    Min    |    Max    |
+---------------------------------------+--------------+--------------+-------------+-------+-----------+-----------+-----------+
| └─ Convert VCF to Adam format         |            - |     1.7 secs |    1.7 secs |     1 |  1.7 secs |  1.7 secs |  1.7 secs |
|     └─ Save File In ADAM Format       |            - |    1.52 secs |   1.52 secs |     1 | 1.52 secs | 1.52 secs | 1.52 secs |
| └─ Loading Parquet File to Data Frame |            - |    111.19 ms |   111.19 ms |     1 | 111.19 ms | 111.19 ms | 111.19 ms |
| └─ MinD dataframe filter operation    |            - |    375.31 ms |   375.31 ms |     1 | 375.31 ms | 375.31 ms | 375.31 ms |
| └─ Geno dataframe filter operation    |            - |    1.53 secs |   1.53 secs |     1 | 1.53 secs | 1.53 secs | 1.53 secs |
| └─ Load Phenotype operation           |            - |    328.19 ms |   328.19 ms |     1 | 328.19 ms | 328.19 ms | 328.19 ms |
| └─ keyBy at SiteRegression.scala:40   |            - |     25.62 ms |           - |     1 |  25.62 ms |  25.62 ms |  25.62 ms |
|     └─ function call                  |    475.02 µs |            - |           - |    50 |    9.5 µs |   6.29 µs |   60.6 µs |
+---------------------------------------+--------------+--------------+-------------+-------+-----------+-----------+-----------+

Spark Operations
+----------+----------------------------------+---------------+----------------+--------------+----------+
| Sequence |            Operation             | Is New Stage? | Stage Duration | Driver Total | Stage ID |
+----------+----------------------------------+---------------+----------------+--------------+----------+
| 1        | keyBy at SiteRegression.scala:40 | true          |        -185 ms |     25.62 ms | 13       |
+----------+----------------------------------+---------------+----------------+--------------+----------+
```