Open xubo245 opened 7 years ago
Hi @xubo245 !
Sorry for the slow reply; I hadn't seen this issue when it came in! Are you running this locally? If you run it 5 times, what do the runtimes look like? It is possible that you're seeing a warmup phenomenon (e.g., file system buffering).
I run avocado on a cluster in Spark standalone mode.
Here are 5 runs:
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1DiscoverVariantI1.adam time: 308.092
Apr 15, 2017 9:24:20 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 9:29:19 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1DiscoverVariantI2.adam time: 295.197
Apr 15, 2017 9:29:30 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 9:34:17 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1DiscoverVariantI3.adam time: 296.947
Apr 15, 2017 9:34:29 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 9:39:17 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1DiscoverVariantI4.adam time: 301.25
Apr 15, 2017 9:39:29 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 9:44:22 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c16000000Nhs20Paired12time1000num32k1DiscoverVariantI5.adam time: 935.559
Apr 15, 2017 9:44:33 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 10:00:01 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI1.adam time: 1019.173
Apr 15, 2017 10:00:12 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 10:17:03 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI2.adam time: 1043.837
Apr 15, 2017 10:17:14 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 10:34:30 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI3.adam time: 331.459
Apr 15, 2017 10:34:41 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 10:40:04 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI4.adam time: 1034.815
Apr 15, 2017 10:40:16 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 10:57:23 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI5.adam time: 1034.985
Apr 15, 2017 10:57:34 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 11:14:41 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI1.adam time: 1142.983
Apr 15, 2017 11:14:53 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 11:33:47 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI2.adam time: 1148.81
Apr 15, 2017 11:33:59 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 11:52:59 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI3.adam time: 1161.003
Apr 15, 2017 11:53:10 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 12:12:23 PM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI4.adam time: 1122.803
Apr 15, 2017 12:12:35 PM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 15, 2017 12:31:09 PM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI5.adam time: 1139.485
Apr 15, 2017 12:31:21 PM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
I ran it 20 times:
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI11.adam time: 611.29
Apr 16, 2017 12:07:07 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 12:17:09 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI12.adam time: 1041.574
Apr 16, 2017 12:17:21 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 12:34:34 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI13.adam time: 1046.914
Apr 16, 2017 12:34:45 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 12:52:04 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI14.adam time: 978.584
Apr 16, 2017 12:52:16 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 1:08:26 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI15.adam time: 1017.985
Apr 16, 2017 1:08:37 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 1:25:25 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI16.adam time: 1037.953
Apr 16, 2017 1:25:38 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 1:42:47 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI17.adam time: 1026.294
Apr 16, 2017 1:42:59 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 1:59:57 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI18.adam time: 1000.112
Apr 16, 2017 2:00:08 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 2:16:40 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI19.adam time: 1031.265
Apr 16, 2017 2:16:52 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 2:33:54 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI20.adam time: 1033.526
Apr 16, 2017 2:34:05 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 2:51:10 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI21.adam time: 335.346
Apr 16, 2017 2:51:21 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 2:56:48 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI22.adam time: 333.994
Apr 16, 2017 2:57:00 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 3:02:25 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI23.adam time: 1011.96
Apr 16, 2017 3:02:37 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 3:19:20 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI24.adam time: 1006.177
Apr 16, 2017 3:19:32 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 3:36:10 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI25.adam time: 1038.076
Apr 16, 2017 3:36:22 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 3:53:31 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI26.adam time: 1030.243
Apr 16, 2017 3:53:42 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 4:10:44 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI27.adam time: 1033.402
Apr 16, 2017 4:10:56 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 4:28:01 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI28.adam time: 1017.483
Apr 16, 2017 4:28:13 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 4:45:02 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI29.adam time: 1007.373
Apr 16, 2017 4:45:14 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 5:01:53 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c18000000Nhs20Paired12time1000num32k1DiscoverVariantI30.adam time: 902.883
Apr 16, 2017 5:02:04 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 5:16:59 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI11.adam time: 1116.18
Apr 16, 2017 5:17:11 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 5:35:38 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI12.adam time: 1086.454
Apr 16, 2017 5:35:49 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 5:53:47 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI13.adam time: 1109.689
Apr 16, 2017 5:53:59 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 6:12:20 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI14.adam time: 1130.608
Apr 16, 2017 6:12:31 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 6:31:14 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI15.adam time: 1146.735
Apr 16, 2017 6:31:25 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 6:50:23 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI16.adam time: 1141.368
Apr 16, 2017 6:50:35 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 7:09:28 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI17.adam time: 1136.241
Apr 16, 2017 7:09:39 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 7:28:27 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI18.adam time: 1144.389
Apr 16, 2017 7:28:38 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 7:47:34 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI19.adam time: 1138.622
Apr 16, 2017 7:47:46 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 8:06:36 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI20.adam time: 1119.333
Apr 16, 2017 8:06:48 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 8:25:18 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI21.adam time: 360.353
Apr 16, 2017 8:25:30 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 8:31:22 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI22.adam time: 1101.976
Apr 16, 2017 8:31:34 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 8:49:47 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI23.adam time: 1183.02
Apr 16, 2017 8:49:58 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 9:09:34 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI24.adam time: 1088.011
Apr 16, 2017 9:09:45 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 9:27:45 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI25.adam time: 1115.471
Apr 16, 2017 9:27:56 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 9:46:23 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI26.adam time: 1134.819
Apr 16, 2017 9:46:34 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 10:05:21 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI27.adam time: 1127.239
Apr 16, 2017 10:05:32 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 10:24:11 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI28.adam time: 1122.376
Apr 16, 2017 10:24:23 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 10:42:56 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI29.adam time: 369.243
Apr 16, 2017 10:43:08 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 10:49:09 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
sam:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1.adam out:/xubo/project/alignment/CloudBWA/g38/time/cloudBWAnewg38L50c20000000Nhs20Paired12time1000num32k1DiscoverVariantI30.adam time: 1133.132
Apr 16, 2017 10:49:20 AM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 32
Apr 16, 2017 11:08:05 AM INFO: org.apache.parquet.hadoop.ParquetFileReader: Initiating action with parallelism: 5
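To make the spread in these runs easier to see, the `time:` values can be grouped by input file. This is only a rough sketch in Python, assuming the log has been saved as text and that each run line has the `sam:... out:... time: <seconds>` shape shown above. The names `RUN_LINE` and `summarize` are made up for illustration and are not part of avocado.

```python
import re
import statistics
from collections import defaultdict

# Matches run lines such as "sam:<input> out:<output> time: 308.092".
# The leading "s" is optional to tolerate a clipped paste ("am:...").
RUN_LINE = re.compile(
    r"^s?am:(?P<sam>\S+)\s+out:\S+\s+time:\s*(?P<secs>[\d.]+)"
)


def summarize(lines):
    """Group runtimes (in seconds) by input path and return basic statistics."""
    by_input = defaultdict(list)
    for line in lines:
        match = RUN_LINE.match(line.strip())
        if match:  # Parquet INFO lines do not match and are skipped
            by_input[match.group("sam")].append(float(match.group("secs")))
    return {
        sam: {
            "runs": len(times),
            "min": min(times),
            "median": statistics.median(times),
            "max": max(times),
        }
        for sam, times in by_input.items()
    }
```

Calling `summarize(open("runs.log"))` (the file name is hypothetical) returns one entry per input path. A large gap between `min` and `median` for the same input points to a few fast runs among many slow ones, rather than a gradual warmup.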
Oh yeah, those times are really all over the map! Do you have the job history server enabled on your Spark cluster? If so, I am wondering if you have any failed tasks during the slower runs?
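If the history server is running, one way to check for failed tasks without clicking through the UI is Spark's monitoring REST API. The sketch below assumes the history server exposes `/api/v1/applications/<app-id>/stages` and that each stage object carries `stageId`, `attemptId`, and `numFailedTasks` fields, which is how I recall the API. Please verify this against the monitoring docs for your Spark version. The base URL and application ID are placeholders, and the function names are made up for illustration.

```python
import json
from urllib.request import urlopen


def stages_with_failures(stages):
    """Return (stageId, attemptId, numFailedTasks) for stages with failed tasks."""
    return [
        (s["stageId"], s.get("attemptId", 0), s["numFailedTasks"])
        for s in stages
        if s.get("numFailedTasks", 0) > 0
    ]


def fetch_stages(base_url, app_id):
    """Fetch the stage list for one application (assumed endpoint, see above)."""
    with urlopen(f"{base_url}/api/v1/applications/{app_id}/stages") as resp:
        return json.load(resp)
```

For example, `stages_with_failures(fetch_stages("http://<history-host>:18080", "<app-id>"))` would list the stages to look at first. Comparing that list between a roughly 300 s run and a roughly 1000 s run of the same input should show whether retries explain the difference.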
I use DiscoverVariants (org.bdgenomics.avocado.cli.DiscoverVariants) to discover variants; the data is 8 million paired-end reads (generated by wgsim).
But the runtime differs greatly between repeated runs with the same arguments.
I have tried this many times.
code:
time: