broadinstitute / picard

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
https://broadinstitute.github.io/picard/
MIT License

picard CollectHsMetrics HS_LIBRARY_SIZE field empty #922

Closed wenluo711 closed 7 years ago

wenluo711 commented 7 years ago

Hi, I'm not sure whether it's a bug or something wrong with my environment that is causing the HS_LIBRARY_SIZE field to be empty in the output of picard CollectHsMetrics. My log file didn't show any error or warning messages.

Bug Report

Affected tool(s)

picard CollectHsMetrics

Affected version(s)

Description

[Fri Sep 08 16:22:09 EDT 2017] picard.analysis.directed.CollectHsMetrics BAIT_INTERVALS=[120430_HG19_ExomeV3_UTR_EZ_HX1_bait.interval_list] TARGET_INTERVALS=[120430_HG19_ExomeV3_UTR_EZ_HX1_capture.interval_list] INPUT=/CGF/Sequencing/Illumina/HiSeq/PostRun_Analysis/Data/New_Pipeline_Test/160113_K00278_0016_AH5LJGBBXX/BAM/SC036872_CAGATC_L001_HQ_paired_dedup_properly_paired_nophix.bam OUTPUT=test_hs_metrics.txt REFERENCE_SEQUENCE=/CGF/Resources/Data/genome/hg19_canonical_correct_chr_order.fa MINIMUM_MAPPING_QUALITY=20 MINIMUM_BASE_QUALITY=20 CLIP_OVERLAPPING_READS=true METRIC_ACCUMULATION_LEVEL=[ALL_READS] NEAR_DISTANCE=250 COVERAGE_CAP=200 SAMPLE_SIZE=10000 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json [Fri Sep 08 16:22:09 EDT 2017] Executing as luow2@cgemsIII on Linux 2.6.32-696.6.3.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_111-b14; Picard version: 2.8.2-SNAPSHOT INFO 2017-09-08 16:22:55 CollectHsMetrics Processed 1,000,000 records. Elapsed time: 00:00:34s. Time for last 1,000,000: 34s. Last read position: chr1:17,257,740 INFO 2017-09-08 16:23:29 CollectHsMetrics Processed 2,000,000 records. Elapsed time: 00:01:08s. Time for last 1,000,000: 33s. Last read position: chr1:35,857,957 INFO 2017-09-08 16:24:03 CollectHsMetrics Processed 3,000,000 records. Elapsed time: 00:01:42s. Time for last 1,000,000: 33s. Last read position: chr1:55,534,944 INFO 2017-09-08 16:24:34 CollectHsMetrics Processed 4,000,000 records. Elapsed time: 00:02:14s. Time for last 1,000,000: 31s. Last read position: chr1:109,639,402 INFO 2017-09-08 16:25:05 CollectHsMetrics Processed 5,000,000 records. Elapsed time: 00:02:45s. Time for last 1,000,000: 30s. Last read position: chr1:150,463,027 INFO 2017-09-08 16:25:40 CollectHsMetrics Processed 6,000,000 records. Elapsed time: 00:03:19s. Time for last 1,000,000: 34s. 
Last read position: chr1:160,578,614 INFO 2017-09-08 16:26:13 CollectHsMetrics Processed 7,000,000 records. Elapsed time: 00:03:53s. Time for last 1,000,000: 33s. Last read position: chr1:196,876,241 INFO 2017-09-08 16:26:47 CollectHsMetrics Processed 8,000,000 records. Elapsed time: 00:04:27s. Time for last 1,000,000: 33s. Last read position: chr1:224,044,642 INFO 2017-09-08 16:27:21 CollectHsMetrics Processed 9,000,000 records. Elapsed time: 00:05:00s. Time for last 1,000,000: 33s. Last read position: chr2:11,804,589 INFO 2017-09-08 16:27:54 CollectHsMetrics Processed 10,000,000 records. Elapsed time: 00:05:33s. Time for last 1,000,000: 33s. Last read position: chr2:50,733,665 INFO 2017-09-08 16:28:26 CollectHsMetrics Processed 11,000,000 records. Elapsed time: 00:06:05s. Time for last 1,000,000: 32s. Last read position: chr2:91,767,755 INFO 2017-09-08 16:28:57 CollectHsMetrics Processed 12,000,000 records. Elapsed time: 00:06:36s. Time for last 1,000,000: 30s. Last read position: chr2:128,757,698 INFO 2017-09-08 16:29:28 CollectHsMetrics Processed 13,000,000 records. Elapsed time: 00:07:08s. Time for last 1,000,000: 31s. Last read position: chr2:176,043,108 INFO 2017-09-08 16:30:02 CollectHsMetrics Processed 14,000,000 records. Elapsed time: 00:07:42s. Time for last 1,000,000: 33s. Last read position: chr2:210,707,144 INFO 2017-09-08 16:30:36 CollectHsMetrics Processed 15,000,000 records. Elapsed time: 00:08:15s. Time for last 1,000,000: 33s. Last read position: chr2:242,034,334 INFO 2017-09-08 16:31:08 CollectHsMetrics Processed 16,000,000 records. Elapsed time: 00:08:48s. Time for last 1,000,000: 32s. Last read position: chr3:39,166,808 INFO 2017-09-08 16:31:44 CollectHsMetrics Processed 17,000,000 records. Elapsed time: 00:09:23s. Time for last 1,000,000: 35s. Last read position: chr3:53,263,213 INFO 2017-09-08 16:32:16 CollectHsMetrics Processed 18,000,000 records. Elapsed time: 00:09:55s. Time for last 1,000,000: 32s. 
Last read position: chr3:119,308,695 INFO 2017-09-08 16:32:48 CollectHsMetrics Processed 19,000,000 records. Elapsed time: 00:10:28s. Time for last 1,000,000: 32s. Last read position: chr3:149,488,441 INFO 2017-09-08 16:33:22 CollectHsMetrics Processed 20,000,000 records. Elapsed time: 00:11:01s. Time for last 1,000,000: 33s. Last read position: chr3:195,412,749 INFO 2017-09-08 16:33:55 CollectHsMetrics Processed 21,000,000 records. Elapsed time: 00:11:35s. Time for last 1,000,000: 33s. Last read position: chr4:40,124,626 INFO 2017-09-08 16:34:27 CollectHsMetrics Processed 22,000,000 records. Elapsed time: 00:12:06s. Time for last 1,000,000: 31s. Last read position: chr4:91,646,072 INFO 2017-09-08 16:34:59 CollectHsMetrics Processed 23,000,000 records. Elapsed time: 00:12:38s. Time for last 1,000,000: 32s. Last read position: chr4:155,533,149 INFO 2017-09-08 16:35:31 CollectHsMetrics Processed 24,000,000 records. Elapsed time: 00:13:10s. Time for last 1,000,000: 31s. Last read position: chr5:35,728,076 INFO 2017-09-08 16:36:05 CollectHsMetrics Processed 25,000,000 records. Elapsed time: 00:13:45s. Time for last 1,000,000: 34s. Last read position: chr5:86,697,424 INFO 2017-09-08 16:36:38 CollectHsMetrics Processed 26,000,000 records. Elapsed time: 00:14:17s. Time for last 1,000,000: 32s. Last read position: chr5:139,864,171 INFO 2017-09-08 16:37:13 CollectHsMetrics Processed 27,000,000 records. Elapsed time: 00:14:53s. Time for last 1,000,000: 35s. Last read position: chr5:171,722,951 INFO 2017-09-08 16:37:47 CollectHsMetrics Processed 28,000,000 records. Elapsed time: 00:15:26s. Time for last 1,000,000: 33s. Last read position: chr6:26,184,009 INFO 2017-09-08 16:38:03 CollectHsMetrics Processed 29,000,000 records. Elapsed time: 00:15:42s. Time for last 1,000,000: 16s. Last read position: chr6:30,920,740 INFO 2017-09-08 16:38:17 CollectHsMetrics Processed 30,000,000 records. Elapsed time: 00:15:56s. Time for last 1,000,000: 13s. 
Last read position: chr6:32,797,886 INFO 2017-09-08 16:38:46 CollectHsMetrics Processed 31,000,000 records. Elapsed time: 00:16:25s. Time for last 1,000,000: 29s. Last read position: chr6:43,603,639 INFO 2017-09-08 16:39:13 CollectHsMetrics Processed 32,000,000 records. Elapsed time: 00:16:52s. Time for last 1,000,000: 26s. Last read position: chr6:107,114,190 INFO 2017-09-08 16:39:44 CollectHsMetrics Processed 33,000,000 records. Elapsed time: 00:17:23s. Time for last 1,000,000: 31s. Last read position: chr6:157,405,586 INFO 2017-09-08 16:40:05 CollectHsMetrics Processed 34,000,000 records. Elapsed time: 00:17:44s. Time for last 1,000,000: 20s. Last read position: chr7:29,523,587 INFO 2017-09-08 16:40:22 CollectHsMetrics Processed 35,000,000 records. Elapsed time: 00:18:01s. Time for last 1,000,000: 16s. Last read position: chr7:74,910,313 INFO 2017-09-08 16:40:49 CollectHsMetrics Processed 36,000,000 records. Elapsed time: 00:18:29s. Time for last 1,000,000: 27s. Last read position: chr7:106,271,146 INFO 2017-09-08 16:41:23 CollectHsMetrics Processed 37,000,000 records. Elapsed time: 00:19:02s. Time for last 1,000,000: 33s. Last read position: chr7:148,800,918 INFO 2017-09-08 16:41:56 CollectHsMetrics Processed 38,000,000 records. Elapsed time: 00:19:35s. Time for last 1,000,000: 33s. Last read position: chr8:27,529,661 INFO 2017-09-08 16:42:28 CollectHsMetrics Processed 39,000,000 records. Elapsed time: 00:20:08s. Time for last 1,000,000: 32s. Last read position: chr8:87,392,600 INFO 2017-09-08 16:43:01 CollectHsMetrics Processed 40,000,000 records. Elapsed time: 00:20:40s. Time for last 1,000,000: 32s. Last read position: chr8:145,255,733 INFO 2017-09-08 16:43:34 CollectHsMetrics Processed 41,000,000 records. Elapsed time: 00:21:13s. Time for last 1,000,000: 33s. Last read position: chr9:40,773,122 INFO 2017-09-08 16:44:04 CollectHsMetrics Processed 42,000,000 records. Elapsed time: 00:21:44s. Time for last 1,000,000: 30s. 
Last read position: chr9:108,155,896 INFO 2017-09-08 16:44:35 CollectHsMetrics Processed 43,000,000 records. Elapsed time: 00:22:14s. Time for last 1,000,000: 30s. Last read position: chr9:134,385,364 INFO 2017-09-08 16:44:52 CollectHsMetrics Processed 44,000,000 records. Elapsed time: 00:22:32s. Time for last 1,000,000: 17s. Last read position: chr10:28,730,210 INFO 2017-09-08 16:45:14 CollectHsMetrics Processed 45,000,000 records. Elapsed time: 00:22:53s. Time for last 1,000,000: 21s. Last read position: chr10:75,415,796 INFO 2017-09-08 16:45:47 CollectHsMetrics Processed 46,000,000 records. Elapsed time: 00:23:27s. Time for last 1,000,000: 33s. Last read position: chr10:105,147,977 INFO 2017-09-08 16:46:20 CollectHsMetrics Processed 47,000,000 records. Elapsed time: 00:24:00s. Time for last 1,000,000: 33s. Last read position: chr11:3,724,422 INFO 2017-09-08 16:46:55 CollectHsMetrics Processed 48,000,000 records. Elapsed time: 00:24:34s. Time for last 1,000,000: 34s. Last read position: chr11:43,284,354 INFO 2017-09-08 16:47:27 CollectHsMetrics Processed 49,000,000 records. Elapsed time: 00:25:07s. Time for last 1,000,000: 32s. Last read position: chr11:64,821,002 INFO 2017-09-08 16:48:01 CollectHsMetrics Processed 50,000,000 records. Elapsed time: 00:25:40s. Time for last 1,000,000: 33s. Last read position: chr11:89,939,387 INFO 2017-09-08 16:48:24 CollectHsMetrics Processed 51,000,000 records. Elapsed time: 00:26:04s. Time for last 1,000,000: 23s. Last read position: chr11:124,029,258 INFO 2017-09-08 16:48:40 CollectHsMetrics Processed 52,000,000 records. Elapsed time: 00:26:19s. Time for last 1,000,000: 15s. Last read position: chr12:13,061,606 INFO 2017-09-08 16:48:56 CollectHsMetrics Processed 53,000,000 records. Elapsed time: 00:26:35s. Time for last 1,000,000: 15s. Last read position: chr12:52,201,346 INFO 2017-09-08 16:49:15 CollectHsMetrics Processed 54,000,000 records. Elapsed time: 00:26:54s. Time for last 1,000,000: 19s. 
Last read position: chr12:75,784,976 INFO 2017-09-08 16:49:48 CollectHsMetrics Processed 55,000,000 records. Elapsed time: 00:27:27s. Time for last 1,000,000: 32s. Last read position: chr12:115,118,632 INFO 2017-09-08 16:50:21 CollectHsMetrics Processed 56,000,000 records. Elapsed time: 00:28:00s. Time for last 1,000,000: 33s. Last read position: chr13:30,778,036 INFO 2017-09-08 16:50:53 CollectHsMetrics Processed 57,000,000 records. Elapsed time: 00:28:32s. Time for last 1,000,000: 31s. Last read position: chr13:111,367,630 INFO 2017-09-08 16:51:25 CollectHsMetrics Processed 58,000,000 records. Elapsed time: 00:29:05s. Time for last 1,000,000: 32s. Last read position: chr14:51,219,552 INFO 2017-09-08 16:51:54 CollectHsMetrics Processed 59,000,000 records. Elapsed time: 00:29:33s. Time for last 1,000,000: 28s. Last read position: chr14:81,968,633 INFO 2017-09-08 16:52:25 CollectHsMetrics Processed 60,000,000 records. Elapsed time: 00:30:05s. Time for last 1,000,000: 31s. Last read position: chr15:28,991,217 INFO 2017-09-08 16:52:57 CollectHsMetrics Processed 61,000,000 records. Elapsed time: 00:30:37s. Time for last 1,000,000: 31s. Last read position: chr15:55,905,405 INFO 2017-09-08 16:53:30 CollectHsMetrics Processed 62,000,000 records. Elapsed time: 00:31:09s. Time for last 1,000,000: 32s. Last read position: chr15:83,438,720 INFO 2017-09-08 16:54:03 CollectHsMetrics Processed 63,000,000 records. Elapsed time: 00:31:42s. Time for last 1,000,000: 33s. Last read position: chr16:4,745,478 INFO 2017-09-08 16:54:35 CollectHsMetrics Processed 64,000,000 records. Elapsed time: 00:32:14s. Time for last 1,000,000: 31s. Last read position: chr16:30,775,620 INFO 2017-09-08 16:55:07 CollectHsMetrics Processed 65,000,000 records. Elapsed time: 00:32:46s. Time for last 1,000,000: 32s. Last read position: chr16:70,316,423 INFO 2017-09-08 16:55:41 CollectHsMetrics Processed 66,000,000 records. Elapsed time: 00:33:20s. Time for last 1,000,000: 33s. 
Last read position: chr17:4,890,702 INFO 2017-09-08 16:56:14 CollectHsMetrics Processed 67,000,000 records. Elapsed time: 00:33:53s. Time for last 1,000,000: 33s. Last read position: chr17:20,801,474 INFO 2017-09-08 16:56:47 CollectHsMetrics Processed 68,000,000 records. Elapsed time: 00:34:26s. Time for last 1,000,000: 32s. Last read position: chr17:39,968,746 INFO 2017-09-08 16:57:21 CollectHsMetrics Processed 69,000,000 records. Elapsed time: 00:35:00s. Time for last 1,000,000: 34s. Last read position: chr17:58,094,379 INFO 2017-09-08 16:57:41 CollectHsMetrics Processed 70,000,000 records. Elapsed time: 00:35:20s. Time for last 1,000,000: 20s. Last read position: chr17:79,687,872 INFO 2017-09-08 16:58:05 CollectHsMetrics Processed 71,000,000 records. Elapsed time: 00:35:44s. Time for last 1,000,000: 23s. Last read position: chr18:54,265,731 INFO 2017-09-08 16:58:38 CollectHsMetrics Processed 72,000,000 records. Elapsed time: 00:36:18s. Time for last 1,000,000: 33s. Last read position: chr19:9,062,073 INFO 2017-09-08 16:59:12 CollectHsMetrics Processed 73,000,000 records. Elapsed time: 00:36:52s. Time for last 1,000,000: 33s. Last read position: chr19:23,544,002 INFO 2017-09-08 16:59:47 CollectHsMetrics Processed 74,000,000 records. Elapsed time: 00:37:26s. Time for last 1,000,000: 34s. Last read position: chr19:44,932,695 INFO 2017-09-08 17:00:21 CollectHsMetrics Processed 75,000,000 records. Elapsed time: 00:38:00s. Time for last 1,000,000: 34s. Last read position: chr19:55,697,512 INFO 2017-09-08 17:00:50 CollectHsMetrics Processed 76,000,000 records. Elapsed time: 00:38:29s. Time for last 1,000,000: 28s. Last read position: chr20:30,227,693 INFO 2017-09-08 17:01:18 CollectHsMetrics Processed 77,000,000 records. Elapsed time: 00:38:58s. Time for last 1,000,000: 28s. Last read position: chr20:56,493,908 INFO 2017-09-08 17:01:51 CollectHsMetrics Processed 78,000,000 records. Elapsed time: 00:39:30s. Time for last 1,000,000: 32s. 
Last read position: chr21:45,187,589 INFO 2017-09-08 17:02:22 CollectHsMetrics Processed 79,000,000 records. Elapsed time: 00:40:01s. Time for last 1,000,000: 31s. Last read position: chr22:31,011,689 INFO 2017-09-08 17:02:50 CollectHsMetrics Processed 80,000,000 records. Elapsed time: 00:40:30s. Time for last 1,000,000: 28s. Last read position: chrX:5,809,703 INFO 2017-09-08 17:03:11 CollectHsMetrics Processed 81,000,000 records. Elapsed time: 00:40:50s. Time for last 1,000,000: 20s. Last read position: chrX:107,420,153 INFO 2017-09-08 17:03:33 TheoreticalSensitivity Creating Roulette Wheel INFO 2017-09-08 17:03:33 TheoreticalSensitivity Calculating quality sums from quality sampler INFO 2017-09-08 17:03:33 TheoreticalSensitivity 0 sampling iterations completed INFO 2017-09-08 17:03:34 TheoreticalSensitivity 1000 sampling iterations completed INFO 2017-09-08 17:03:36 TheoreticalSensitivity 2000 sampling iterations completed INFO 2017-09-08 17:03:37 TheoreticalSensitivity 3000 sampling iterations completed INFO 2017-09-08 17:03:38 TheoreticalSensitivity 4000 sampling iterations completed INFO 2017-09-08 17:03:39 TheoreticalSensitivity 5000 sampling iterations completed INFO 2017-09-08 17:03:40 TheoreticalSensitivity 6000 sampling iterations completed INFO 2017-09-08 17:03:42 TheoreticalSensitivity 7000 sampling iterations completed INFO 2017-09-08 17:03:43 TheoreticalSensitivity 8000 sampling iterations completed INFO 2017-09-08 17:03:44 TheoreticalSensitivity 9000 sampling iterations completed INFO 2017-09-08 17:03:45 TheoreticalSensitivity Calculating theoretical het sensitivity INFO 2017-09-08 17:03:46 TargetMetricsCollector Calculating GC metrics [Fri Sep 08 17:03:47 EDT 2017] picard.analysis.directed.CollectHsMetrics done. Elapsed time: 41.63 minutes. Runtime.totalMemory()=4589092864

Steps to reproduce

$ which java
/DCEG/Resources/Tools/jdk/8u111/jdk1.8.0_111/bin/java
$ java -Xmx16g -jar /DCEG/Resources/Tools/Picard/Picard-2.10.10/picard.jar CollectHsMetrics \
    I=/CGF/Sequencing/Illumina/HiSeq/PostRun_Analysis/Data/New_Pipeline_Test/160113_K00278_0016_AH5LJGBBXX/BAM/SC036872_CAGATC_L001_HQ_paired_dedup_properly_paired_nophix.bam \
    O=test_hs_metrics.txt \
    BAIT_INTERVALS=120430_HG19_ExomeV3_UTR_EZ_HX1_bait.interval_list \
    TARGET_INTERVALS=120430_HG19_ExomeV3_UTR_EZ_HX1_capture.interval_list \
    R=/CGF/Resources/Data/genome/hg19_canonical_correct_chr_order.fa

Expected behavior

Fields such as HS_LIBRARY_SIZE and LIBRARY should be populated.

Actual behavior

Some fields, such as HS_LIBRARY_SIZE and LIBRARY, are empty:

FIELD                        VALUE
BAIT_SET                     120430_HG19_ExomeV3_UTR_EZ_HX1_bait
GENOME_SIZE                  3.1E+09
BAIT_TERRITORY               98425838
TARGET_TERRITORY             98425838
BAIT_DESIGN_EFFICIENCY       1
TOTAL_READS                  53047326
PF_READS                     53047326
PF_UNIQUE_READS              53047326
PCT_PF_READS                 1
PCT_PF_UQ_READS              1
PF_UQ_READS_ALIGNED          53047326
PCT_PF_UQ_READS_ALIGNED      1
PF_BASES_ALIGNED             7.54E+09
PF_UQ_BASES_ALIGNED          7.54E+09
ON_BAIT_BASES                5.05E+09
NEAR_BAIT_BASES              1.57E+09
OFF_BAIT_BASES               9.2E+08
ON_TARGET_BASES              4.17E+09
PCT_SELECTED_BASES           0.878063
PCT_OFF_BAIT                 0.121937
ON_BAIT_VS_SELECTED          0.762345
MEAN_BAIT_COVERAGE           51.28444
MEAN_TARGET_COVERAGE         42.37724
MEDIAN_TARGET_COVERAGE       37
MAX_TARGET_COVERAGE          774
PCT_USABLE_BASES_ON_BAIT     0.66311
PCT_USABLE_BASES_ON_TARGET   0.54794
FOLD_ENRICHMENT              21.05357
ZERO_CVG_TARGETS_PCT         0.013395
PCT_EXC_DUPE                 0
PCT_EXC_MAPQ                 0.001105
PCT_EXC_BASEQ                0.008042
PCT_EXC_OVERLAP              0.142204
PCT_EXC_OFF_TARGET           0.850092
FOLD_80_BASE_PENALTY         2.492779
PCT_TARGET_BASES_1X          0.984894
PCT_TARGET_BASES_2X          0.976174
PCT_TARGET_BASES_10X         0.892355
PCT_TARGET_BASES_20X         0.759401
PCT_TARGET_BASES_30X         0.607657
PCT_TARGET_BASES_40X         0.461282
PCT_TARGET_BASES_50X         0.336774
PCT_TARGET_BASES_100X        0.048275
HS_LIBRARY_SIZE              (empty)
HS_PENALTY_10X               0
HS_PENALTY_20X               0
HS_PENALTY_30X               0
HS_PENALTY_40X               0
HS_PENALTY_50X               0
HS_PENALTY_100X              0
AT_DROPOUT                   0.033959
GC_DROPOUT                   13.27583
HET_SNP_SENSITIVITY          0.967847
HET_SNP_Q                    15
SAMPLE                       (empty)
LIBRARY                      (empty)
READ_GROUP                   (empty)
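Empty columns are easy to miss in a row this wide. A minimal Python sketch for listing them, assuming the metrics file is tab-separated with '#'-prefixed comment lines before the header row (the function name `empty_metric_fields` and the sample text are hypothetical, not part of Picard):

```python
def empty_metric_fields(text):
    """Return the names of columns that are blank in the first data row."""
    # Keep only non-blank lines that are not '#' comments.
    rows = [ln for ln in text.splitlines() if ln.strip() and not ln.startswith("#")]
    header = rows[0].split("\t")
    values = rows[1].split("\t")
    # Pad the row in case trailing empty columns were trimmed by an editor.
    values += [""] * (len(header) - len(values))
    return [name for name, value in zip(header, values) if value.strip() == ""]


# Hypothetical, heavily shortened metrics text for illustration only.
sample = (
    "## METRICS CLASS\tpicard.analysis.directed.HsMetrics\n"
    "BAIT_SET\tHS_LIBRARY_SIZE\tHS_PENALTY_10X\tSAMPLE\n"
    "my_bait\t\t0\t\n"
)
print(empty_metric_fields(sample))
```

Splitting on the tab character rather than on arbitrary whitespace is the important choice here: it keeps blank fields in position instead of collapsing them.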

Thanks for helping, Wen

nh13 commented 7 years ago

@wenluo711 I would guess that you have no duplicates marked, is that correct? If so, the library size cannot be estimated. The SAMPLE, LIBRARY, and READ_GROUP fields are blank when the METRIC_ACCUMULATION_LEVEL option is not set (it defaults to ALL_READS, which uses all the reads), so that is expected.
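To illustrate why unmarked duplicates leave the estimate undefined, here is a minimal Python sketch of a Lander-Waterman style library size estimate. It is an assumption that this mirrors the general approach behind Picard's estimate; it is not Picard's code, and the function name `estimate_library_size` is hypothetical. It solves C/X = 1 - exp(-N/X) for the library size X, where N is the number of read pairs and C the number of unique read pairs. With no duplicates, C equals N and the equation has no finite solution, so the sketch returns None.

```python
import math


def estimate_library_size(read_pairs, unique_read_pairs):
    """Solve c/x = 1 - exp(-n/x) for x by bisection; None if undefined."""
    duplicates = read_pairs - unique_read_pairs
    # Without any observed duplicates there is nothing to estimate from.
    if read_pairs <= 0 or unique_read_pairs <= 0 or duplicates <= 0:
        return None

    n, c = float(read_pairs), float(unique_read_pairs)

    def f(x):
        # Positive when x is below the root, negative when above it.
        return c / x - 1.0 + math.exp(-n / x)

    lo, hi = 1.0, 100.0  # search bounds, expressed as multiples of c
    while f(hi * c) > 0:
        hi *= 10.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if f(mid * c) > 0:
            lo = mid
        else:
            hi = mid
    return int(c * (lo + hi) / 2.0)


print(estimate_library_size(1000, 1000))  # no duplicates, so no estimate
print(estimate_library_size(1000, 900))
```

The first call corresponds to the situation described above: every read pair is counted as unique, so no library size can be derived.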

wenluo711 commented 7 years ago

Thanks, nh13. I got a value for HS_LIBRARY_SIZE after running MarkDuplicates. Based on the descriptions of HS_LIBRARY_SIZE and ESTIMATED_LIBRARY_SIZE, the former seems to cover only the selected subset of the library described by the latter. Why is my HS_LIBRARY_SIZE larger than ESTIMATED_LIBRARY_SIZE?

Thanks, Wen

nh13 commented 7 years ago

@wenluo711 that's probably a better question for the forums: gatkforums.broadinstitute.org