faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

phyluce_assembly_get_trinity_coverage_for_uce_loci raise IOError("Picard metrics file for {} has more than two lines".format(sample)) #148

Closed KNx5 closed 3 years ago

KNx5 commented 5 years ago

Greetings. I've been muddling through the various GATK issues with getting coverage metrics, but I now get this error when the output of phyluce_assembly_get_trinity_coverage is fed into phyluce_assembly_get_trinity_coverage_for_uce_loci. I've included the input and error below, as well as the directory listing for this taxon and the Picard metrics output. Any ideas? Thanks!

2019-01-13 10:15:20,265 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Calculating coverage metrics for Albuquerquea_B4
Traceback (most recent call last):
  File "/home/feenerlab/miniconda2/bin/phyluce_assembly_get_trinity_coverage_for_uce_loci", line 331, in <module>
    main()
  File "/home/feenerlab/miniconda2/bin/phyluce_assembly_get_trinity_coverage_for_uce_loci", line 318, in main
    on_target_dict = picard.get_percent_reads_on_target(log, hs_metrics_file, organism)
  File "/home/feenerlab/miniconda2/lib/python2.7/site-packages/phyluce/picard.py", line 184, in get_percent_reads_on_target
    raise IOError("Picard metrics file for {} has more than two lines".format(sample))
IOError: Picard metrics file for Albuquerquea_B4 has more than two lines

Here is the output from the first script

[feenerlab@genassmblr Coverage]$ ll /home/feenerlab/katinkaX/DipteraUCEs/Trinity/Coverage/trinity_assemblies_lane1/Albuquerquea_B4/
total 114192
-rw-rw-r--. 1 feenerlab feenerlab     4355 Jan  2 11:29 Albuquerquea_B4.all.picard-MD-out.log
-rw-rw-r--. 1 feenerlab feenerlab     3199 Jan  2 11:29 Albuquerquea_B4.all.picard-metricsfile.txt
-rw-rw-r--. 1 feenerlab feenerlab 51170132 Jan  2 11:29 Albuquerquea_B4-CL-RG-M-MD.bam
-rw-rw-r--. 1 feenerlab feenerlab  1511912 Jan  2 11:29 Albuquerquea_B4-CL-RG-M-MD.bam.bai
-rw-rw-r--. 1 feenerlab feenerlab 24970841 Dec 19 12:32 Albuquerquea_B4-coverage.gz
-rw-rw-r--. 1 feenerlab feenerlab     1884 Jan  2 11:29 Albuquerquea_B4.GATK-coverage-out.log
-rw-rw-r--. 1 feenerlab feenerlab     1658 Jan  2 11:28 Albuquerquea_B4.pe.bwa-sampe-out.log
-rw-rw-r--. 1 feenerlab feenerlab     1715 Jan  2 11:28 Albuquerquea_B4.pe.picard-clean-out.log
-rw-rw-r--. 1 feenerlab feenerlab     2150 Jan  2 11:28 Albuquerquea_B4.pe.picard-RG-out.log
-rw-rw-r--. 1 feenerlab feenerlab      131 Jan  2 11:28 Albuquerquea_B4.pe.samtools-view-out.log
-rw-rw-r--. 1 feenerlab feenerlab     2349 Jan  2 11:28 Albuquerquea_B4.picard-merge-out.log
-rw-rw-r--. 1 feenerlab feenerlab     2443 Jan  2 11:27 Albuquerquea_B4.picard-reference-dict-out.log
-rw-rw-r--. 1 feenerlab feenerlab        0 Jan  2 11:27 Albuquerquea_B4.samtools-faidx-out.log
-rw-rw-r--. 1 feenerlab feenerlab        0 Jan  2 11:29 Albuquerquea_B4.samtools-index-out.log
-rw-rw-r--. 1 feenerlab feenerlab      539 Jan  2 11:28 Albuquerquea_B4.se.bwa-sampe-out.log
-rw-rw-r--. 1 feenerlab feenerlab     1727 Jan  2 11:28 Albuquerquea_B4.se.picard-clean-out.log
-rw-rw-r--. 1 feenerlab feenerlab     2161 Jan  2 11:28 Albuquerquea_B4.se.picard-RG-out.log
-rw-rw-r--. 1 feenerlab feenerlab       50 Jan  2 11:28 Albuquerquea_B4.se.samtools-view-out.log
-rw-rw-r--. 1 feenerlab feenerlab   567241 Dec 19 12:31 Albuquerquea_B4-UNTRIMMED-per-contig-coverage.txt
-rw-rw-r--. 1 feenerlab feenerlab     7856 Jan  2 11:27 bwa-index-file.log
-rw-rw-r--. 1 feenerlab feenerlab  4056721 Dec 17 16:57 contigs.dict
-rwxrwxr-x. 1 feenerlab feenerlab 13638611 Dec 11 12:38 contigs.fasta
-rw-rw-r--. 1 feenerlab feenerlab       16 Jan  2 11:27 contigs.fasta.amb
-rw-rw-r--. 1 feenerlab feenerlab  5353072 Jan  2 11:27 contigs.fasta.ann
-rw-rw-r--. 1 feenerlab feenerlab  8463728 Jan  2 11:27 contigs.fasta.bwt
-rw-rw-r--. 1 feenerlab feenerlab   760192 Jan  2 11:27 contigs.fasta.fai
-rw-rw-r--. 1 feenerlab feenerlab  2115908 Jan  2 11:27 contigs.fasta.pac
-rw-rw-r--. 1 feenerlab feenerlab  4231864 Jan  2 11:27 contigs.fasta.sa

And within Albuquerquea_B4.all.picard-metricsfile.txt

[feenerlab@genassmblr Coverage]$ more /home/feenerlab/katinkaX/DipteraUCEs/Trinity/Coverage/trinity_assemblies_lane1/Albuquerquea_B4/Albuquerquea_B4.all.picard-metricsfile.txt 
## htsjdk.samtools.metrics.StringHeader
# MarkDuplicates MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=250 INPUT=[/home/feenerlab/katinkaX/DipteraUCEs/Trinity/Coverage/trinity_assemblies_lane1/Albuquerquea_B4/Albuquerquea_B4-CL-RG-M
.bam] OUTPUT=/home/feenerlab/katinkaX/DipteraUCEs/Trinity/Coverage/trinity_assemblies_lane1/Albuquerquea_B4/Albuquerquea_B4-CL-RG-M-MD.bam METRICS_FILE=/home/feenerlab/katinkaX/Dipt
eraUCEs/Trinity/Coverage/trinity_assemblies_lane1/Albuquerquea_B4/Albuquerquea_B4.all.picard-metricsfile.txt REMOVE_DUPLICATES=false ASSUME_SORTED=true VALIDATION_STRINGENCY=SILENT 
   MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 SORTING_COLLECTION_SIZE_RATIO=0.25 TAG_DUPLICATE_SET_MEMBERS=false REMOVE_SEQUENCING_DUPLICATES=false TAGGING_POLICY=DontTag CLEAR_DT=t
rue DUPLEX_UMI=false ADD_PG_TAG_TO_READS=true DUPLICATE_SCORING_STRATEGY=SUM_OF_BASE_QUALITIES PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates READ_NAME_REGEX=<op
timized capture of last three ':' separated fields as numeric values> OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 MAX_OPTICAL_DUPLICATE_SET_SIZE=300000 VERBOSITY=INFO QUIET=false COMPRESSI
ON_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
## htsjdk.samtools.metrics.StringHeader
# Started on: Wed Jan 02 11:28:54 MST 2019

## METRICS CLASS    picard.sam.DuplicationMetrics
LIBRARY UNPAIRED_READS_EXAMINED READ_PAIRS_EXAMINED SECONDARY_OR_SUPPLEMENTARY_RDS  UNMAPPED_READS  UNPAIRED_READ_DUPLICATES    READ_PAIR_DUPLICATES    READ_PAIR_OPTICAL_DUP
LICATES PERCENT_DUPLICATION ESTIMATED_LIBRARY_SIZE
Lib1    29347   166387  22782   128685  8033    1408    1   0.02996 9782519

## HISTOGRAM    java.lang.Double
BIN VALUE
1.0 1.000006
2.0 1.983147
3.0 2.949708
4.0 3.899967
5.0 4.834201
6.0 5.75268

KNx5 commented 5 years ago

After digging into the picard.py script, I realize that the file it is erroring on is Anisopodidae_A5.reads-on-target.txt, which is generated by phyluce_assembly_get_trinity_coverage_for_uce_loci. Here's what that looks like:

lane1_coverageXlocus_output]$ more Anisopodidae_A5.reads-on-target.txt 
## htsjdk.samtools.metrics.StringHeader
# CollectHsMetrics BAIT_INTERVALS=[/home/feenerlab/katinkaX/DipteraUCEs/Trinity/Coverage/lane1_coverageXlocus_output/Anisopodidae_A5-UCE-matches-interval.list] TARGET_INTERVALS=[/ho
me/feenerlab/katinkaX/DipteraUCEs/Trinity/Coverage/lane1_coverageXlocus_output/Anisopodidae_A5-UCE-matches-interval.list] INPUT=/home/feenerlab/katinkaX/DipteraUCEs/Trinity/Coverage
/trinity_assemblies_lane1/Anisopodidae_A5/Anisopodidae_A5-CL-RG-M-MD.bam OUTPUT=/home/feenerlab/katinkaX/DipteraUCEs/Trinity/Coverage/lane1_coverageXlocus_output/Anisopodidae_A5.rea
ds-on-target.txt VALIDATION_STRINGENCY=LENIENT REFERENCE_SEQUENCE=/home/feenerlab/katinkaX/DipteraUCEs/Trinity/Coverage/trinity_assemblies_lane1/Anisopodidae_A5/contigs.fasta    MET
RIC_ACCUMULATION_LEVEL=[ALL_READS] NEAR_DISTANCE=250 MINIMUM_MAPPING_QUALITY=20 MINIMUM_BASE_QUALITY=20 CLIP_OVERLAPPING_READS=true COVERAGE_CAP=200 SAMPLE_SIZE=10000 ALLELE_FRACTIO
N=[0.001, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.3, 0.5] VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT
_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
## htsjdk.samtools.metrics.StringHeader
# Started on: Mon Jan 14 09:02:11 MST 2019

## METRICS CLASS    picard.analysis.directed.HsMetrics
BAIT_SET    GENOME_SIZE BAIT_TERRITORY  TARGET_TERRITORY    BAIT_DESIGN_EFFICIENCY  TOTAL_READS PF_READS    PF_UNIQUE_READS PCT_PF_READS    PCT_PF_UQ_READS PF_UQ
_READS_ALIGNED  PCT_PF_UQ_READS_ALIGNED PF_BASES_ALIGNED    PF_UQ_BASES_ALIGNED ON_BAIT_BASES   NEAR_BAIT_BASES OFF_BAIT_BASES  ON_TARGET_BASES PCT_SELECTED_BASES  PCT_O
FF_BAIT ON_BAIT_VS_SELECTED MEAN_BAIT_COVERAGE  MEAN_TARGET_COVERAGE    MEDIAN_TARGET_COVERAGE  MAX_TARGET_COVERAGE PCT_USABLE_BASES_ON_BAIT    PCT_USABLE_BASES_ON_T
ARGET   FOLD_ENRICHMENT ZERO_CVG_TARGETS_PCT    PCT_EXC_DUPE    PCT_EXC_MAPQ    PCT_EXC_BASEQ   PCT_EXC_OVERLAP PCT_EXC_OFF_TARGET  FOLD_80_BASE_PENALTY    PCT_TARGET_BASES_1X PCT_TARGET_BASES_2X PCT_TARGET_BASES_10X    PCT_TARGET_BASES_20X    PCT_TARGET_BASES_30X    PCT_TARGET_BASES_40X    PCT_TARGET_BASES_50X    PCT_TARGET_BASES_100X   HS_LIBRARY_SI
ZE  HS_PENALTY_10X  HS_PENALTY_20X  HS_PENALTY_30X  HS_PENALTY_40X  HS_PENALTY_50X  HS_PENALTY_100X AT_DROPOUT  GC_DROPOUT  HET_SNP_SENSITIVITY HET_SNP_Q   SAMPL
E   LIBRARY READ_GROUP
Anisopodidae_A5-UCE-matches-interval    18490144    801063  801063  1   3065018 3065018 2903876 1   0.947425    1552485 0.534625    195891302   177127730   21056453    0   174834849   16781663    0.10749 0.89251 1   26.285639   20.949242   8   662 0.055723    0.04441 2.481097    0.000929    0.095786    0.26605 0.00521 0.05787 0.579549    5.237311    0.996935    0.983925    0.455064    0.260751    0.184424    0.138436    0.106184    0.037855    352253  79.743427   -1  -1  -1  -1  -1  8.004739    4.036515    0.872487    9           

## HISTOGRAM    java.lang.Integer
coverage_or_base_quality    high_quality_coverage_count unfiltered_baseq_count
0   2455    0
1   10422   0
2   48365   0
3   70045   0
4   73957   0
5   67324   0
6   54786   0
7   43036   0
8   35286   0
9   30852   0
10  26443   0
11  23004   0
12  19152   0
...

It looks like the Python script expects only the metrics section and not the histogram section.

Have you come across this before? Is it a GATK/Picard version issue?

Thanks!

KNx5 commented 5 years ago

I think I figured out a workaround. I edited the picard.py script to ignore the "HISTOGRAM" section. Here is the changed function:

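def get_percent_reads_on_target(log, hs_metrics_file, sample):
    lines = []
    # "rU" matches the Python 2.7 install in the traceback; the "U" flag
    # was removed in Python 3.11, so use "r" there.
    with open(hs_metrics_file, "rU") as metrics:
        for line in metrics:
            # Stop at the histogram so only the metrics table is read.
            if "HISTOGRAM" in line:
                break
            # Skip "#" header lines and blank lines.
            elif line.strip().startswith("#") or line.strip() == "":
                pass
            else:
                lines.append(line.strip("\n"))
    # Expect exactly one header row and one value row.
    if len(lines) != 2:
        raise IOError("Picard metrics file for {} has more than two lines".format(sample))
    return dict(zip(lines[0].split("\t"), lines[1].split("\t")))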
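
The same idea can be sketched as a standalone helper that anchors on the table marker instead of on the histogram. This is only an illustration, not phyluce's code: the name `read_picard_metrics_table` is hypothetical, the example input is made up, and the sketch assumes (based on the reads-on-target.txt file shown earlier) that the table starts on the line after `## METRICS CLASS` and ends at the first blank line. It is written for Python 3.

```python
import io


def read_picard_metrics_table(handle):
    """Return {column: value} for the first row of a Picard metrics table.

    Hypothetical helper (not part of phyluce). Assumes the table begins on
    the line after "## METRICS CLASS" and ends at the first blank line, so
    the "## HISTOGRAM" block that follows is never read.
    """
    rows = []
    in_table = False
    for line in handle:
        if line.startswith("## METRICS CLASS"):
            in_table = True
            continue
        if not in_table:
            continue  # still in the "#" header lines
        if line.strip() == "":
            break  # a blank line closes the metrics table
        rows.append(line.rstrip("\n").split("\t"))
    if len(rows) < 2:
        raise IOError("no metrics table found")
    header, values = rows[0], rows[1]
    return dict(zip(header, values))


# Tiny made-up file in the same layout as the Picard output above.
example = io.StringIO(
    "## htsjdk.samtools.metrics.StringHeader\n"
    "# Started on: ...\n"
    "\n"
    "## METRICS CLASS\tpicard.analysis.directed.HsMetrics\n"
    "BAIT_SET\tTOTAL_READS\n"
    "demo\t100\n"
    "\n"
    "## HISTOGRAM\tjava.lang.Integer\n"
    "coverage_or_base_quality\thigh_quality_coverage_count\n"
    "0\t5\n"
)
metrics = read_picard_metrics_table(example)
print(metrics["TOTAL_READS"])
```

Anchoring on the table marker means extra sections after the table do not matter, at the cost of depending on that marker being present in every Picard version in use.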

brantfaircloth commented 5 years ago

I haven't had time to delve into this (as you may have guessed) yet. I know that Michael ran into what sounds like a similar issue in #142.

brantfaircloth commented 4 years ago

hi @quattrinia: did you figure out the issues - I see a comment from you (in my email) but don't find it here...

quattrinia commented 4 years ago

Sorry. I thought that I might have screwed up the symlinks to my files, so I deleted my comment, but that is not the problem. It is strange: this script works on some of my contigs but not others, and I cannot figure it out. Also, I'm wondering whether this output file, ALC01-UCE-per-contig-coverage.txt, can be used to calculate locus coverage.

phyluce_assembly_get_trinity_coverage_for_uce_loci --assemblies ./cov --match-count-output ../phyluceTutorial1/taxon-sets/all/all-taxa-incomplete.conf --locus-db ../phyluceTutorial1/uce-search-results/probe.matches.sqlite --output uce-coverage-test --type untrimmed
[WARNING] Output directory exists, REMOVE [Y/n]? Y
2019-11-14 17:16:47,689 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - == Starting phyluce_assembly_get_trinity_coverage_for_uce_loci ==
2019-11-14 17:16:47,689 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Version: 1.5.0
2019-11-14 17:16:47,689 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Argument --assemblies: /data/mcfadden/kerickson/spades-assemblies/cov
2019-11-14 17:16:47,690 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Argument --exclude: None
2019-11-14 17:16:47,690 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Argument --locus_db: /data/mcfadden/kerickson/phyluceTutorial1/uce-search-results/probe.matches.sqlite
2019-11-14 17:16:47,690 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Argument --log_path: None
2019-11-14 17:16:47,690 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Argument --match_count_output: /data/mcfadden/kerickson/phyluceTutorial1/taxon-sets/all/all-taxa-incomplete.conf
2019-11-14 17:16:47,690 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Argument --output: /data/mcfadden/kerickson/spades-assemblies/uce-coverage-test
2019-11-14 17:16:47,690 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Argument --resume: None
2019-11-14 17:16:47,690 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Argument --type: untrimmed
2019-11-14 17:16:47,691 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Argument --verbosity: INFO
2019-11-14 17:16:47,691 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Creating the output directory
2019-11-14 17:16:47,691 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Fetching input filenames
2019-11-14 17:16:47,692 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Fetching loci from all-taxa-incomplete.conf
2019-11-14 17:16:47,745 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - ------------------------ Processing ALC01 -----------------------
2019-11-14 17:16:47,745 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Fetching contig names from from probe.matches.sqlite
2019-11-14 17:16:47,773 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Generating (untrimmed) per-base coverage file for UCE loci.
2019-11-14 17:20:07,993 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Generating per-contig coverage file and interval list for UCE loci.
2019-11-14 17:20:08,228 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Calculating coverage metrics for ALC01
2019-11-14 17:20:46,577 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - 1951 contigs, mean trimmed length = 1228.1, mean trimmed coverage = 52.7x, on-target bases (uce contigs) = 42.3%, unique reads aligned (all contigs) = 53.0%
2019-11-14 17:20:46,578 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - ------------------------ Processing ALC10 -----------------------
2019-11-14 17:20:46,578 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Fetching contig names from from probe.matches.sqlite
2019-11-14 17:20:46,616 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Generating (untrimmed) per-base coverage file for UCE loci.
2019-11-14 17:23:52,312 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Generating per-contig coverage file and interval list for UCE loci.
2019-11-14 17:23:52,509 - phyluce_assembly_get_trinity_coverage_for_uce_loci - INFO - Calculating coverage metrics for ALC10
Traceback (most recent call last):
  File "/data/mcfadden/aquattrini/PROGRAMS/anaconda2/bin/phyluce_assembly_get_trinity_coverage_for_uce_loci", line 311, in <module>
    main()
  File "/data/mcfadden/aquattrini/PROGRAMS/anaconda2/bin/phyluce_assembly_get_trinity_coverage_for_uce_loci", line 298, in main
    on_target_dict = picard.get_percent_reads_on_target(log, hs_metrics_file, organism)
  File "/data/mcfadden/aquattrini/PROGRAMS/anaconda2/lib/python2.7/site-packages/phyluce/picard.py", line 193, in get_percent_reads_on_target
    with open(hs_metrics_file, "rU") as metrics:
IOError: [Errno 2] No such file or directory: '/data/mcfadden/kerickson/spades-assemblies/uce-coverage-test/ALC10.reads-on-target.txt'

Here is the picard file for the failed specimen

[Thu Nov 14 17:23:52 PST 2019] net.sf.picard.analysis.directed.CalculateHsMetrics BAIT_INTERVALS=/data/mcfadden/kerickson/spades-assemblies/uce-coverage-test/ALC10-UCE-matches-interval.list TARGET_INTERVALS=/data/mcfadden/kerickson/spades-assemblies/uce-coverage-test/ALC10-UCE-matches-interval.list INPUT=/data/mcfadden/kerickson/spades-assemblies/cov/ALC10/ALC10-CL-RG-M-MD.bam OUTPUT=/data/mcfadden/kerickson/spades-assemblies/uce-coverage-test/ALC10.reads-on-target.txt REFERENCE_SEQUENCE=/data/mcfadden/kerickson/spades-assemblies/cov/ALC10/contigs.fasta METRIC_ACCUMULATION_LEVEL=[ALL_READS] VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Thu Nov 14 17:23:52 PST 2019] Executing as smoaleman@purves on Linux 4.4.0-165-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_02-b13; Picard version: 1.106(1655) JdkDeflater
[Thu Nov 14 17:24:02 PST 2019] net.sf.picard.analysis.directed.CalculateHsMetrics done. Elapsed time: 0.16 minutes.
Runtime.totalMemory()=4378001408
To get help, see http://picard.sourceforge.net/index.shtml#GettingHelp
Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Record 1, Read name A00572:21:H3WK7DRXX:2:1149:10737:8218, Alignment start should != 0 because reference name != *.
    at net.sf.samtools.SAMUtils.processValidationErrors(SAMUtils.java:448)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:621)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.<init>(BAMFileReader.java:594)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.<init>(BAMFileReader.java:582)
    at net.sf.samtools.BAMFileReader.getIterator(BAMFileReader.java:294)
    at net.sf.samtools.SAMFileReader.iterator(SAMFileReader.java:336)
    at net.sf.picard.analysis.directed.CollectTargetedMetrics.doWork(CollectTargetedMetrics.java:118)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:179)
    at net.sf.picard.analysis.directed.CalculateHsMetrics.main(CalculateHsMetrics.java:74)

brantfaircloth commented 4 years ago

I’ll take a look... I found something yesterday but don’t have the link handy on my phone. Could you also dig out the SAM/BAM causing problems and run Picard’s ValidateSam against it? That may help us diagnose the issue.

brantfaircloth commented 4 years ago

Two links to check (I'm at work now):

http://seqanswers.com/forums/showthread.php?t=39357
https://gatkforums.broadinstitute.org/gatk/discussion/7618/picard-alignment-start-should-0-because-reference-name

let's see what ValidateSamFile has to say...
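
For anyone scripting this check across many samples, here is a rough sketch, not something taken from phyluce. The helper names are hypothetical, `picard.jar` is assumed to sit in the working directory, and `MODE=SUMMARY` is assumed to be supported by the installed Picard version (it reports counts per error type instead of listing individual records).

```python
import subprocess


def validate_sam_command(bam_path, picard_jar="picard.jar", mode="SUMMARY"):
    """Build the ValidateSamFile command line as a list of arguments.

    Hypothetical helper; the jar location and the choice of mode are
    assumptions, not values taken from phyluce.
    """
    return [
        "java", "-jar", picard_jar,
        "ValidateSamFile",
        "I={}".format(bam_path),
        "MODE={}".format(mode),
    ]


def run_validate_sam(bam_path, **kwargs):
    """Run ValidateSamFile and return (exit code, combined output text)."""
    cmd = validate_sam_command(bam_path, **kwargs)
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


# Only print the command here; calling run_validate_sam() needs Java and Picard.
print(" ".join(validate_sam_command("sample-CL-RG-M-MD.bam")))
```

Passing `mode="VERBOSE"` to the same helper gives per-record messages once the summary shows which error types are present.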

quattrinia commented 4 years ago

Validate Sam File Results

java -jar picard.jar ValidateSamFile I=/data/mcfadden/kerickson/spades-assemblies/ALC10_spades/ALC10-CL-RG-M-MD.bam
[Fri Nov 15 09:25:30 PST 2019] picard.sam.ValidateSamFile INPUT=/data/mcfadden/kerickson/spades-assemblies/ALC10_spades/ALC10-CL-RG-M-MD.bam MODE=VERBOSE MAX_OUTPUT=100 IGNORE_WARNINGS=false VALIDATE_INDEX=true INDEX_VALIDATION_STRINGENCY=EXHAUSTIVE IS_BISULFITE_SEQUENCED=false MAX_OPEN_TEMP_FILES=8000 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Fri Nov 15 09:25:30 PST 2019] Executing as aquattrini@purves on Linux 4.4.0-165-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_65-b17; Picard version: 2.1.1(6a5237c0f295ddce209ee3a3a5b83a3779408b1b_1457101272) IntelDeflater
ERROR: Record 1, Read name A00572:21:H3WK7DRXX:2:1149:10737:8218, Alignment start should != 0 because reference name != *.
ERROR: Record 2, Read name A00572:21:H3WK7DRXX:2:1164:26576:28479, Mate Alignment start should != 0 because reference name != *.
ERROR: Record 2, Read name A00572:21:H3WK7DRXX:2:1164:26576:28479, Alignment start should != 0 because reference name != *.
ERROR: Record 3, Read name A00572:21:H3WK7DRXX:2:1164:26576:28479, Mate Alignment start should != 0 because reference name != *.
ERROR: Record 3, Read name A00572:21:H3WK7DRXX:2:1164:26576:28479, Alignment start should != 0 because reference name != *.
ERROR: Record 4, Read name A00572:21:H3WK7DRXX:2:1274:1922:33004, Mate Alignment start should != 0 because reference name != *.
ERROR: Record 4, Read name A00572:21:H3WK7DRXX:2:1274:1922:33004, Alignment start should != 0 because reference name != *.
ERROR: Record 5, Read name A00572:21:H3WK7DRXX:2:1274:1922:33004, Mate Alignment start should != 0 because reference name != *.
ERROR: Record 5, Read name A00572:21:H3WK7DRXX:2:1274:1922:33004, Alignment start should != 0 because reference name != *.
ERROR: Record 6, Read name A00572:21:H3WK7DRXX:2:1160:30490:3004, Mate Alignment start should != 0 because reference name != *.
ERROR: Record 6, Read name A00572:21:H3WK7DRXX:2:1160:30490:3004, Alignment start should != 0 because reference name != *.
ERROR: Record 7, Read name A00572:21:H3WK7DRXX:2:1160:30490:3004, Mate Alignment start should != 0 because reference name != *.
ERROR: Record 7, Read name A00572:21:H3WK7DRXX:2:1160:30490:3004, Alignment start should != 0 because reference name != *.
ERROR: Record 7, Read name A00572:21:H3WK7DRXX:2:1160:30490:3004, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 8, Read name A00572:21:H3WK7DRXX:2:1255:9796:12320, Mate Alignment start should != 0 because reference name != *.
ERROR: Record 8, Read name A00572:21:H3WK7DRXX:2:1255:9796:12320, Alignment start should != 0 because reference name != *.
ERROR: Record 9, Read name A00572:21:H3WK7DRXX:2:1255:9796:12320, Mate Alignment start should != 0 because reference name != *.
ERROR: Record 9, Read name A00572:21:H3WK7DRXX:2:1255:9796:12320, Alignment start should != 0 because reference name != *.
ERROR: Record 8, Read name A00572:21:H3WK7DRXX:2:1255:9796:12320, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 10, Read name A00572:21:H3WK7DRXX:2:1277:24207:30984, Mate Alignment start should != 0 because reference name != *.
ERROR: Record 10, Read name A00572:21:H3WK7DRXX:2:1277:24207:30984, Alignment start should != 0 because reference name != *.
ERROR: Record 11, Read name A00572:21:H3WK7DRXX:2:1277:24207:30984, Mate Alignment start should != 0 because reference name != *.
ERROR: Record 11, Read name A00572:21:H3WK7DRXX:2:1277:24207:30984, Alignment start should != 0 because reference name != *.
ERROR: Record 10, Read name A00572:21:H3WK7DRXX:2:1277:24207:30984, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 37, Read name A00572:21:H3WK7DRXX:2:1241:30608:6684, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 39, Read name A00572:21:H3WK7DRXX:2:2163:31485:36417, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 74, Read name A00572:21:H3WK7DRXX:2:2255:29396:4178, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 76, Read name A00572:21:H3WK7DRXX:2:2255:29785:4883, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 123, Read name A00572:21:H3WK7DRXX:2:1111:5276:17957, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 125, Read name A00572:21:H3WK7DRXX:2:2176:28736:18568, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 127, Read name A00572:21:H3WK7DRXX:2:2247:26106:14168, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 135, Read name A00572:21:H3WK7DRXX:2:2173:14552:8469, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 137, Read name A00572:21:H3WK7DRXX:2:2233:1262:23657, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 139, Read name A00572:21:H3WK7DRXX:2:2239:29487:20682, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 144, Read name A00572:21:H3WK7DRXX:2:2214:21748:7498, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 146, Read name A00572:21:H3WK7DRXX:2:2229:29740:4147, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 181, Read name A00572:21:H3WK7DRXX:2:2161:5367:5431, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 187, Read name A00572:21:H3WK7DRXX:2:1255:4020:16689, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 189, Read name A00572:21:H3WK7DRXX:2:2228:25898:1752, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 191, Read name A00572:21:H3WK7DRXX:2:2228:26106:1204, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 230, Read name A00572:21:H3WK7DRXX:2:2175:18620:1423, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 232, Read name A00572:21:H3WK7DRXX:2:2261:31638:4366, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 237, Read name A00572:21:H3WK7DRXX:2:2177:28592:7263, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 239, Read name A00572:21:H3WK7DRXX:2:2177:28999:8124, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 463, Read name A00572:21:H3WK7DRXX:2:1219:5565:30201, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 465, Read name A00572:21:H3WK7DRXX:2:1229:23023:26115, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 467, Read name A00572:21:H3WK7DRXX:2:1267:32226:22545, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 469, Read name A00572:21:H3WK7DRXX:2:2119:11342:32221, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 471, Read name A00572:21:H3WK7DRXX:2:2172:22580:17957, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 491, Read name A00572:21:H3WK7DRXX:2:2113:5348:30796, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 493, Read name A00572:21:H3WK7DRXX:2:2206:28818:19930, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 495, Read name A00572:21:H3WK7DRXX:2:2214:23782:36323, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 497, Read name A00572:21:H3WK7DRXX:2:2245:24352:31454, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 587, Read name A00572:21:H3WK7DRXX:2:1267:29794:12164, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 740, Read name A00572:21:H3WK7DRXX:2:1146:32307:3646, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 745, Read name A00572:21:H3WK7DRXX:2:2169:13756:27352, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 768, Read name A00572:21:H3WK7DRXX:2:1105:24749:14105, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 770, Read name A00572:21:H3WK7DRXX:2:2143:3513:25457, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 779, Read name A00572:21:H3WK7DRXX:2:2138:8332:11725, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 842, Read name A00572:21:H3WK7DRXX:2:1165:20672:15436, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 844, Read name A00572:21:H3WK7DRXX:2:1243:1651:1564, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 846, Read name A00572:21:H3WK7DRXX:2:2140:10890:16000, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 919, Read name A00572:21:H3WK7DRXX:2:1102:27941:1407, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 970, Read name A00572:21:H3WK7DRXX:2:2209:14651:17033, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 988, Read name A00572:21:H3WK7DRXX:2:1246:8404:8437, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1035, Read name A00572:21:H3WK7DRXX:2:1225:5728:9846, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1043, Read name A00572:21:H3WK7DRXX:2:2274:9661:16720, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1160, Read name A00572:21:H3WK7DRXX:2:1171:27317:11631, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1423, Read name A00572:21:H3WK7DRXX:2:1277:26106:18427, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1425, Read name A00572:21:H3WK7DRXX:2:2147:5710:7091, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1480, Read name A00572:21:H3WK7DRXX:2:1217:10655:29997, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1662, Read name A00572:21:H3WK7DRXX:2:2142:3893:32095, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1759, Read name A00572:21:H3WK7DRXX:2:1275:19126:8844, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1761, Read name A00572:21:H3WK7DRXX:2:2123:18882:8641, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1763, Read name A00572:21:H3WK7DRXX:2:2145:16776:9126, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1765, Read name A00572:21:H3WK7DRXX:2:2145:17300:8844, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1820, Read name A00572:21:H3WK7DRXX:2:1103:1353:23218, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1822, Read name A00572:21:H3WK7DRXX:2:2172:12057:2300, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1824, Read name A00572:21:H3WK7DRXX:2:2215:32271:35399, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1826, Read name A00572:21:H3WK7DRXX:2:2277:29324:23500, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1845, Read name A00572:21:H3WK7DRXX:2:1267:1904:30217, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1866, Read name A00572:21:H3WK7DRXX:2:2266:10936:12289, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1872, Read name A00572:21:H3WK7DRXX:2:1106:17942:2910, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1874, Read name A00572:21:H3WK7DRXX:2:2135:23656:30123, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1876, Read name A00572:21:H3WK7DRXX:2:2154:10086:17738, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1879, Read name A00572:21:H3WK7DRXX:2:1122:7536:19836, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1881, Read name A00572:21:H3WK7DRXX:2:2131:13557:20525, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1883, Read name A00572:21:H3WK7DRXX:2:2136:8983:24878, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1887, Read name A00572:21:H3WK7DRXX:2:1271:11740:30185, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1891, Read name A00572:21:H3WK7DRXX:2:1256:19958:10911, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1893, Read name A00572:21:H3WK7DRXX:2:2128:3540:27164, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1942, Read name A00572:21:H3WK7DRXX:2:2105:31891:7874, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1945, Read name A00572:21:H3WK7DRXX:2:1127:31665:8516, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1947, Read name A00572:21:H3WK7DRXX:2:1202:26964:4852, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1949, Read name A00572:21:H3WK7DRXX:2:2201:15176:30718, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 1961, Read name A00572:21:H3WK7DRXX:2:2115:6958:2863, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 2031, Read name A00572:21:H3WK7DRXX:2:2262:27697:20494, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 2035, Read name A00572:21:H3WK7DRXX:2:1206:3396:26757, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 2047, Read name A00572:21:H3WK7DRXX:2:2126:11577:6543, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 2049, Read name A00572:21:H3WK7DRXX:2:2146:29197:34240, Mate negative strand flag does not match read negative strand flag of mate
Maximum output of [100] errors reached.
[Fri Nov 15 09:25:32 PST 2019] picard.sam.ValidateSamFile done. Elapsed time: 0.03 minutes.
Runtime.totalMemory()=2058354688

quattrinia commented 4 years ago

For the file that worked, here are the ValidateSamFile results:
[Fri Nov 15 09:28:37 PST 2019] Executing as aquattrini@purves on Linux 4.4.0-165-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_65-b17; Picard version: 2.1.1(6a5237c0f295ddce209ee3a3a5b83a3779408b1b_1457101272) IntelDeflater ERROR: Record 1, Read name A00572:21:H3WK7DRXX:2:1239:6524:28072, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 3, Read name A00572:21:H3WK7DRXX:2:1266:13801:32158, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 5, Read name A00572:21:H3WK7DRXX:2:2235:22806:16157, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 8, Read name A00572:21:H3WK7DRXX:2:2236:19253:33959, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 10, Read name A00572:21:H3WK7DRXX:2:2237:30888:12023, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 32, Read name A00572:21:H3WK7DRXX:2:1128:7111:8202, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 34, Read name A00572:21:H3WK7DRXX:2:1207:18159:21386, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 36, Read name A00572:21:H3WK7DRXX:2:2121:7509:19132, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 38, Read name A00572:21:H3WK7DRXX:2:2128:24831:13495, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 51, Read name A00572:21:H3WK7DRXX:2:1204:12825:2190, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 53, Read name A00572:21:H3WK7DRXX:2:1276:22679:4413, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 55, Read name A00572:21:H3WK7DRXX:2:1277:3658:33849, Mate negative strand flag does not match read negative strand flag of mate ERROR:
Record 58, Read name A00572:21:H3WK7DRXX:2:2231:7392:16235, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 62, Read name A00572:21:H3WK7DRXX:2:2276:9787:22200, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 107, Read name A00572:21:H3WK7DRXX:2:1253:11767:1799, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 337, Read name A00572:21:H3WK7DRXX:2:2148:1570:12446, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 555, Read name A00572:21:H3WK7DRXX:2:2146:6804:16000, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 659, Read name A00572:21:H3WK7DRXX:2:2122:20989:36464, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 701, Read name A00572:21:H3WK7DRXX:2:2114:9353:35697, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 730, Read name A00572:21:H3WK7DRXX:2:2212:12400:19022, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 974, Read name A00572:21:H3WK7DRXX:2:2125:24379:18787, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 1105, Read name A00572:21:H3WK7DRXX:2:1124:5258:4053, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 1123, Read name A00572:21:H3WK7DRXX:2:2110:7627:9502, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 1184, Read name A00572:21:H3WK7DRXX:2:2212:21486:25238, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 1186, Read name A00572:21:H3WK7DRXX:2:1151:22209:19257, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 1208, Read name A00572:21:H3WK7DRXX:2:2241:32832:4116, Mate negative strand flag does not match read negative strand flag of 
mate ERROR: Record 1588, Read name A00572:21:H3WK7DRXX:2:1252:24144:33348, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 1985, Read name A00572:21:H3WK7DRXX:2:1275:30355:36401, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 1987, Read name A00572:21:H3WK7DRXX:2:1275:30626:36338, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 2131, Read name A00572:21:H3WK7DRXX:2:1119:16902:13510, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 2423, Read name A00572:21:H3WK7DRXX:2:2134:14724:28745, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 2425, Read name A00572:21:H3WK7DRXX:2:2266:18665:1720, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 2448, Read name A00572:21:H3WK7DRXX:2:2217:30707:14747, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 2656, Read name A00572:21:H3WK7DRXX:2:1109:32199:4523, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 2660, Read name A00572:21:H3WK7DRXX:2:2242:18457:28260, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 2662, Read name A00572:21:H3WK7DRXX:2:2242:3586:17033, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 2664, Read name A00572:21:H3WK7DRXX:2:2242:4381:15123, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 2734, Read name A00572:21:H3WK7DRXX:2:1250:16423:30874, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 2883, Read name A00572:21:H3WK7DRXX:2:1252:21197:20290, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 2885, Read name A00572:21:H3WK7DRXX:2:2211:26784:21042, Mate negative strand flag does not match 
read negative strand flag of mate ERROR: Record 2887, Read name A00572:21:H3WK7DRXX:2:2105:18177:35321, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 3303, Read name A00572:21:H3WK7DRXX:2:1106:1759:25332, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 3305, Read name A00572:21:H3WK7DRXX:2:1244:14127:22013, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 3529, Read name A00572:21:H3WK7DRXX:2:2101:1687:5760, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 3581, Read name A00572:21:H3WK7DRXX:2:1172:24279:3427, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 3583, Read name A00572:21:H3WK7DRXX:2:1172:24325:3630, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 3608, Read name A00572:21:H3WK7DRXX:2:2206:27471:25739, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 3674, Read name A00572:21:H3WK7DRXX:2:1153:19144:6590, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 3676, Read name A00572:21:H3WK7DRXX:2:2169:23158:11381, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 3794, Read name A00572:21:H3WK7DRXX:2:1240:17689:6042, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 3797, Read name A00572:21:H3WK7DRXX:2:2169:25852:1297, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 3938, Read name A00572:21:H3WK7DRXX:2:2111:25030:27931, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 3942, Read name A00572:21:H3WK7DRXX:2:2152:8585:9157, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 3991, Read name A00572:21:H3WK7DRXX:2:1235:19524:30702, Mate negative strand 
flag does not match read negative strand flag of mate ERROR: Record 4204, Read name A00572:21:H3WK7DRXX:2:2177:11758:6637, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4247, Read name A00572:21:H3WK7DRXX:2:1133:8630:20948, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4249, Read name A00572:21:H3WK7DRXX:2:1154:6406:10144, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4251, Read name A00572:21:H3WK7DRXX:2:1239:12301:34256, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4253, Read name A00572:21:H3WK7DRXX:2:1265:24704:23954, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4255, Read name A00572:21:H3WK7DRXX:2:2257:7554:33426, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4266, Read name A00572:21:H3WK7DRXX:2:2159:11921:8954, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4324, Read name A00572:21:H3WK7DRXX:2:1151:16034:14325, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4332, Read name A00572:21:H3WK7DRXX:2:1265:24062:25661, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4334, Read name A00572:21:H3WK7DRXX:2:2162:21133:36432, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4336, Read name A00572:21:H3WK7DRXX:2:2260:3622:1188, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4403, Read name A00572:21:H3WK7DRXX:2:2209:7618:31501, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4405, Read name A00572:21:H3WK7DRXX:2:2244:30590:36088, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4496, Read name A00572:21:H3WK7DRXX:2:1174:9778:23876, 
Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4499, Read name A00572:21:H3WK7DRXX:2:1244:10845:32643, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4501, Read name A00572:21:H3WK7DRXX:2:1244:11550:32675, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4504, Read name A00572:21:H3WK7DRXX:2:2113:13838:32002, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4505, Read name A00572:21:H3WK7DRXX:2:2242:1931:16548, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4511, Read name A00572:21:H3WK7DRXX:2:1202:18123:17848, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4518, Read name A00572:21:H3WK7DRXX:2:1108:17381:12680, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4520, Read name A00572:21:H3WK7DRXX:2:1247:18376:12649, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4524, Read name A00572:21:H3WK7DRXX:2:2243:12527:14888, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4528, Read name A00572:21:H3WK7DRXX:2:1175:18819:4899, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4530, Read name A00572:21:H3WK7DRXX:2:2175:28022:28917, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4538, Read name A00572:21:H3WK7DRXX:2:1205:15076:12790, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4540, Read name A00572:21:H3WK7DRXX:2:1210:1651:32252, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4543, Read name A00572:21:H3WK7DRXX:2:2169:19117:26146, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4594, Read name 
A00572:21:H3WK7DRXX:2:2203:10664:30201, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4608, Read name A00572:21:H3WK7DRXX:2:1230:29894:24111, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4610, Read name A00572:21:H3WK7DRXX:2:1230:29975:24314, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4612, Read name A00572:21:H3WK7DRXX:2:1257:11794:23077, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4614, Read name A00572:21:H3WK7DRXX:2:2114:12111:6245, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4659, Read name A00572:21:H3WK7DRXX:2:1163:32922:31610, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4662, Read name A00572:21:H3WK7DRXX:2:2177:16369:20791, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4673, Read name A00572:21:H3WK7DRXX:2:1105:9616:1078, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4676, Read name A00572:21:H3WK7DRXX:2:2243:6207:26428, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4678, Read name A00572:21:H3WK7DRXX:2:2243:6488:19116, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4692, Read name A00572:21:H3WK7DRXX:2:2155:29930:24674, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4707, Read name A00572:21:H3WK7DRXX:2:2109:28366:11851, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4711, Read name A00572:21:H3WK7DRXX:2:1163:19768:30248, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4741, Read name A00572:21:H3WK7DRXX:2:1120:16251:20556, Mate negative strand flag does not match read negative strand flag of mate 
ERROR: Record 4743, Read name A00572:21:H3WK7DRXX:2:1124:30942:19695, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4745, Read name A00572:21:H3WK7DRXX:2:1244:13937:4961, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4747, Read name A00572:21:H3WK7DRXX:2:2253:26657:10269, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4755, Read name A00572:21:H3WK7DRXX:2:2206:29396:24721, Mate negative strand flag does not match read negative strand flag of mate ERROR: Record 4791, Read name A00572:21:H3WK7DRXX:2:1123:11171:17080, Mate negative strand flag does not match read negative strand flag of mate Maximum output of [100] errors reached.
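Because ValidateSamFile output like the above is long and repetitive, it can help to tally the messages rather than read them one by one. The following is only a minimal sketch, assuming the log is available as plain text in the `ERROR: Record N, Read name X, message` form shown in this thread; the function name `tally_validation_errors` is hypothetical and not part of phyluce or Picard.

```python
import re
from collections import Counter

# Matches one "ERROR: Record N, Read name X, message" entry, even when the
# log has been wrapped so that an entry spans a line break.
ERROR_RE = re.compile(
    r"ERROR:\s+Record\s+(\d+),\s+Read\s+name\s+([^,\s]+),\s+(.+?)"
    r"(?=\s+ERROR:|\s+Maximum\s+output|\s*$)",
    re.DOTALL,
)


def tally_validation_errors(log_text):
    """Count how often each distinct error message appears in the log text."""
    counts = Counter()
    for match in ERROR_RE.finditer(log_text):
        # Collapse internal whitespace so wrapped messages compare equal.
        message = " ".join(match.group(3).split())
        counts[message] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "ERROR: Record 1, Read name A:1:2, Mate negative strand flag does not "
        "match read negative strand flag of mate ERROR: Record 3, Read name\n"
        "A:1:3, Mate negative strand flag does not\nmatch read negative strand "
        "flag of mate Maximum output of [100] errors reached."
    )
    for message, n in tally_validation_errors(sample).items():
        print(n, message)
```

Note that the logs in this thread stop after 100 errors, so a tally of this kind only describes the first 100 reported records, not the whole BAM file.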

quattrinia commented 4 years ago

Fixed with this workaround, as in the SEQanswers post above. I just need to check whether it would affect the coverage estimates at all.

List reads that have "Alignment start should != 0 because reference name != *"

samtools view -h input.bam | awk 'BEGIN {OFS="\t";} {if ($1 ~ /^@/) {print $0;} else {if ($4==0 && $3!="*") {$4=1; print $1>"read_file"} if ($8==0 && $7!="*") {$8=1;print $1>"read_file"} print $0;}}' | samtools view -bS - > input_OK.bam ; sort read_file | uniq > read_file_sorted

Exclude problematic reads

java -Xmx20g -jar FilterSamReads.jar INPUT=input_OK.bam FILTER=excludeReadList READ_LIST_FILE=read_file_sorted OUTPUT=output.bam
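For anyone who finds the awk one-liner hard to follow, here is a rough Python sketch of the same logic, operating on SAM text lines (for example the output of `samtools view -h`). It is an illustration only, not the code that was run; the function name `fix_zero_positions` is hypothetical.

```python
def fix_zero_positions(sam_lines):
    """Mimic the awk step on an iterable of SAM text lines.

    Returns (fixed_lines, flagged_read_names). A read is flagged when its
    position (or its mate's position) is 0 although the matching reference
    name is not "*".
    """
    fixed = []
    flagged = set()
    for line in sam_lines:
        line = line.rstrip("\n")
        if line.startswith("@"):
            # Header lines pass through untouched.
            fixed.append(line)
            continue
        fields = line.split("\t")
        # SAM columns (0-based): 0 QNAME, 2 RNAME, 3 POS, 6 RNEXT, 7 PNEXT
        if fields[3] == "0" and fields[2] != "*":
            fields[3] = "1"
            flagged.add(fields[0])
        if fields[7] == "0" and fields[6] != "*":
            fields[7] = "1"
            flagged.add(fields[0])
        fixed.append("\t".join(fields))
    # Sorted, de-duplicated names, like `sort read_file | uniq`.
    return fixed, sorted(flagged)


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.5",
        "r1\t99\tcontig1\t0\t60\t4M\t=\t0\t0\tACGT\tIIII",
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    ]
    lines, names = fix_zero_positions(example)
    print("\n".join(lines))
    print(names)
```

The returned names correspond to what ends up in `read_file_sorted`, which FilterSamReads then uses to drop those reads entirely.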

quattrinia commented 4 years ago

Well, this worked only in part. I'm still troubleshooting and will update here.

quattrinia commented 4 years ago

I had to run CleanSam first:

java -jar /data/mcfadden/aquattrini/PROGRAMS/picard-tools-2.1.1/picard.jar CleanSam I=ALC06-CL-RG-M-MD.bam O=ALC06-CL-RG-M-MD.clean.bam

then

samtools view -h ALC*clean.bam | awk 'BEGIN {OFS="\t";} {if ($1 ~ /^@/) {print $0;} else {if ($4==0 && $3!="*") {$4=1; print $1>"read_file"} if ($8==0 && $7!="*") {$8=1;print $1>"read_file"} print $0;}}' | samtools view -bS - > input_OK.bam ; sort read_file | uniq > read_file_sorted

then

java -Xmx20g -jar /data/mcfadden/aquattrini/PROGRAMS/picard-tools-2.1.1/picard.jar FilterSamReads INPUT=input_OK.bam FILTER=excludeReadList READ_LIST_FILE=read_file_sorted OUTPUT=output.bam
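On the earlier question of whether dropping these reads could affect coverage estimates, one rough check is to see what fraction of alignment records the exclusion list removes. This is a sketch under assumptions: it takes SAM text lines (e.g. from `samtools view -h`) and the read names from `read_file_sorted`, and the function name `fraction_excluded` is hypothetical.

```python
def fraction_excluded(sam_lines, excluded_names):
    """Return the fraction of alignment records whose read name is excluded."""
    excluded = {name.strip() for name in excluded_names if name.strip()}
    total = 0
    dropped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        total += 1
        read_name = line.split("\t", 1)[0]
        if read_name in excluded:
            dropped += 1
    return dropped / total if total else 0.0


if __name__ == "__main__":
    sam = [
        "@HD\tVN:1.5",
        "r1\t99\tcontig1\t1\t60\t4M\t=\t1\t0\tACGT\tIIII",
        "r1\t147\tcontig1\t1\t60\t4M\t=\t1\t0\tACGT\tIIII",
        "r2\t0\tcontig1\t5\t60\t4M\t*\t0\t0\tACGT\tIIII",
        "r3\t0\tcontig2\t9\t60\t4M\t*\t0\t0\tACGT\tIIII",
    ]
    print(fraction_excluded(sam, ["r1\n"]))
```

A very small fraction would suggest the filter is unlikely to shift coverage much, though this does not account for where on the contigs the dropped reads fall.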