ctsa closed this issue 5 years ago
The README suggests this was fixed: https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG002_NA24385_son/NIST_Illumina_2x250bps/novoalign_bams/README_update_feb2019
Update: February 6, 2019
Because of an error in merging BAM files, the files previously in this directory had read duplicates. The reads have been realigned/re-merged with the same version of and options for novoalign as described below, and the current BAM files in the directory are now accurate.
Yes, this is now fixed - thanks for the reminder to close!
The following BAM file for HG002:
ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/HG002_NA24385_son/NIST_Illumina_2x250bps/novoalign_bams/HG002.hs37d5.2x250.bam
...seems to erroneously contain 2 copies of every read pair. For instance, a simple view of the BAM shows:
...and so on for every read.
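
For anyone who wants to check a BAM for this kind of duplication, here is a minimal sketch, not the method used by the maintainers. It assumes the records arrive as SAM text lines (for example, piped from `samtools view`), and it treats two records as duplicates when they share QNAME, FLAG, RNAME, and POS. That key is an assumption chosen for illustration; the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Assumed duplicate key: (QNAME, FLAG, RNAME, POS), the first four SAM columns.
Key = Tuple[str, str, str, str]


def count_records(sam_lines: Iterable[str]) -> Counter:
    """Count how many times each (QNAME, FLAG, RNAME, POS) key occurs."""
    counts: Counter = Counter()
    for line in sam_lines:
        # Skip blank lines and SAM header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A SAM alignment record has at least 11 mandatory columns.
        if len(fields) < 11:
            continue
        counts[(fields[0], fields[1], fields[2], fields[3])] += 1
    return counts


def duplicated_keys(sam_lines: Iterable[str]) -> Dict[Key, int]:
    """Return only the keys that occur more than once, with their counts."""
    return {key: n for key, n in count_records(sam_lines).items() if n > 1}
```

One way to use it is to pass `sys.stdin` to `duplicated_keys` in a small script and pipe in a slice of the file with `samtools view`. Counting over the whole BAM keeps every key in memory, so a region or the first few hundred thousand records is a more practical sample.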