This repository contains data indexes from NIST's Genome in a Bottle (GIAB) project. The sequence and alignment indexes are also available at: https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data_indexes
AshkenazimTrio
Son:HG002 https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG002_NA24385_son/
Father:HG003 https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG003_NA24149_father/
Mother:HG004 https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG004_NA24143_mother/
Sequencing Platform | Sequence | Alignment |
---|---|---|
Illumina WGS 2x150bp 300X per individual | All HG002 HG003 HG004 | novoalign: All HG002 HG003 HG004 |
Illumina 6KB Matepair | All HG002 HG003 HG004 | bwamem:hg19 All HG002 HG003 HG004 |
Illumina WGS 2X250bp | All HG002 HG003 HG004 | isaac:hg19 All HG002 HG003 HG004 novoalign: All HG002 HG003 HG004 |
Moleculo | All HG002 HG003 HG004 | |
Illumina Whole Exome | - | bwamem:hg19 All HG002 HG003 HG004 |
SOLiD 60x for son | All HG002 | LifeScope:hg19 All HG002 |
CompleteGenomics | - | CGAtools:hg19 All HG002 HG003 HG004 |
Ion Proton 1000x Exome | - | TMAP:hg19 All HG002 HG003 HG004 |
10X Genomics | - | bwamem:hg19 All HG002 HG003 HG004 |
10X Genomics ChromiumGenome | All HG002 | LongRanger2.0:hg19 All HG002 HG003 HG004 |
BioNano | All:bnx HG002:bnx HG003:bnx HG004:bnx | All:cmap HG002 HG003 HG004 |
PacBio 70x/30x/30x | All HG002 HG003 HG004 All:hdf5 HG002 HG003 HG004 | NGMLR:hg19 All HG002 HG003 HG004 minimap2: All HG002 HG003 HG004 |
PacBio CCS 10kb | All HG002 | pbmm2:hg19 All HG002 |
PacBio CCS 11kb | All HG002 | pbmm2:hg19 All HG002 |
PacBio CCS 15kb | All HG002 | pbmm2:hg19 All HG002 |
PacBio CCS 15kb_20kb chemistry2 | All HG002 | pbmm2: All HG002 HG003 HG004 |
Oxford Nanopore 2D | All HG002 | - |
Oxford Nanopore ultralong (guppy-V3.2.4_2020-01-22) | All HG002 | minimap2:whatshap:hg19 All HG002 |
Oxford Nanopore ultralong Promethion | All HG002 HG003 HG004 | - |
BGI BGISEQ500 | All HG002 | - |
BGI MGISEQ PCR-free | All HG002 | - |
BGI stLFR | All HG002 HG003 HG004 | All:bwamem:hg19 HG002 HG003 HG004 |
Strand-Seq HG002 by BCCRC | All HG002 | - |
* CompleteGenomics LFR: raw and alignment data are not available, but analysis results are available under: https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/analysis/CompleteGenomics_newLFR_CGAtools_06122015/
ChineseTrio
Son:HG005 https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/ChineseTrio/HG005_NA24631_son/
Father:HG006 https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/ChineseTrio/HG006_NA24694-huCA017E_father/
Mother:HG007 https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/ChineseTrio/HG007_NA24695-hu38168_mother/
Sequencing Platform | Sequence | Alignment |
---|---|---|
Illumina WGS 2x250bp 300X for son; 2x150bp 100x for parents | All HG005 HG006 HG007 | novoalign: All:hg19-hg38 HG005:hg19-hg38 HG006:hg19-hg38 HG007:hg19-hg38 |
Illumina 6KB Matepair | All HG005 HG006 HG007 | |
Moleculo | All HG005 HG006 HG007 | |
SOLiD 60x for son | All:xsq HG005:xsq | LifeScope: All:hg19 HG005:hg19 |
CompleteGenomics | - | CGAtools: All:hg19 (RMDNA) HG005:hg19 HG006:hg19 HG007:hg19 CGAtools: All:hg19 (cellsDNA) HG005:hg19 |
Illumina Whole Exome | - | bwamem: All:hg19 HG005:hg19 |
Ion Proton 1000x Exome | - | TMAP: All:hg19 HG005:hg19 |
BioNano for son | All:bnx HG005:bnx | All:hg19 (cmap) HG005:hg19 (cmap) |
PacBio Sequel for the trio | All HG005 HG006 HG007 | |
PacBio SequelII CCS 11kb | | |
BGI BGISEQ500, MGISEQ, stLFR | | |
NA12878
NA12878:HG001 https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/NA12878/
Sequencing Platform | Sequence | Alignment |
---|---|---|
Illumina WGS 2x150bp 300X | HG001 | bwamem: HG001:hg19 (downsampled30x) novoalign: HG001 |
Illumina HiSeq Exome | HG001 HG001:trimmed_fastq | bwamem: HG001:hg19 |
Illumina TruSeq Exome | - | bwamem: HG001:hg19 |
10X Genomics | - | bwamem: HG001:hg19 bwamem: HG001:hg19 (size_selected) |
10X Genomics ChromiumGenome | - | LongRanger2.0: HG001:hg19-hg38 LongRanger2.1: HG001:hg19-hg38 |
CompleteGenomics | - | CGAtools: HG001:hg19 |
Ion Proton 1000x Exome | - | TMAP: HG001:hg19 |
NA12878 SOLiD5500W | - | LifeScope: HG001:hg19 |
BGI BGISEQ500, MGISEQ, stLFR | | |
PacBio 40x | HG001:hdf5 | |
PacBio SequelII CCS 11kb | | |
Ultralong_OxfordNanopore | - | minimap2: HG001 |
Please Note:
1. To work with raw sequencing data (fastq, fasta, hdf5, xsq, bnx, etc.), use the files whose names begin with sequence.index. to locate and download the data.
2. To work with aligned data (bam, xmap/cmap, etc.), use the files whose names begin with alignment.index. to locate and download the data.
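
As an illustration of the two notes above, the following minimal Python sketch collects download links from an index file and fetches them. It rests on assumptions not stated in this README: that an index file is tab-delimited text and that some of its fields hold complete `ftp://` or `https://` URLs. The function names `extract_urls` and `download_all` and the default directory `giab_downloads` are hypothetical, chosen only for this example.

```python
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

# Assumption: data files are referenced by full URLs with one of these schemes.
URL_PREFIXES = ("ftp://", "http://", "https://")


def extract_urls(index_text):
    """Return every URL-like field in a tab-delimited index, in file order."""
    urls = []
    for line in index_text.splitlines():
        for field in line.split("\t"):
            field = field.strip()
            if field.startswith(URL_PREFIXES):
                urls.append(field)
    return urls


def download_all(urls, dest_dir="giab_downloads"):
    """Download each URL into dest_dir, skipping files that already exist."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for url in urls:
        # Name the local file after the last path component of the URL.
        target = dest / Path(urlparse(url).path).name
        if not target.exists():
            urllib.request.urlretrieve(url, str(target))
```

To use it, read one index file into a string (for example with `Path(index_path).read_text()`, where `index_path` points at a sequence or alignment index from this repository), pass the string to `extract_urls`, and hand the result to `download_all`. Because many of these data sets are very large, it is worth inspecting the list of URLs before starting a download.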