iqbal-lab-org / gramtools

Genome inference from a population reference graph
MIT License

Pf benchmarking #152

Open bricoletc opened 4 years ago

bricoletc commented 4 years ago

We'll put Pf PRG benchmarking figures here

Data explained:

| dataset | abbreviation | description | path on yoda cluster |
| --- | --- | --- | --- |
| pf3k cortex + DBLMSP1/2 | pf3k | 2.4 million variant calls made using cortex on pf3k samples + Sorina's variants on msp3.4 and msp3.8 genes. Due to variant overlap this ends up as 1.56 million variant sites in the PRG as per a6e9094 | /nfs/leia/research/iqbal/bletcher/projects/gramtools/datasets/pf3k_release3_cortex_plus_dblmsps/vcf/pf3k_and_DBPMSPS1and2_cleaned.vcf.gz |
| Same as ^^, but chromosomes 1 and 2 only | pf3k_chr1_2 | 88,539 sites after gramtools build | /nfs/leia/research/iqbal/bletcher/projects/gramtools/datasets/pf3k_release3_cortex_plus_dblmsps_Chroms1_2 |
| GB4 reads mapped to Pf chroms 1 and 2 | GB4_chr1_2 | 3.46 million Illumina WGS 100bp paired-end reads from Pf GB4 strain. The reads have been quality trimmed using trimmomatic and subsetted to Pf chroms 1 and 2 (using samtools). | /nfs/leia/research/iqbal/bletcher/projects/Pf_cross_benchmark/runs/mini_Trio/samples/GB4 |

bricoletc commented 4 years ago

| commit | command | dataset | kmer size | peak RAM (GBytes) | average RAM (GBytes) | run time (seconds) | CPU time (seconds) | mapped reads per CPU sec |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 73e4835 | build | pf3k | 11 | 38 | 12.4 | 6084 | = | - |
| 73e4835 | quasimap | GB4_chr1_2 -> pf3k | 11 | 38 | 37 | 40,315 | 183,405 (10 threads) | 17.6 |
| 73e4835 | build | pf3k_chr1_2 | 11 | 2.39 | 0.87 | 378 | 346 | - |
| 73e4835 | quasimap | GB4_chr1_2 -> pf3k_chr1_2 | 11 | 3.37 | 3.31 | 1219 | 5714 | 566 |
| b95321b (v1.6.0) | build | pf3k_chr1_2 | 11 | 3.45 | 1.80 | 341 | 328 | - |
| b95321b (v1.6.0) | quasimap | GB4_chr1_2 -> pf3k_chr1_2 | 11 | 3.37 | 3.31 | 1041 | 4839 | 672 |
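
To make the two quasimap runs on pf3k_chr1_2 easier to compare, here is a rough sketch that derives relative changes from the figures above. It assumes the last column is mapped reads divided by CPU seconds, which the thread does not state explicitly; the helper names `implied_mapped_reads` and `relative_change` are made up for this illustration and are not part of gramtools.

```python
# Figures copied from the quasimap rows for GB4_chr1_2 -> pf3k_chr1_2.
runs = {
    "73e4835": {"run_s": 1219, "cpu_s": 5714, "reads_per_cpu_s": 566},
    "b95321b (v1.6.0)": {"run_s": 1041, "cpu_s": 4839, "reads_per_cpu_s": 672},
}


def implied_mapped_reads(run):
    """Mapped read count implied by the table.

    Assumption (not confirmed in the thread): the throughput column
    is mapped reads / CPU seconds, so reads = throughput * CPU seconds.
    """
    return run["cpu_s"] * run["reads_per_cpu_s"]


def relative_change(old, new):
    """Fractional change from old to new (negative means a decrease)."""
    return (new - old) / old


old = runs["73e4835"]
new = runs["b95321b (v1.6.0)"]

for name, run in runs.items():
    print(f"{name}: ~{implied_mapped_reads(run):,} mapped reads implied")

print(f"run time change:   {relative_change(old['run_s'], new['run_s']):+.1%}")
print(f"CPU time change:   {relative_change(old['cpu_s'], new['cpu_s']):+.1%}")
print(
    "throughput change: "
    f"{relative_change(old['reads_per_cpu_s'], new['reads_per_cpu_s']):+.1%}"
)
```

Under that assumption both runs imply a similar number of mapped reads (roughly 3.2 million), which is what you would expect if the throughput difference comes from CPU time rather than from mapping a different set of reads.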