csiro-crop-informatics / biokanga_align_paper

Introduction

Existing aligners

How detailed do we need to be? Cover the different alignment approaches. Should we also cover pseudo-alignment, i.e. kallisto and salmon?

Methods

Simulated reads procedure

Much of this can be taken care of by using https://github.com/karel-brinda/rnftools

for each assembly  # may be informative to show arabidopsis, rice, barley, wild emmer wheat, bread wheat
  for each read length  # fix?
    for each error rate
      simulate n-fold coverage reads (read count proportional to assembly/genome size)
      for each parameter setting in {default, ..., ...}
        align reads to the reference using each tool
        quantify speed and accuracy
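
The loop above could be enumerated in Python roughly as follows. This is only a sketch: the assembly sizes, read lengths, error rates, coverage, parameter labels, and tool names are placeholders rather than agreed settings, and the simulation and alignment steps themselves are left out. The one real calculation is the read count for n-fold coverage, taken as coverage × genome size / read length.

```python
from itertools import product
from math import ceil

# All values below are illustrative placeholders, not agreed settings.
ASSEMBLY_SIZES_BP = {"toy_small": 1_000_000, "toy_large": 5_000_000}
READ_LENGTHS = [100, 150]
ERROR_RATES = [0.001, 0.01]
PARAM_SETS = ["default"]
TOOLS = ["tool_a", "tool_b"]  # hypothetical aligner names
COVERAGE = 10


def reads_for_coverage(genome_size_bp, read_length, coverage):
    """Smallest read count with reads * read_length >= coverage * genome size."""
    return ceil(coverage * genome_size_bp / read_length)


def experiment_grid():
    """Yield one dict per (assembly, read length, error rate, params, tool) run."""
    for (name, size), read_len, err, params, tool in product(
        ASSEMBLY_SIZES_BP.items(), READ_LENGTHS, ERROR_RATES, PARAM_SETS, TOOLS
    ):
        yield {
            "assembly": name,
            "read_length": read_len,
            "error_rate": err,
            "params": params,
            "tool": tool,
            "n_reads": reads_for_coverage(size, read_len, COVERAGE),
        }


runs = list(experiment_grid())
```

The grid is flattened here for brevity. In practice the reads would be simulated once per (assembly, read length, error rate) combination and then reused for every tool and parameter setting, as in the outline.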

Results

Simulated reads

I can imagine a figure showing the fate of reads under different aligners: a sort of Sankey diagram showing where they end up, and whether or not that is the best place.
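
The counts behind such a figure could be gathered along these lines. This is a sketch under assumptions: the three fate labels, the position tolerance, the tuple representation of positions, and the toy data are all made up for illustration, and "best place" is simplified to "the position the read was simulated from".

```python
from collections import Counter


def read_fate(true_pos, aligned_pos, tolerance=0):
    """Classify one simulated read by comparing its origin with its alignment.

    Positions are (chromosome, 1-based start) tuples; aligned_pos is None
    when the aligner left the read unaligned.
    """
    if aligned_pos is None:
        return "unaligned"
    same_chrom = true_pos[0] == aligned_pos[0]
    if same_chrom and abs(true_pos[1] - aligned_pos[1]) <= tolerance:
        return "correct"
    return "misplaced"


def fate_flows(truth, alignments_by_tool):
    """Count (tool, fate) pairs, usable as link weights in a flow diagram."""
    flows = Counter()
    for tool, alignments in alignments_by_tool.items():
        for read_id, true_pos in truth.items():
            flows[(tool, read_fate(true_pos, alignments.get(read_id)))] += 1
    return flows


# Toy data, made up for illustration only.
truth = {"r1": ("chr1", 100), "r2": ("chr1", 500), "r3": ("chr2", 42)}
alignments_by_tool = {
    "tool_a": {"r1": ("chr1", 100), "r2": ("chr2", 500)},
    "tool_b": {"r1": ("chr1", 100), "r2": ("chr1", 500), "r3": ("chr2", 42)},
}
flows = fate_flows(truth, alignments_by_tool)
```

Each (tool, fate) count would become one band in the diagram. Reads with several equally good placements would need an extra category, which this sketch does not handle.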

Bioplatforms reads

Summary of the SNPs reported, i.e. what is the overlap in the SNPs identified using the different aligners? We would presumably need to run the same SNP caller on the BAM output of each aligner. I don't actually know what the differences would look like. It would be nice to have an example or two of where the calls differ. Maybe this really comes down to parameters (i.e. how many reads you need to trust a call)?
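
One way to summarise the overlap is with plain set operations over the calls. The sketch below assumes each aligner's calls have already been reduced to (chromosome, position, ref, alt) tuples by the same SNP caller; the function name, aligner names, and toy calls are hypothetical.

```python
def snp_overlap(calls_by_aligner):
    """Return (shared, unique) SNP calls across aligners.

    calls_by_aligner maps an aligner name to a set of
    (chromosome, position, ref, alt) tuples from the same SNP caller.
    """
    all_sets = list(calls_by_aligner.values())
    # Calls reported for every aligner.
    shared = set.intersection(*all_sets) if all_sets else set()
    unique = {}
    for name, calls in calls_by_aligner.items():
        # Union of the calls from every other aligner.
        others = set().union(
            *(c for n, c in calls_by_aligner.items() if n != name)
        )
        unique[name] = calls - others
    return shared, unique


# Toy calls, made up for illustration only.
calls = {
    "tool_a": {("chr1", 10, "A", "G"), ("chr1", 50, "C", "T")},
    "tool_b": {("chr1", 10, "A", "G"), ("chr2", 7, "G", "A")},
}
shared, unique = snp_overlap(calls)
```

The aligner-specific sets are where to look for the worked examples of differing calls mentioned above.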

Discussion