FrickTobias / BLR

Investigate steps to skip in post-mapping processing for non-BLR technologies #213

Closed pontushojer closed 3 years ago

pontushojer commented 4 years ago

This is my current take on which steps to skip. Steps after mapping are:

  1. tagbam: Run for all.
  2. clusterrmdup: Skip? It could however be interesting to see if any overlapping clusters are detected here. This could detect possible chimeric reads and doublets from the other technologies. But maybe we don't want this as a default.
  3. markduplicates: Run for all.
  4. buildmolecules: Run for all. Tags reads with molecule tags and saves molecule information to a TSV.
  5. filterclusters: Skip? This filters out barcodes that have too many molecules (thereby risking overlap between them) and also removes duplicates. The duplicate removal could be nice to keep, but it could also be done using samtools for the non-BLR techs.
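
As a rough illustration of the proposal above, the step selection could be sketched like this in Python. This is only a sketch under assumptions: `steps_to_run`, `POST_MAPPING_STEPS` and `BLR_ONLY_STEPS` are hypothetical names, not part of the pipeline, and the two steps marked "Skip?" are treated here as skipped for every non-BLR technology, which is still an open question.

```python
from typing import List

# Post-mapping steps in the order listed above.
POST_MAPPING_STEPS = [
    "tagbam",
    "clusterrmdup",
    "markduplicates",
    "buildmolecules",
    "filterclusters",
]

# Assumption: the steps marked "Skip?" are only run for BLR.
BLR_ONLY_STEPS = {"clusterrmdup", "filterclusters"}


def steps_to_run(technology: str) -> List[str]:
    """Return the post-mapping steps for a technology (hypothetical helper)."""
    if technology.lower() == "blr":
        # BLR runs every step.
        return list(POST_MAPPING_STEPS)
    # Any other technology drops the BLR-only steps but keeps the order.
    return [step for step in POST_MAPPING_STEPS if step not in BLR_ONLY_STEPS]
```

With this sketch, `steps_to_run("blr")` returns all five steps, while any other technology name returns only tagbam, markduplicates and buildmolecules.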

I discussed this briefly with @FrickTobias who mentioned this as a part of https://github.com/NBISweden/BLR/pull/16.

pontushojer commented 3 years ago

We have now added the config parameter skip_bcmerge, which allows barcode merging to be skipped. The filterclusters parameter max_molecules_per_bc can now also be set to 0 to skip filtering. Closing this.
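
To illustrate the "0 means no filtering" convention, here is a minimal Python sketch. It is not the actual filterclusters code: `barcodes_to_keep` and its input dictionary are hypothetical, and treating the threshold as inclusive (a barcode with exactly max_molecules_per_bc molecules is kept) is an assumption.

```python
from typing import Dict, Set


def barcodes_to_keep(molecules_per_bc: Dict[str, int],
                     max_molecules_per_bc: int) -> Set[str]:
    """Return barcodes that pass the molecule-count filter (hypothetical helper)."""
    if max_molecules_per_bc == 0:
        # 0 disables the filter, so every barcode is kept.
        return set(molecules_per_bc)
    # Assumption: the limit is inclusive.
    return {
        barcode
        for barcode, n_molecules in molecules_per_bc.items()
        if n_molecules <= max_molecules_per_bc
    }
```

For example, with counts `{"A": 2, "B": 50}` a limit of 10 would keep only "A", while a limit of 0 would keep both.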