CRC-FONDA / A2-metagenome-snakemake

Collection of metagenomic read mapping workflows that have been tailored for different computational architectures.

Consolidate read mapping results #10

Closed eaasna closed 2 years ago

eaasna commented 3 years ago

DREAM-Yara can be taken as an example to see how to consolidate read mapping results from MG-2.

If a read is mapped against multiple bins, all of its alignments have to be compared to find the best mapping, and strata mapping can then be offered relative to that best mapping.

eaasna commented 3 years ago

all_one_match.sam was generated from read mapping results where the clustering of reference sequences had been perfect, such that genomes in a bin were similar to each other and different from those in other bins.

all_sorted.sam was generated from read mapping results where the reference bins had been mixed, such that separate bins contained similar reference sequences.


The goal is to find, for each read, the optimal mapping across bins and to filter out any alignment that is worse than best+x.
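A minimal sketch of that filtering step, not the actual match-consolidator. It assumes that alignment quality is measured by the SAM NM:i tag (edit distance), that "best" is the lowest NM value seen for a read across all bins, and that x is a non-negative error margin. The function names (consolidate, edit_distance, make_record) and the demo records are hypothetical. Mates of a paired-end read are treated as separate reads via the 0x40/0x80 flag bits.

```python
from collections import defaultdict

MATE_BITS = 0xC0  # SAM flag bits 0x40 (first mate) and 0x80 (last mate)
UNMAPPED = 0x4    # SAM flag bit for an unmapped read


def edit_distance(fields):
    """Return the NM:i value from the optional SAM fields, or None if absent."""
    for tag in fields[11:]:
        if tag.startswith("NM:i:"):
            return int(tag[5:])
    return None


def consolidate(sam_lines, x=0):
    """Keep, per read, only alignments with NM <= best NM + x (assumed strata rule)."""
    headers = []
    by_read = defaultdict(list)
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            headers.append(line)
            continue
        fields = line.split("\t")
        flag = int(fields[1])
        nm = edit_distance(fields)
        if flag & UNMAPPED or nm is None:
            continue  # nothing to compare for unmapped or untagged records
        by_read[(fields[0], flag & MATE_BITS)].append((nm, line))

    kept = []
    for alignments in by_read.values():
        best = min(nm for nm, _ in alignments)
        kept.extend(line for nm, line in alignments if nm <= best + x)
    return headers, kept


def make_record(read, ref, nm):
    """Build a toy single-end SAM record (demo helper only)."""
    return "\t".join(
        [read, "0", ref, "1", "255", "4M", "*", "0", "0", "ACGT", "IIII", f"NM:i:{nm}"]
    )


demo = [
    "@HD\tVN:1.6",
    make_record("read1", "bin0_ref", 0),
    make_record("read1", "bin1_ref", 2),
    make_record("read1", "bin2_ref", 3),
    make_record("read2", "bin1_ref", 1),
]
headers, kept = consolidate(demo, x=2)
```

With x=2, read1 keeps its alignments with 0 and 2 errors and drops the one with 3; read2 keeps its only alignment. Grouping in memory is the simplest choice here; a file that is already sorted by read name could be processed as a stream instead.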

eaasna commented 3 years ago

Add match-consolidator to the workflow.

Should there be a step that splits the all_collated.sam file so that the match-consolidator step can be run multi-threaded?
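One possible way to do such a split, sketched under assumptions: records are assigned to chunks by a stable hash of the read name, so that every alignment of a read lands in the same chunk and each chunk can be consolidated independently. The names shard_of and split_by_read, the chunk count, and the demo records are hypothetical and not part of the workflow.

```python
import zlib


def shard_of(read_name, n_shards):
    """Map a read name to a chunk index with a hash that is stable across runs."""
    return zlib.crc32(read_name.encode("utf-8")) % n_shards


def split_by_read(sam_lines, n_shards):
    """Split SAM records into n_shards lists; all records of a read share a chunk."""
    headers = []
    shards = [[] for _ in range(n_shards)]
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            headers.append(line)  # header lines would be copied into every chunk file
            continue
        read_name = line.split("\t", 1)[0]
        shards[shard_of(read_name, n_shards)].append(line)
    return headers, shards


# Toy records: only the first columns matter for the split.
demo = [
    "@HD\tVN:1.6",
    "read1\t0\tbin0_ref\t1",
    "read2\t0\tbin1_ref\t7",
    "read1\t0\tbin2_ref\t3",
    "read3\t0\tbin0_ref\t9",
    "read2\t0\tbin2_ref\t4",
]
headers, shards = split_by_read(demo, n_shards=4)
```

zlib.crc32 is used instead of the built-in hash() because the latter is randomized per process for strings. Each chunk could then be handed to a separate consolidation job, for example one job per chunk in the workflow, with the outputs concatenated afterwards.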