Closed seryrzu closed 4 years ago
Hello @seryrzu
If I am not wrong, you are looking for the alignment of simulated reads to the reference genome in SAM format? If that is the case, you can simply align them using minimap2 with the -a option. I couldn't understand what you mean by "true alignment", though.
I can technically align the reads, but when a read originates from a repetitive part of the genome, the alignment produced by minimap2 or any other tool can be wrong. Art Illumina (for short reads) and Simlord (for PacBio reads) provide a SAM file with the alignment that corresponds to the true origin of each read. Having these files facilitates downstream analysis: when I'm benchmarking something else and using the alignment as ground truth, I don't have to worry about potential alignment errors.
This feature was requested long ago by several users. We haven't implemented it yet because we feel it is somewhat redundant with what we already have. The headers of the simulated reads already indicate where the true alignment starts, and the error_profile contains the location and type of the errors introduced on each read. Together, these two should be sufficient to trace the original reference sequence.
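To make that concrete, here is a rough sketch of how a start position plus a list of introduced errors could be turned into a SAM record. The input layout is an assumption for illustration only: it does not reflect the simulator's actual header or error_profile format, and `build_cigar`, `sam_record`, and the `(offset, kind, length)` error tuples are hypothetical names. Offsets are taken to be 0-based positions within the reference segment the read was drawn from, and the read sequence is assumed to be given in reference orientation already.

```python
def build_cigar(ref_span, errors):
    """Build a CIGAR string from a list of introduced errors.

    ref_span: number of reference bases covered by the read (assumed input).
    errors:   iterable of (offset, kind, length), where offset is a 0-based
              position in the reference segment and kind is "mis", "ins",
              or "del" (hypothetical layout, not the real error_profile).
    """
    ops = []  # list of [length, op], merged when adjacent ops are equal

    def push(n, op):
        if n <= 0:
            return
        if ops and ops[-1][1] == op:
            ops[-1][0] += n
        else:
            ops.append([n, op])

    cursor = 0  # reference bases consumed so far
    for offset, kind, length in sorted(errors):
        push(offset - cursor, "M")  # error-free stretch before this error
        cursor = max(cursor, offset)
        if kind == "mis":
            push(length, "M")  # mismatches consume read and reference
            cursor += length
        elif kind == "del":
            push(length, "D")  # deletions consume reference only
            cursor += length
        elif kind == "ins":
            push(length, "I")  # insertions consume read only
        else:
            raise ValueError(f"unknown error kind: {kind}")
    push(ref_span - cursor, "M")  # error-free tail
    return "".join(f"{n}{op}" for n, op in ops)


def sam_record(read_name, ref_name, start0, strand, cigar, seq):
    """Format one SAM line; start0 is 0-based, SAM POS is 1-based."""
    flag = 16 if strand == "-" else 0
    fields = [read_name, str(flag), ref_name, str(start0 + 1),
              "255",            # MAPQ 255 = mapping quality not available
              cigar, "*", "0", "0", seq, "*"]
    return "\t".join(fields)


cigar = build_cigar(20, [(5, "mis", 1), (10, "ins", 2), (15, "del", 3)])
print(sam_record("read1", "chr1", 99, "+", cigar, "A" * 19))
```

Parsing the real read headers and error_profile into this `(offset, kind, length)` form is the part that depends on the simulator's output format and is left out here.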
It would be really helpful to report the true alignment of reads to the reference in SAM format. For example, Simlord does this when simulating PacBio reads.