Closed jo-mc closed 2 years ago
Hi, in the aggregated results we report the canonical version of each string we found, where the canonical version is the lexicographic minimum of the string and its reverse complement. For this reason you cannot find that exact string in the reads, but you can find its reverse complement (GCTCCCATG).
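To make the convention concrete, here is a minimal Python sketch of how a canonical form can be computed and how to check a read for a string in either orientation. The function names (`reverse_complement`, `canonical`, `occurs_in`) are hypothetical and are not taken from the tool's code. The sketch assumes uppercase sequences over A/C/G/T only.

```python
# Illustrative sketch only: the names below are hypothetical, not the tool's API.
# Assumes uppercase DNA sequences containing only A, C, G and T.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the result."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(seq: str) -> str:
    """Lexicographic minimum of a string and its reverse complement."""
    return min(seq, reverse_complement(seq))


def occurs_in(read: str, seq: str) -> bool:
    """True if the read contains seq in either orientation."""
    return seq in read or reverse_complement(seq) in read


print(reverse_complement("CATGGGAGC"))  # GCTCCCATG
print(canonical("GCTCCCATG"))           # CATGGGAGC
```

When checking an aggregated string against the reads by hand, searching for both orientations (as `occurs_in` does) avoids the apparent mismatch described in this issue.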
Thanks for pointing this out, by the way: we'll document it more clearly in the README.
Closing.
After running the example data and looking at the aggregated output, the specific strings and their associated reads sometimes do not match the data/IDs in child.fq. Is this correct?
Matching case: string in read_ids_aggregated.fasta: TGCCAGGAA ID: m54329U_190619_052546/165412986/ccs$
Here we find a match in child.fa:
1 @m54329U_190619_052546/165412986/ccs
2 CCATCTCAAAAAATCAATCAATCAATAAATCAATACATA............
Non-matching case: string in read_ids_aggregated.fasta: CATGGGAGC IDs: m54329U_190629_180018/58722384/ccs$m54329U_190617_231905/26280060/ccs$
Here we do not find the expected read IDs in child.fa for this string, but different IDs do match:
1 @m54329U_190619_052546/165412986/ccs
2 CCATCTCAAAAAATCAATCAATCAATAAATCAATACAT...................
29 @m54329U_190615_010947/134415974/ccs
30 GTAGGGAACACAGTCGGGCTAGAAAGTCCATTGACCACTCAGGGCCAT.....................