cnobles / iGUIDE

Bioinformatic pipeline for identifying dsDNA breaks by marker based incorporation, such as breaks induced by designer nucleases like Cas9.
https://iguide.readthedocs.io/en/latest/
GNU General Public License v3.0

Specification of R2_Leading_Trim_ODN #87

Open racng opened 1 month ago

racng commented 1 month ago

In simulation.config.yaml, R2_Leading_Trim_ODN is specified as ACGCGA. Should it be TACGCGA instead, since that would be the full subsequence of the iGUIDE dsODN that is not primed by the GSP2_neg_Nuc_off primer from the iGUIDE paper? Or is it shortened on purpose?

R1_Leading_Trim : "."
R1_Overreading_Trim : "TCGCGTATACCGTTATTAACATATGACAACTCAA"
R2_Leading_Trim : "TTGAGTTGTCATATGTTAATAACGGTAT"
R2_Leading_Trim_ODN : "ACGCGA"
R2_Overreading_Trim : "AGATCGGAAGAGCGTCGTGT"


cnobles commented 1 month ago

Thanks for bringing that up. It's a small detail, but it can have a big impact. Sequence processing happens in several sequential steps. The R2_Leading_Trim sequence, which includes the 'T' you are wondering about, is removed first; in a following step, the R2_Leading_Trim_ODN sequence is trimmed from what remains.

Basically, if you don't remove that 'T' from the sequences in the initial R2_Leading_Trim trimming step, then you would want to include it for removal in the R2_Leading_Trim_ODN trimming step (rule: seq_trim_ic_read_odn).
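To make the ordering concrete, here is a minimal Python sketch of the two trimming steps. It is not the pipeline's code: the helper names `trim_leading` and `trim_r2` and the genomic sequence are made up for illustration, and it assumes exact prefix matching, whereas the real trimming steps may tolerate mismatches. Only the two config values are taken from this thread.

```python
from typing import Optional

# Values copied from simulation.config.yaml as quoted in this thread.
R2_LEADING_TRIM = "TTGAGTTGTCATATGTTAATAACGGTAT"
R2_LEADING_TRIM_ODN = "ACGCGA"


def trim_leading(read: str, prefix: str) -> Optional[str]:
    """Return the read without `prefix`, or None if it does not start with it.

    Assumption: exact matching only (hypothetical helper, not an iGUIDE function).
    """
    if read.startswith(prefix):
        return read[len(prefix):]
    return None


def trim_r2(read: str, leading: str, odn: str) -> Optional[str]:
    """Apply the two steps in order: leading (primed) trim, then ODN trim."""
    after_leading = trim_leading(read, leading)
    if after_leading is None:
        return None
    return trim_leading(after_leading, odn)


# Made-up genomic sequence following the dsODN-derived part of read 2.
genomic = "GGCTAGCTAGGATCCA"
read = R2_LEADING_TRIM + R2_LEADING_TRIM_ODN + genomic

# As configured: the 'T' is the last base of R2_Leading_Trim.
as_configured = trim_r2(read, R2_LEADING_TRIM, R2_LEADING_TRIM_ODN)

# Alternative: drop the 'T' from the first step and add it to the ODN step.
t_moved = trim_r2(read, R2_LEADING_TRIM[:-1], "T" + R2_LEADING_TRIM_ODN)

# Listing the 'T' in both steps fails, because it is already gone
# by the time the ODN step runs.
t_in_both = trim_r2(read, R2_LEADING_TRIM, "T" + R2_LEADING_TRIM_ODN)
```

Under these assumptions, `as_configured` and `t_moved` both end up as the genomic sequence, while `t_in_both` is `None`. So the 'T' needs to live in exactly one of the two config values.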

You are correct: that 'T' is incorporated with the dsODN but is not primed against, so it could be included in the R2_Leading_Trim_ODN sequence and treated as part of the "bit" of sequence that we added to the dsODN. Without the bit sequence, that 'T' could be used in the same way, to confirm that priming at a genomic location is associated with a true incorporation and not a mispriming site. But a single T is very common in the genome, so on its own it doesn't make for a very good confirmation.
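As a rough back-of-the-envelope illustration of why one base is weak evidence, the sketch below estimates how often a motif would match by chance. It assumes independent bases at a uniform 25% frequency each, which real genomes do not follow, and it takes the configured ACGCGA as the bit sequence. The function name is hypothetical.

```python
def chance_match_probability(motif: str, base_freq: float = 0.25) -> float:
    """Probability that a random sequence starts with `motif`.

    Assumption: independent bases, each with frequency `base_freq`.
    """
    return base_freq ** len(motif)


p_single_t = chance_match_probability("T")          # 1 in 4
p_bit = chance_match_probability("ACGCGA")          # 1 in 4096
p_t_plus_bit = chance_match_probability("TACGCGA")  # 1 in 16384
```

Under this simplified model, a lone 'T' would match about a quarter of misprimed reads by chance, while the six-base sequence would match roughly 1 in 4096.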

Happy to connect through email as well if you would like to discuss.