Open racng opened 1 month ago
Thanks for bringing that up; it's a small detail, but it can have a big impact. Sequence processing runs as several sequential steps, including removal of the `R2_Leading_Trim` sequence (which includes the 'T' that you are wondering about). That sequence is removed first, and then, in a following step, the reads are trimmed for the `R2_Leading_Trim_ODN` sequence.
Basically, if you don't remove that 'T' from the sequences initially in the `R2_Leading_Trim` trimming step, then you would want to include it for removal in the `R2_Leading_Trim_ODN` trimming step (rule: `seq_trim_ic_read_odn`).
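To illustrate why the 'T' has to be accounted for in exactly one of the two steps, here is a minimal sketch using exact prefix matching. This is not the iGUIDE implementation (the real rule may tolerate mismatches); the helper names, the `GGCC` primer portion, and the genomic sequence are placeholders invented for the example. Only `ACGCGA` comes from the config discussed in this thread.

```python
from typing import Optional

def trim_prefix(read: str, prefix: str) -> Optional[str]:
    """Return the read with the prefix removed, or None if it does not start with it."""
    if read.startswith(prefix):
        return read[len(prefix):]
    return None

def two_step_trim(read: str, leading: str, odn: str) -> Optional[str]:
    """Trim the leading sequence first, then the ODN sequence (exact match only)."""
    after_leading = trim_prefix(read, leading)
    if after_leading is None:
        return None
    return trim_prefix(after_leading, odn)

# Placeholder values: NOT the real primer or genomic sequences.
PRIMED_PORTION = "GGCC"   # hypothetical part of the read covered by the primer
ODN_BIT = "ACGCGA"        # R2_Leading_Trim_ODN as given in simulation.config.yaml
GENOMIC = "TTGACCA"       # hypothetical genomic sequence after the dsODN

read = PRIMED_PORTION + "T" + ODN_BIT + GENOMIC

# Option A: the 'T' is part of the leading trim, ODN trim is ACGCGA.
option_a = two_step_trim(read, PRIMED_PORTION + "T", ODN_BIT)

# Option B: the 'T' is left out of the leading trim, so the ODN trim must be TACGCGA.
option_b = two_step_trim(read, PRIMED_PORTION, "T" + ODN_BIT)

# Mismatch: the 'T' is in neither sequence, so the ODN step finds no match.
neither = two_step_trim(read, PRIMED_PORTION, ODN_BIT)

print(option_a, option_b, neither)
```

Under these assumptions, options A and B leave the same genomic remainder, while dropping the 'T' from both sequences makes the second step fail to match.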
You are correct: since that 'T' is incorporated with the dsODN but is not primed against, it could be included in the `R2_Leading_Trim_ODN` sequence and treated as if it were part of the "bit" of sequence that we added to the dsODN. Without the bit sequence, that 'T' could be used in the same way, to confirm that priming at the genomic location is associated with a true incorporation and not a mispriming site. But given that a single T is pretty common in the genome, it doesn't make for a very good confirmation on its own.
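As a rough back-of-the-envelope check on why a single 'T' is weak confirmation, the sketch below estimates how often a sequence of a given length would match by chance. It assumes all four bases are equally likely and independent, which real genomes are not, so the numbers are only indicative; the function name is made up for this example.

```python
def chance_match_probability(length: int) -> float:
    """Probability that a random sequence matches a fixed sequence of this length,
    assuming uniform, independent bases (a simplifying assumption)."""
    return 0.25 ** length

p_single_t = chance_match_probability(len("T"))        # 'T' alone
p_bit = chance_match_probability(len("ACGCGA"))        # the bit sequence from the config
p_t_plus_bit = chance_match_probability(len("TACGCGA"))  # 'T' plus the bit sequence

print(p_single_t, p_bit, p_t_plus_bit)
```

Under this simplification, a lone 'T' matches by chance one time in four, whereas the six-base bit sequence matches about one time in 4,096, and adding the 'T' brings that to about one in 16,384.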
Happy to connect through email as well if you would like to discuss.
In the `simulation.config.yaml`, `R2_Leading_Trim_ODN` is specified as `ACGCGA`. However, should it be `TACGCGA` instead, since this would be the full subsequence of the iGUIDE that is not primed by the GSP2_neg_Nuc_off primer from the iGUIDE paper? Or is this shortened on purpose?