YuanlongLiu closed 3 years ago
Thanks for reporting this! It has been fixed in the latest commit.
Hi, it should be:
new_id_R1 <- paste0(umis, ":", "UMI4C:", filtered_reads + seq(1, length(filtered_reads_fqR1)), ":R1")
not
new_id_R1 <- paste0(umis, ":", "UMI4C:", seq(filtered_reads + 1, length(filtered_reads_fqR1)), ":R1")
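To make the difference between the two forms concrete, here is a minimal sketch with made-up numbers. It assumes that filtered_reads holds the count of reads handled in earlier iterations; chunk_size is a hypothetical stand-in for length(filtered_reads_fqR1).

```r
# Hypothetical values: 5 reads already processed, current chunk holds 3 reads.
filtered_reads <- 5
chunk_size <- 3  # stands in for length(filtered_reads_fqR1)

# Offset added to a per-chunk index: one number per read, continuing the count.
ids_offset <- filtered_reads + seq_len(chunk_size)
print(ids_offset)   # 6 7 8

# Start shifted but end left at the chunk length: seq() counts down from 6 to 3,
# so the result has the wrong length and overlaps earlier numbers.
ids_shifted <- seq(filtered_reads + 1, chunk_size)
print(ids_shifted)  # 6 5 4 3
```

With the second form, the vector only has the right length when filtered_reads is 0, which is presumably why the problem does not show up in the first iteration.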
In the .singlePrepUMI4C function, new_id_R1 and new_id_R2 are computed from 'seq_len(length(filtered_reads_fqR1))', which restarts at 1 in every iteration of the 'repeat' loop, i.e., reads from different iterations will get the same read number.
This new_id_xx is used later on in 'gr_sp <- GenomicRanges::GRangesList(GenomicRanges::split(gr, gr$readID))'. I suspect this will cause problems, because reads that share an ID end up in the same group.
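The effect can be reproduced without the package. The sketch below uses base R's split() on plain character vectors in place of GenomicRanges::split on a GRanges object, and the read names and chunk sizes are invented for illustration, so treat it as an analogy rather than the package's actual code path.

```r
# Two chunks, as they might be read in successive iterations of the loop.
chunks <- list(c("readA", "readB", "readC"), c("readD", "readE"))
reads <- unlist(chunks)

# Per-chunk numbering restarts at 1 in every iteration.
per_chunk <- unlist(lapply(chunks, seq_along))  # 1 2 3 1 2

# Carrying an offset across iterations keeps the numbering unique.
offset <- 0
running <- integer(0)
for (chunk in chunks) {
  running <- c(running, offset + seq_along(chunk))
  offset <- offset + length(chunk)
}                                               # running is 1 2 3 4 5

# split() groups by ID, so duplicated IDs merge unrelated reads.
merged <- split(reads, per_chunk)    # 3 groups; group "1" holds readA and readD
separate <- split(reads, running)    # 5 groups, one read each
```

If the real IDs also include the UMI, two reads collide only when both the UMI and the number match, so the merging may be rarer in practice than in this toy case.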