Pasquali-lab / UMI4Cats

An R package for analyzing UMI-4C chromatin contact data.
https://pasquali-lab.github.io/UMI4Cats/

[BUG] Different reads have the same read number #5

Closed YuanlongLiu closed 3 years ago

YuanlongLiu commented 3 years ago

In the .singlePrepUMI4C function, new_id_R1 and new_id_R2 are computed from 'seq_len(length(filtered_reads_fqR1))'. This sequence restarts at 1 on every iteration of the 'repeat' loop, so the numbers can be duplicated across iterations, i.e., different reads will get the same read number.

This new_id_xx is used later on in 'gr_sp <- GenomicRanges::GRangesList(GenomicRanges::split(gr, gr$readID))'. I have the feeling that this will cause problems.
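To make the concern concrete, here is a minimal base-R sketch of the collision. It is not the package's code: the chunk sizes, the `ids_restart` and `reads` names, and the shortened ID format are hypothetical, and plain `split()` stands in for the GenomicRanges call.

```r
# Hypothetical stand-in for the 'repeat' loop: two chunks of 3 reads each.
chunk_sizes <- c(3, 3)

# IDs built from seq_len(n) restart at 1 in every chunk.
ids_restart <- unlist(lapply(chunk_sizes, function(n) {
  paste0("UMI4C:", seq_len(n))
}))

# The same ID now appears once per chunk.
has_duplicates <- any(duplicated(ids_restart))

# Grouping by ID merges reads from different chunks into one group.
reads <- data.frame(readID = ids_restart, chunk = rep(1:2, each = 3))
groups <- split(reads$chunk, reads$readID)
group_sizes <- lengths(groups)
```

Under these assumptions each of the three IDs ends up with two unrelated reads, one from each chunk, which is the kind of mixing a split on readID would produce.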

mireia-bioinfo commented 3 years ago

Thanks for reporting this! It has been fixed in the latest commit.

YuanlongLiu commented 3 years ago

Hi, it should be:

new_id_R1 <- paste0(umis, ":", "UMI4C:", filtered_reads + seq(1, length(filtered_reads_fqR1)), ":R1")

not

new_id_R1 <- paste0(umis, ":", "UMI4C:", seq(filtered_reads + 1, length(filtered_reads_fqR1)), ":R1")
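The difference between the two expressions can be checked with small numbers. This is only a sketch: the values below are hypothetical, with `chunk_n` standing in for `length(filtered_reads_fqR1)` and `filtered_reads` for the count of reads handled in earlier iterations.

```r
# Hypothetical values: 5 reads already processed, 3 reads in the current chunk.
filtered_reads <- 5L
chunk_n <- 3L  # stands in for length(filtered_reads_fqR1)

# Offset form: one new number per read, continuing after the earlier reads.
offset_ids <- filtered_reads + seq_len(chunk_n)  # 6 7 8

# Range form: counts from filtered_reads + 1 to chunk_n, which here runs
# backwards and has the wrong length for a chunk of 3 reads.
range_ids <- seq(filtered_reads + 1, chunk_n)    # 6 5 4 3

# paste0 recycles its arguments silently, so the length mismatch would not
# raise an error when the IDs are assembled.
umis <- c("AAA", "CCC", "GGG")  # hypothetical UMIs, one per read
ids_from_range <- paste0(umis, ":UMI4C:", range_ids, ":R1")
```

With these values the offset form yields exactly one ID per read, while the range form yields four IDs for three reads. The two forms only agree when `filtered_reads` is 0.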