Pasquali-lab / UMI4Cats

An R package for analyzing UMI-4C chromatin contact data.
https://pasquali-lab.github.io/UMI4Cats/

Apply to reads with unequal read length #7

Closed. YuanlongLiu closed this issue 3 years ago.

YuanlongLiu commented 3 years ago

Hi, just another comment: there can be situations where the reads in a FASTQ file do not all have the same length, for example after adaptor trimming. There are multiple places in the package where the "unique" function is used to collect information from different reads, which could be an issue in that case.
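To illustrate the concern, here is a minimal base-R sketch. The reads are made up and this is not code from UMI4Cats; it only shows how `unique()` on read lengths stops returning a single value once some reads have been trimmed.

```r
# Hypothetical reads: the second one was shortened by adaptor trimming.
reads <- c("ACGTACGTAC", "ACGTAC", "TTGCATGCAA")

# Length of each read in bases.
read_lengths <- nchar(reads)   # 10 6 10

# With untrimmed reads this would be a single value; here it has two.
distinct_lengths <- unique(read_lengths)

if (length(distinct_lengths) != 1) {
  message("Reads have ", length(distinct_lengths), " different lengths")
}

# One possible defensive alternative (an assumption, not the package's
# behaviour): reduce the lengths to a single number explicitly.
max_length <- max(read_lengths)
```

Any downstream code that expects `unique(read_lengths)` to be a single number would receive a vector of length two here.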

msubirana commented 3 years ago

Hi YuanlongLiu,

UMI4Cats is designed for FASTQ files in which all reads have the same length. Trimming the reads is not necessary, because there is a strict QC process in which reads of inadequate quality are filtered out.

The "unique" function is necessary for the collapsing of PCR duplicates and for the correct generation of the UMIs.

YuanlongLiu commented 3 years ago

Hi, I mean these ones:

R/statsUMI4C.R:

R/demultiplexFastq.R:

R/contactsUMI4C.R: