Pasquali-lab / UMI4Cats

An R package for analyzing UMI-4C chromatin contact data.
https://pasquali-lab.github.io/UMI4Cats/

UMI counts look strange #15

Closed amhaaning closed 1 month ago

amhaaning commented 2 months ago

Hello! I am using your package with 4C-seq data from two biological replicates of a single condition, and the UMI counts look really strange. I used your vignette code to read in my data and generate UMIs from forward and reverse reads, both raw and with optical duplicates removed (dedup). Based on various QC metrics from fastqc and Picard tools, the two replicates look extremely similar. Rep2 has slightly more reads, but that's the only notable difference. However, the number of UMIs generated with this package is extremely different between the two replicates, and the direction of the difference flips after duplicate removal. I have also analyzed the data two other ways, using the pipe4C R package and my own custom code, and I'm not seeing major differences in the overall abundance of counts. Attached is the plot generated by statsUMI4C. Do you know what could be causing this? I would really like to use your package in the future to compare two conditions, so I'd like to understand what is going on with the UMI counts. Thanks!

Attachment: UMI4Cats_Stats_dedup.pdf

mireia-bioinfo commented 1 month ago

Hi Allison! Our package is only appropriate for UMI-4C data (https://www.nature.com/articles/nmeth.3922) and will not work properly with 4C-seq data. UMI-4C has an additional sonication step (among other things) that our package takes advantage of to remove PCR duplicates, which is probably the cause of your conflicting results.
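
To illustrate why the sonication step matters, here is a toy Python sketch. It is not the UMI4Cats implementation; the function names, the fragment label, and the numbers are hypothetical. It assumes that duplicates are collapsed by the pair (ligated fragment, molecule end), that sonication gives each UMI-4C molecule a random end, and that in 4C-seq the end is fixed by a restriction site.

```python
import random

def simulate(n_molecules, pcr_copies, sonicated, seed=0):
    """Return reads as (fragment, molecule_end) pairs (toy model)."""
    rng = random.Random(seed)
    reads = []
    for _ in range(n_molecules):
        # UMI-4C (assumed): sonication breaks each molecule at a random position.
        # 4C-seq (assumed): the end is a restriction site, identical for all molecules.
        end = rng.randrange(10_000) if sonicated else 0
        # PCR amplification yields identical copies of the same molecule.
        reads.extend([("fragment_A", end)] * pcr_copies)
    return reads

def count_umis(reads):
    """Collapse reads sharing (fragment, end) and count what is left per fragment."""
    counts = {}
    for fragment, _end in set(reads):
        counts[fragment] = counts.get(fragment, 0) + 1
    return counts

# Same input in both cases: 50 molecules, each amplified 20 times.
umi4c_counts = count_umis(simulate(50, 20, sonicated=True))
seq4c_counts = count_umis(simulate(50, 20, sonicated=False))
print("UMI-4C-like:", umi4c_counts)
print("4C-seq-like:", seq4c_counts)
```

In this toy model the sonicated library recovers a count close to the number of input molecules, while the 4C-seq-like library collapses to a single "UMI" per fragment regardless of how many molecules were present. Counts derived this way from 4C-seq reads therefore would not reflect molecule abundance.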