cool file and output file

hansenlab / bnbc


Open Nuturetree opened 1 week ago

Nuturetree commented 1 week ago

Hi author! Thank you for developing such a user-friendly tool and providing detailed documentation. I have two simple questions:

1. Does a cool file need to be balanced (e.g. with cooler balance test.cool)?
2. How can BNBC write the batch effect removal results to an output file?

Looking forward to your response. Nutures

cafletezbrant commented 1 week ago

Hi Nutures,

Thanks for the kind words! To your questions:

1. No, matrix balancing is not required for use of BNBC, if your downstream analytic goal is to compare Hi-C data across samples. Note that comparison of interactions within one sample does require correction such as matrix balancing, however.
2. Can you clarify what you are after here? As a function, bnbc::bnbc() returns a set of matrices, represented as a ContactGroup object. The matrices themselves can be obtained by calling bnbc::contacts() on the output of bnbc::bnbc(). You can then write these matrices to disk using whichever method you prefer for writing R matrices/data.frames.
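
As a minimal sketch of that last step (base R only, not part of BNBC itself): assuming the corrected matrices are available as a named list of numeric matrices, one per sample, something like the following would write each one to a tab-separated file. The stand-in list mats, the helper write_matrices and the file names are hypothetical; in practice mats would be whatever bnbc::contacts() returns for the bnbc::bnbc() output.

```r
# Stand-in for contacts(bnbc_output): a named list of square matrices,
# one per sample. These values are made up for illustration only.
set.seed(1)
mats <- list(
  sampleA = matrix(rnorm(16), nrow = 4),
  sampleB = matrix(rnorm(16), nrow = 4)
)

# Hypothetical helper: write each matrix to <out_dir>/<name>.tsv
# and return (invisibly) the paths that were written.
write_matrices <- function(mats, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  paths <- file.path(out_dir, paste0(names(mats), ".tsv"))
  for (i in seq_along(mats)) {
    write.table(mats[[i]], file = paths[i], sep = "\t",
                quote = FALSE, row.names = FALSE, col.names = FALSE)
  }
  invisible(paths)
}

out_dir <- file.path(tempdir(), "bnbc_corrected")
paths <- write_matrices(mats, out_dir)

# Alternatively, keep the whole list in a single R-native file.
saveRDS(mats, file.path(out_dir, "bnbc_corrected.rds"))
```

Plain text files are convenient for use outside R, while saveRDS() preserves the list structure for reloading with readRDS().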

Hope this helps, Kipper

Nuturetree commented 1 week ago

Hi Kipper, Thank you very much for your prompt response. I need to confirm a few details with you:

1. I can use cooler to generate a cool file. Does this file need to be balanced with cooler's built-in balancing step before it is used as input for BNBC?
2. I noticed that BNBC has its own data normalization steps, cgEx.cpm <- logCPM(cgEx) and cgEx.smooth <- boxSmoother(cgEx.cpm, h=5). Are these steps necessary?
3. How should the statement "Note that comparison of interactions within one sample does require correction such as matrix balancing, however" be understood? Could you provide an example for clarification?

Looking forward to your response and thank you very much. Nutures

cafletezbrant commented 6 days ago

Hi,

1. No, matrix balancing is not required for use of BNBC.
2. logCPM is essential; boxSmoother often performs well at improving concordance between replicates.
3. I think my question is very germane - what is your analytic goal? What comparisons are you trying to make?