rr1859 / R.4Cker

MIT License

nearBait analysis comparing to farCis analysis #10

Open tlgolan opened 8 years ago

tlgolan commented 8 years ago

Hi, For some of my 4Cseq libraries I managed to complete the 4Cker analysis for both the near-bait and far-cis regions. When I examined the results, I noticed that within a 10 Mb window around the bait (I used a 6 bp cutter as the primary RE) the called interactions differ depending on the analysis (nearBait/farCis), for the same bait and biological condition. I used k=5 for nearBait and k=10 for farCis. Is this normal? If so, which interactions should I 'believe'? Thanks

rr1859 commented 8 years ago

I would expect the calls to differ in resolution since k is different, but they should still be similar. How different are the calls? Can you send a screenshot of your results with the raw data and the calls from the two analyses?
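One rough way to put a number on "how different" the two call sets are is to measure how many base pairs they share inside the window of interest. The sketch below is not part of 4Cker; it assumes the calls have already been read into lists of half-open `(start, end)` intervals on the bait chromosome, and the coordinates and function names are made up for illustration.

```python
def merge(intervals):
    """Merge overlapping or touching half-open intervals (start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def covered_bp(intervals):
    """Total number of base pairs covered by an interval set."""
    return sum(end - start for start, end in merge(intervals))


def overlap_bp(a, b):
    """Number of base pairs covered by both interval sets."""
    a, b = merge(a), merge(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def jaccard(a, b):
    """Shared base pairs divided by base pairs covered by either set."""
    inter = overlap_bp(a, b)
    union = covered_bp(a) + covered_bp(b) - inter
    return inter / union if union else 0.0


# Hypothetical calls, not taken from the attached files.
near_bait_calls = [(100, 200), (300, 400)]
far_cis_calls = [(150, 350)]
print(overlap_bp(near_bait_calls, far_cis_calls), jaccard(near_bait_calls, far_cis_calls))
```

A Jaccard value close to 1 would mean the nearBait and farCis calls largely agree in that window; a value close to 0 would mean they mostly call different regions.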

tlgolan commented 8 years ago

Hi, Attached are the output files of the nearBait and farCis analyses for two 4Cseq libraries, R_SFM and W_Fib, in text format. I used k=5 for nearBait and k=10 for farCis. Also, from the cis analysis files, I noticed that most regions of the cis chromosome are called as interacting with the bait (mostly for the W_Fib library), which seems weird to me.

Thanks for your help and time

R_SFM_cis_highinter_k10.txt R_SFM_nearbait_highinter_k5.txt W_Fib_cis_highinter_k10.txt W_Fib_nearbait_highinter_k10.txt

rr1859 commented 8 years ago

Hi, Would it be possible to send the raw data?

tlgolan commented 8 years ago

Hi, Of course. Attached is the raw data for the Myh7_Fib and R_SFM_Myh7 libraries (cis reads only).

dataframe_Myh7_Fib_Mapq0_allFrag.txt dataframe_R_SFM_Myh7_Mapq0_allFrag.txt

rr1859 commented 8 years ago

The coverage in your samples looks very sparse by eye; I would check carefully that they passed QC. Also, when you load the file in IGV and set it to autoscale, there seems to be one fragment near the bait that still has a very high count. Are you sure the self-ligated and undigested fragments have been removed?

tlgolan commented 8 years ago

Hi, The coverage is indeed very sparse: in a 2 Mb window around the bait it is estimated at ~10% (the estimate was made with the Basic4Cseq package). Besides that, the libraries passed QC (total read number >1M, cis/total >40%). I removed the self-ligated and undigested fragments as follows: I found the primer fragment and removed the most proximal fragment above and below the bait fragment that contained more than 1 read. Also, how can you explain the fact that, according to the output of 4Cker, almost the entire nearBait region is defined as interaction(s)? Thanks again for your kind help
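For reference, a coverage figure like the ~10% above can be approximated as the fraction of restriction fragments in the window that have at least one read. The sketch below uses that simple definition, which may not match how Basic4Cseq computes its estimate; the `(start, end, count)` layout, the function name, and the example values are assumptions for illustration.

```python
def window_coverage(fragments, bait_pos, half_window=1_000_000):
    """Fraction of fragments within +/- half_window of the bait that have reads.

    fragments: iterable of (start, end, count) on the bait chromosome,
    with half-open coordinates [start, end).
    """
    lo, hi = bait_pos - half_window, bait_pos + half_window
    # Keep the counts of fragments that overlap the window.
    counts = [count for start, end, count in fragments if end > lo and start < hi]
    if not counts:
        return 0.0
    return sum(1 for count in counts if count > 0) / len(counts)


# Hypothetical fragments: ten 1 kb fragments, only one of them with reads.
example_fragments = [(i * 1000, (i + 1) * 1000, 5 if i == 3 else 0) for i in range(10)]
print(window_coverage(example_fragments, bait_pos=5000))
```

With the default `half_window` of 1 Mb this covers the same 2 Mb window mentioned in the comment.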

rr1859 commented 8 years ago

You have one fragment that has a count of 731,794, and the counts for the fragments adjacent to it drop to ~100-500. If that fragment is not the undigested or self-ligated fragment, I would first try to understand why it has such a high count, based on where it is located relative to the bait and the RE sites around it, and then run the QC again without that fragment. It is very unusual to have a single fragment with such a high count near the bait with no neighboring fragments supporting it. I think the model is not correctly learning the effect of the distance from the bait on the counts, and I would not trust any of the calls since they do not seem to match your raw data. Also, if you are interested in the interactions near the bait, having only ~10% coverage would not be enough to call interactions.
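A quick way to spot a fragment like this before running the analysis is to compare each count with its immediate neighbors. The following is only an illustrative check, not something 4Cker does itself: the `ratio` and `flank` parameters are arbitrary choices, and the neighbor counts in the example are invented values in the ~100-500 range described above.

```python
def isolated_spikes(counts, ratio=100, flank=2):
    """Return indices of fragments whose count dwarfs all nearby fragments.

    counts: read counts per fragment, ordered by genomic position.
    A fragment is flagged when its count is at least `ratio` times the
    largest count among the `flank` fragments on either side.
    """
    spikes = []
    for i, count in enumerate(counts):
        neighbors = counts[max(0, i - flank):i] + counts[i + 1:i + 1 + flank]
        if not neighbors:
            continue
        # Use at least 1 as the reference so empty neighbors do not divide by zero.
        reference = max(max(neighbors), 1)
        if count >= ratio * reference:
            spikes.append(i)
    return spikes


# Only 731794 comes from the thread; the other counts are made up.
example_counts = [120, 480, 731794, 310, 95]
print(isolated_spikes(example_counts))
```

Any flagged index is worth checking against the bait position and the surrounding RE sites, since an unsupported spike next to the bait is the typical signature of an undigested or self-ligated fragment.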

tlgolan commented 8 years ago

Hi Ramya, the fragment containing 731,794 reads is the primer (bait) fragment. Does that make sense to you?

rr1859 commented 8 years ago

Then that should be the undigested fragment and it should be removed.
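Dropping that fragment from the counts table before rerunning could look roughly like the sketch below. It assumes the raw data is a four-column table of chromosome, start, end, and count with half-open coordinates, which may not match the attached files exactly; the function name, chromosome, and coordinates are hypothetical.

```python
def drop_bait_fragment(rows, bait_chrom, bait_pos):
    """Return rows without the fragment that contains the bait position.

    rows: iterable of (chrom, start, end, count), half-open [start, end).
    """
    return [
        (chrom, start, end, count)
        for chrom, start, end, count in rows
        if not (chrom == bait_chrom and start <= bait_pos < end)
    ]


# Hypothetical coordinates; only the 731794 count comes from the thread.
example_rows = [
    ("chr14", 1000, 2000, 480),
    ("chr14", 2000, 3000, 731794),  # fragment containing the bait primer
    ("chr14", 3000, 4000, 310),
]
filtered_rows = drop_bait_fragment(example_rows, "chr14", 2500)
print(filtered_rows)
```

The filtered table can then be written back out and used as input for QC and for the nearBait and farCis analyses.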