Open ashishjain1988 opened 7 months ago
Hi @ashishjain1988, it is hard to tell why some MCC values are missing. I wouldn't be concerned about it. More important is to find acceptable True and False positive rate cutoffs. I'd be conservative and pick 10, but 7-8 is also OK. We already discussed that HiCcompare is robust to the choice of A https://github.com/dozmorovlab/HiCcompare/issues/29#issuecomment-1535572871 because small differences are unlikely to be detected as statistically significant. I'll keep an eye on missing MCC values and debug when I have an example.
Hi @mdozmorov, thank you for your response. This data is more deeply sequenced than the previous one. One thing I want to ask about is the TPR and FPR. Based on this plot, it seems like the false positive rate is much higher than the true positive rate at A.min=10. Is that still a good threshold? Also, the default threshold of 2 is not giving us any significant contacts.
I overlooked that the curves are inverted; this is indeed confusing. Here's the explanation from my student, @hamy12398:
Their plot can occur because it depends on the number of changes they set (in the example above, I set numChanges to 30). The denominator of the MCC formula is built from products of pairwise sums of TP, TN, FP, and FN, so if that denominator happens to equal 0, MCC is undefined.
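For reference, the standard definition of the Matthews correlation coefficient makes the point above concrete. This is the textbook formula, not a transcription of the HiCcompare source, so whether `filter_params` hits exactly this case for these data is an assumption rather than something confirmed in this thread:

```latex
\mathrm{MCC} =
  \frac{TP \cdot TN - FP \cdot FN}
       {\sqrt{(TP + FP)\,(TP + FN)\,(TN + FP)\,(TN + FN)}}
```

The denominator is zero whenever any one of the four sums is zero. For example, if no interactions are called significant at a given A cutoff, then TP + FP = 0 and MCC is undefined at that cutoff, which would show up as a gap in the curve.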
What parameters did you use for `filter_params()`? Can you try with `numChanges = 30`?
I was actually carrying out the analysis at 25kbp resolution and, as mentioned in the manual, I proportionally increased numChanges to 2500 (`filter_params(hic.list[[i]], numChanges = 2500)`). Is that too much for 25kbp resolution? I will try `numChanges = 30` too. Thanks!
Below is the plot I got using the `filter_params` function for chromosome 4. The resolution I used is 25kbp and `numChanges = 30`. It seems like all the results are FPR.
It is hard to tell without seeing the data. Have you tried to visualize single matrices? It may be that the data is very sparse at 25kbp resolution.
This is what the contact data looks like for individual samples. The scale is log2.
The data looks good. I still cannot say why your A plot looks strange. Try debugging the actual function. Again, the A threshold is not that critical; I would explore the MD plot, call differential interactions, and visualize them.
Thanks! I will look into that.
Hi,
I am trying to use the filter_params function to select the optimum A.min value for filtering. We are interested in contacts on chromosome 4. When I check the plot, it seems to be missing the MCC values for some A values (approximately 2 to 8). Is there a reason the package is not able to calculate these MCC values? Here is the plot that I got for chromosome 4.