Open katreya42 opened 2 years ago
Hi, the GAM model we are fitting in "calculate-cnv" is designed to account for biased induced by GC content, mappability, and density of restriction enzyme sites. I think the reason for this error is that pulldown-based contact maps contain additional sources of variances that cannot be explained by the model. Have you tried running "calculate-cnv" at lower resolutions (25kb/50kb/100kb)?
I had exactly the same issue; I was using Micro-C, so also MNase digested, and ran into exact same error message at 10kb.
Processed the data at 50kb, and it seems to be working now.
I had exactly the same issue; I was using Micro-C, so also MNase digested, and ran into exact same error message at 10kb.
Processed the data at 50kb, and it seems to be working now.
same issue for my data at 10kb. 100kb is fine, but the assemble-complexSVs seems require higher resolution at least 25kb?
Hey guys, thanks for these valuable feedbacks, I will seriously consider this issue in my future versions. Since I might need more time to investigate this specific case for --enzyme uniform
, I just want to remind you that you don't have to perform the CNV normalization before you run assemble-complexSVs
and neoloop-caller
. In theory running in --balance-type ICE
mode can also give equally accurate results if your sample doesn't have severe copy number variations.
Hi there and thank you for this great tool.
I'm trying to run CNV-normalization on MNase-digested HiChIP samples that had been processed through HiC-Pro and JuicerTools to make hic maps.
Running 'hic2cool convert' and 'cooler balance' both complete fine (cool file generated is at 10kb resolution).
Setting the '--enzyme uniform' flag leads to a runtime error:
calculate-cnv --hic MG63_3_3T3_rep2_GW7_10000.cool --genome hg19 --enzyme uniform --output MG63_3_3T3_rep2_GW7_10000_detected_cnv.txt ARGUMENT LIST: Cool URI = MG63_3_3T3_rep2_GW7_10000.cool Reference Genome = hg19 Enzyme = uniform Output Path = MG63_3_3T3_rep2_GW7_10000_detected_cnv.txt Cache Folder = /mnt/pan/SOM_GEN_PXS183/jdc14/HICHIP/OSTEO/CAS2997_Homo_sapiens_072921/MG63_3_3T3_rep2_GW7 Log file name = cnv-calculation.log root INFO @ 11/18/21 11:31:19: hg19_mappability_50mer.bw was not found in the cache folder, downloading ... root INFO @ 11/18/21 11:32:07: hg19.uniform.npz was not found in the cache folder, downloading ... root INFO @ 11/18/21 11:32:14: hg19_1kb_GCPercent.bw was not found in the cache folder, downloading ... root INFO @ 11/18/21 11:32:28: Calculate the 1D coverage from Hi-C matrix ... root INFO @ 11/18/21 11:32:39: Load GC content ... root INFO @ 11/18/21 11:35:41: Load mappability scores ... root INFO @ 11/18/21 11:39:11: Count the number of cut sizes for each bin ... root INFO @ 11/18/21 11:41:43: Filter out invalid bins ... numexpr.utils INFO @ 11/18/21 11:41:43: Note: NumExpr detected 16 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8. numexpr.utils INFO @ 11/18/21 11:41:43: NumExpr defaulting to 8 threads. root INFO @ 11/18/21 11:41:43: Done root INFO @ 11/18/21 11:41:43: Fitting a Generalized Additive Model with log link and Poisson distribution ... Traceback (most recent call last): File "/home/jdc14/miniconda3/envs/neoloop/bin/calculate-cnv", line 147, in run gam = mgcv.gam(fomula, family=stats.poisson(link='log')) File "/home/jdc14/miniconda3/envs/neoloop/lib/python3.7/site-packages/rpy2/robjects/functions.py", line 178, in call return super(SignatureTranslatedFunction, self).call(*args, *kwargs) File "/home/jdc14/miniconda3/envs/neoloop/lib/python3.7/site-packages/rpy2/robjects/functions.py", line 106, in call res = super(Function, self).call(new_args, **new_kwargs) rpy2.rinterface.RRuntimeError: Error in smooth.construct.tp.smooth.spec(object, dk$data, dk$knots) : A term has fewer unique covariate combinations than specified maximum degrees of freedom
I tried setting '--enzyme Arima' out of curiosity (which I had used in the past for some Arima data) and calculate-cnv completed with no issues.
Attached is a shot of the unnormalized hic matrix at 10kb resolution around chr8:119Mb-125Mb to get an idea of matrix sparsity.
Thanks for any help you can provide.