XiaoTaoWang / NeoLoopFinder

A computation framework for genome-wide detection of enhancer-hijacking events from chromatin interaction data in re-arranged genomes

Run segment-cnv with KeyError: 0 #29

Closed jxulab closed 2 years ago

jxulab commented 2 years ago

Hi,

Thanks for developing the wonderful tool.

Here are the commands I ran (I left out the surrounding loop for simplicity).

  1. hic2cool convert -p 28 -r 50000 hic_file/${samples[i]}_rawdata_allValidPairs.hic cooler_file/${samples[i]}.cool

  2. cooler balance -p 8 -c 10000 $work_dir/cooler_file/${samples[i]}.cool

  3. calculate-cnv -H $work_dir/cooler_file/${samples[i]}.cool -g hg19 -e DpnII --output $work_dir/NeoLoopFinder/${samples[i]}.cnv --cachefolder $work_dir/NeoLoopFinder/cachefolder

  4. segment-cnv --cnv-file $work_dir/NeoLoopFinder/${samples[i]}.cnv --binsize 10000 --output $work_dir/NeoLoopFinder/${samples[i]}.segmented.cnv --nproc 8

Steps 1, 2, and 3 all work fine, but step 4 gives me the following error.

root INFO @ 11/30/21 11:34:35:

ARGUMENT LIST:

CNV Profile = /endosome/archive/CRI/Xu_lab/Data/Hi-C/Leukemia_cells/NeoLoopFinder/HiC_AML4009.cnv

Ploidy = 2

Bin Size = 10000

Output Path = /endosome/archive/CRI/Xu_lab/Data/Hi-C/Leukemia_cells/NeoLoopFinder/HiC_AML4009.segmented.cnv

Number of Processes = 8

Log file name = cnv-seg.log

root INFO @ 11/30/21 11:34:36: Loading CNV profile ...
root INFO @ 11/30/21 11:34:46: Perform segmentation ...
neoloop.cnv.segcnv INFO @ 11/30/21 11:34:46: Segmenting Chromosome 1 ...
neoloop.cnv.segcnv INFO @ 11/30/21 11:34:51: Estimated HMM state number: 9 (log scale)
Traceback (most recent call last):
  File "/project/CRI/Xu_lab/shared/softwares/neoloop/bin/segment-cnv", line 87, in run
    work.segment()
  File "/project/CRI/Xu_lab/shared/softwares/neoloop/lib/python3.7/site-packages/neoloop/cnv/segcnv.py", line 139, in segment
    sig, hmm_seg, scale = self._segment(sig) # pure HMM-based segmentation
  File "/project/CRI/Xu_lab/shared/softwares/neoloop/lib/python3.7/site-packages/neoloop/cnv/segcnv.py", line 286, in _segment
    model.bake()
  File "pomegranate/hmm.pyx", line 755, in pomegranate.hmm.HiddenMarkovModel.bake
  File "/cm/shared/apps/python/3.7.x-anaconda/lib/python3.7/site-packages/networkx/classes/reportviews.py", line 178, in __getitem__
    return self._nodes[n]
KeyError: 0

I checked the input file 'HiC_AML4009.cnv' of segment-cnv. It looks like this.

$ head HiC_AML01541.cnv
chr1 0 50000 0.2093057112348844
chr1 50000 100000 0.08672812707698796
chr1 100000 150000 0.6961023808715332
chr1 150000 200000 0.2934959879616277
chr1 200000 250000 3.765375349435547
chr1 250000 300000 2.366041058766406
chr1 300000 350000 0.0
chr1 350000 400000 0.11157161895299161
chr1 400000 450000 0.007032050350192609
chr1 450000 500000 0.017605553528705187
...

The file looks good to me, and I have no idea what is wrong.
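One way to check the profile beyond reading the first few lines is to tally its interval widths and compare them with the value given to --binsize. The following is a minimal sketch, assuming the four whitespace-separated columns shown in the excerpt (chromosome, start, end, value). The helper name `bin_widths` and the file name in the usage comment are hypothetical and not part of NeoLoopFinder.

```python
from collections import Counter


def bin_widths(lines):
    """Tally interval widths (end - start) in a 4-column CNV profile.

    `bin_widths` is a hypothetical helper for illustration only.
    """
    widths = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        start, end = int(fields[1]), int(fields[2])
        widths[end - start] += 1
    return widths


# Usage with a hypothetical path:
# with open("sample.cnv") as handle:
#     print(bin_widths(handle).most_common(3))
```

In the excerpt above every interval is 50000 bp wide, while step 4 passes --binsize 10000. The thread does not establish whether this difference played any role in the error.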

Any suggestion would be appreciated! Thank you!

XiaoTaoWang commented 2 years ago

Hi, I think this is a version compatibility issue of networkx or pomegranate. I know it might be annoying ... but can you make sure to install the same versions of these packages as listed here https://github.com/XiaoTaoWang/NeoLoopFinder#installation?

jxulab commented 2 years ago

Hi XiaoTao,

Thanks for your kind reply. I double-checked the versions of these packages and made sure that they are all exactly the same as the versions you specified on the website.

Here is the list.

conda list | grep python
biopython 1.79 py37h5e8e339_1 conda-forge
msgpack-python 1.0.2 py37h2527ec5_2 conda-forge
python 3.7.1 h381d211_1003 conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python_abi 3.7 2_cp37m conda-forge
(/project/CRI/Xu_lab/shared/softwares/neoloop) [s163795@NucleusC057 ~]$ conda list | grep cython
cython 0.29.13 py37he1b5a44_0 conda-forge
(/project/CRI/Xu_lab/shared/softwares/neoloop) [s163795@NucleusC057 ~]$ conda list | grep cooler
cooler 0.8.6 py_0 bioconda
(/project/CRI/Xu_lab/shared/softwares/neoloop) [s163795@NucleusC057 ~]$ conda list | grep numpy
numpy 1.17.2 py37h95a1406_0 conda-forge
(/project/CRI/Xu_lab/shared/softwares/neoloop) [s163795@NucleusC057 ~]$ conda list | grep scipy
scipy 1.3.1 py37h921218d_2 conda-forge
(/project/CRI/Xu_lab/shared/softwares/neoloop) [s163795@NucleusC057 ~]$ conda list | grep joblib
joblib 0.13.2 py_0 conda-forge
(/project/CRI/Xu_lab/shared/softwares/neoloop) [s163795@NucleusC057 ~]$ conda list | grep scikit-learn
scikit-learn 0.20.2 py37_blas_openblashebff5e3_1400 [blas_openblas] conda-forge
(/project/CRI/Xu_lab/shared/softwares/neoloop) [s163795@NucleusC057 ~]$ conda list | grep networkx
networkx 1.11 py37_1
(/project/CRI/Xu_lab/shared/softwares/neoloop) [s163795@NucleusC057 ~]$ conda list | grep pyensembl
pyensembl 1.8.0 py_0 bioconda
(/project/CRI/Xu_lab/shared/softwares/neoloop) [s163795@NucleusC057 ~]$ conda list | grep matplotl
matplotlib 3.1.1 py37_2 conda-forge
matplotlib-base 3.1.1 py37h250f245_2 conda-forge
(/project/CRI/Xu_lab/shared/softwares/neoloop) [s163795@NucleusC057 ~]$ conda list | grep pybigwig
pybigwig 0.3.17 py37hc013797_0 bioconda
(/project/CRI/Xu_lab/shared/softwares/neoloop) [s163795@NucleusC057 ~]$ conda list | grep pomegranate
pomegranate 0.10.0 py37hdd07704_0
(/project/CRI/Xu_lab/shared/softwares/neoloop) [s163795@NucleusC057 ~]$ conda list | grep rpy2
rpy2 2.9.4 py37mro351h6853232_0 r
(/project/CRI/Xu_lab/shared/softwares/neoloop) [s163795@NucleusC057 ~]$ conda list | grep r-mgcv
r-mgcv 1.8_23 mro351_0 r

jxulab commented 2 years ago

My .cool files were converted from .hic files using hic2cool 0.8.3. They already contain the "chr" prefix, so I didn't run 'add_prefix_to_cool.py'. calculate-cnv finishes without any problem. I am not sure whether the version of hic2cool matters.

XiaoTaoWang commented 2 years ago

Hi, I just noticed something that might be the reason. Based on your previous error message, your NeoLoopFinder is installed in a local environment, "/project/CRI/Xu_lab/shared/softwares/neoloop/lib/python3.7/site-packages/"; however, your networkx was imported from a public environment, "/cm/shared/apps/python/3.7.x-anaconda/lib/python3.7/site-packages/". There seem to be two different networkx versions installed on your machine, and the public one takes priority over the local one.
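The same check can be made directly from Python, without waiting for a traceback. The following is a minimal sketch, not part of NeoLoopFinder: the helper name `module_origin` is hypothetical, and it assumes that a package belonging to the active environment lives under `sys.prefix`. It should be run with the same interpreter that runs segment-cnv.

```python
import importlib
import os
import sys


def module_origin(name):
    """Report where `name` is imported from (hypothetical helper).

    Returns None if the module cannot be imported, otherwise a dict with
    its file path, its __version__ (if it defines one), and whether the
    path lies under sys.prefix, i.e. inside the active environment.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    path = getattr(module, "__file__", None)
    prefix = os.path.realpath(sys.prefix)
    inside = path is not None and os.path.realpath(path).startswith(prefix + os.sep)
    return {
        "name": name,
        "version": getattr(module, "__version__", "unknown"),
        "path": path,
        "inside_env": inside,
    }


if __name__ == "__main__":
    # Packages named in this thread; adjust the list as needed.
    for pkg in ("networkx", "pomegranate", "neoloop"):
        info = module_origin(pkg)
        print(pkg, info if info else "not importable")
```

If `inside_env` is False for networkx while it is True for neoloop, the interpreter is picking up networkx from outside the environment, which matches the two paths in the traceback. How the public path came to take priority is not stated in the thread.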

jxulab commented 2 years ago

Thank you for pointing out the issue. I am working with our IT support to fix this problem and will let you know of any updates.

jxulab commented 2 years ago

The issue was fixed. Thanks for your help!