Tian-Dechao / diffDomain

DiffDomain is a statistically sound method for detecting differential TADs between conditions
MIT License

Error when running diffDomain with .cool or .hic #16

Open RomeroMatt opened 5 months ago

RomeroMatt commented 5 months ago

Hello,

Thanks for this new tool. I am attempting to run diffDomain using the command

    python diffdomain-py3/diffdomains.py dvsd multiple input/h9_merged_30_25kb_25000.cool input/smpc_merged_30_25kb_25000.cool input/h9_merged_30_25kb_normKR.bed --reso 25000 --ofile output/ --oprefix hPSC_vs_FetalSMPC --oprefixFig hPSC_vs_FetalSMPC --hicnorm KR

but keep getting this error:

    multiprocessing.pool.RemoteTraceback:
    """
    Traceback (most recent call last):
      File "/data/anaconda3/envs/diffdomain/lib/python3.7/multiprocessing/pool.py", line 121, in worker
        result = (True, func(*args, **kwds))
      File "diffdomain-py3/diffdomains.py", line 59, in comp2domins_by_twtest_parallel
        fhic0=opts[''], fhic1=opts[''],min_nbin=int(opts['--min_nbin']),f=opts['--f'])
      File "/data/diffDomain/diffdomain-py3/utils.py", line 379, in comp2domins_by_twtest
        Diffmatnorm = normDiffbyMeanSD(D=Diffmat)
      File "/data/diffDomain/diffdomain-py3/utils.py", line 260, in normDiffbyMeanSD
        b[k] = np.max(val1)
      File "<__array_function__ internals>", line 6, in amax
      File "/data/anaconda3/envs/diffdomain/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 2755, in amax
        keepdims=keepdims, initial=initial, where=where)
      File "/data/anaconda3/envs/diffdomain/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
        return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
    ValueError: zero-size array to reduction operation maximum which has no identity
    """

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "diffdomain-py3/diffdomains.py", line 76, in <module>
        result.append(i.get())
      File "/data/anaconda3/envs/diffdomain/lib/python3.7/multiprocessing/pool.py", line 657, in get
        raise self._value
    ValueError: zero-size array to reduction operation maximum which has no identity

I installed using conda. I have tried .mcool and .cool files as well as .hic files; none of them work. When using .hic files I receive this error:

    Traceback (most recent call last):
      File "diffdomain-py3/diffdomains.py", line 54, in <module>
        tadb = loadtads(opts['<bed>'], sep=opts['--sep'], chrnum=opts['--chrn'], min_nbin=int(opts['--min_nbin']), reso=int(opts['--reso']))
      File "/data/diffDomain/diffdomain-py3/utils.py", line 44, in loadtads
        tadb.iloc[:,1:3] = tadb.iloc[:,1:3].astype(int)
      File "/data/anaconda3/envs/diffdomain/lib/python3.7/site-packages/pandas/core/generic.py", line 5815, in astype
        new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
      File "/data/anaconda3/envs/diffdomain/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 418, in astype
        return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
      File "/data/anaconda3/envs/diffdomain/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 327, in apply
        applied = getattr(b, f)(**kwargs)
      File "/data/anaconda3/envs/diffdomain/lib/python3.7/site-packages/pandas/core/internals/blocks.py", line 591, in astype
        new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
      File "/data/anaconda3/envs/diffdomain/lib/python3.7/site-packages/pandas/core/dtypes/cast.py", line 1309, in astype_array_safe
        new_values = astype_array(values, dtype, copy=copy)
      File "/data/anaconda3/envs/diffdomain/lib/python3.7/site-packages/pandas/core/dtypes/cast.py", line 1257, in astype_array
        values = astype_nansafe(values, dtype, copy=copy)
      File "/data/anaconda3/envs/diffdomain/lib/python3.7/site-packages/pandas/core/dtypes/cast.py", line 1095, in astype_nansafe
        result = astype_nansafe(flat, dtype, copy=copy, skipna=skipna)
      File "/data/anaconda3/envs/diffdomain/lib/python3.7/site-packages/pandas/core/dtypes/cast.py", line 1174, in astype_nansafe
        return lib.astype_intsafe(arr, dtype)
      File "pandas/_libs/lib.pyx", line 679, in pandas._libs.lib.astype_intsafe
    TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

Not quite sure what I am doing wrong, but any help would be great! Thanks!
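
For anyone hitting the same `ValueError`: NumPy raises it whenever `np.max` is called on an array with no elements. The sketch below reproduces the message in isolation and shows one possible guard. `safe_max` and `reproduce_error` are hypothetical helpers written for illustration; they are not part of diffDomain, and nothing in this thread establishes why the array is empty inside `normDiffbyMeanSD`.

```python
import numpy as np


def safe_max(values, default=np.nan):
    """Return the maximum of `values`, or `default` if it has no elements.

    Hypothetical helper for illustration; not part of diffDomain.
    """
    arr = np.asarray(values)
    if arr.size == 0:
        return default
    return arr.max()


def reproduce_error():
    """Call np.max on an empty array and return the error message it raises."""
    try:
        np.max(np.array([]))
    except ValueError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce_error())      # the same message as in the traceback above
    print(safe_max([]))           # falls back to the default
    print(safe_max([1.0, 3.0]))   # normal case
```

Whether returning a default is appropriate depends on what the empty array means for the data; checking why the region has no usable values is likely the better first step.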

mingbao96 commented 5 months ago

Hi @RomeroMatt ,

I apologize for the inconvenience. It might be due to the environment yml file for conda not being updated in time. Could you please try installing the new dependencies using `pip install diffDomain-py3` and then attempt again? If you happen to have any more issues, please don't hesitate to contact us.

Best wishes.

RomeroMatt commented 5 months ago

Thanks for the info, @mingbao96 ! I used the pip command and everything seemed to install except I did receive this note:

    ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
    spyder 5.2.2 requires jellyfish>=0.7, which is not installed.
    spyder 5.2.2 requires pyqtwebengine<5.13, which is not installed.
    spyder 5.2.2 requires intervaltree>=3.0.2, but you have intervaltree 2.1.0 which is incompatible.
    spyder 5.2.2 requires pyqt5<5.13, but you have pyqt5 5.15.10 which is incompatible.

I then tried to run again, but received a similar error:

    multiprocessing.pool.RemoteTraceback:
    """
    Traceback (most recent call last):
      File "/data/anaconda3/lib/python3.9/multiprocessing/pool.py", line 125, in worker
        result = (True, func(*args, **kwds))
      File "/data/diffDomain/diffdomain-py3/diffdomains.py", line 57, in comp2domins_by_twtest_parallel
        tmp_res = comp2domins_by_twtest(chrn=tadb.iloc[i, 0], start=tadb.iloc[i, 1],
      File "/data/diffDomain/diffdomain-py3/utils.py", line 379, in comp2domins_by_twtest
        Diffmatnorm = normDiffbyMeanSD(D=Diffmat)
      File "/data/diffDomain/diffdomain-py3/utils.py", line 260, in normDiffbyMeanSD
        b[k] = np.max(val1)
      File "<__array_function__ internals>", line 5, in amax
      File "/data/anaconda3/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 2754, in amax
        return _wrapreduction(a, np.maximum, 'max', axis, None, out,
      File "/data/anaconda3/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
        return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
    ValueError: zero-size array to reduction operation maximum which has no identity
    """

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/data/diffDomain/diffdomain-py3/diffdomains.py", line 76, in <module>
        result.append(i.get())
      File "/data/anaconda3/lib/python3.9/multiprocessing/pool.py", line 771, in get
        raise self._value
    ValueError: zero-size array to reduction operation maximum which has no identity

I also didn't remove the conda environment so I'm not sure if that would cause an issue? After updating via pip, I ran both in and outside the conda environment. Not sure where I'm going wrong.

mingbao96 commented 5 months ago

No worries @RomeroMatt , You might want to try creating a new environment, then install Python 3 and use pip to install diffdomain. This approach may help resolve conflicts or issues you're experiencing. Please let me know how it goes or if you need further assistance!

RomeroMatt commented 5 months ago

Thank you for the suggestion. I removed my environment, created a new Python environment, and tried to install via `pip install diffDomain-py3`. However, I received the error message below:

    error: subprocess-exited-with-error

    × python setup.py egg_info did not run successfully.
    │ exit code: 1
    ╰─> [3 lines of output]
        error in HiCMatrix setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers; .* suffix can only be used with == or != operators
            numpy >= 1.16.*

        [end of output]

    note: This error originates from a subprocess, and is likely not a problem with pip.
    error: metadata-generation-failed

    × Encountered error while generating package metadata.
    ╰─> See above for output.

    note: This is an issue with the package mentioned above, not pip.
    hint: See above for details.

I then tried to install numpy independently and retry the pip command, but I kept receiving the same error. 
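
The message quoted above reflects a rule for Python version specifiers: a `.*` wildcard suffix is only valid with the `==` and `!=` operators, so a requirement that combines `>=` with a `.*` version is rejected by recent packaging tools regardless of which numpy is installed. The sketch below is a simplified stand-in to illustrate that rule, not the validator that pip or setuptools actually use; `invalid_wildcards` is a hypothetical name.

```python
import re

# Matches one version clause: a comparison operator followed by a version.
_CLAUSE = re.compile(r"(===|==|!=|~=|<=|>=|<|>)\s*([^\s,;]+)")


def invalid_wildcards(requirement):
    """Return clauses that use a '.*' suffix with an operator other than == or !=.

    Simplified illustration only; real tools apply the full specifier grammar.
    """
    return [
        f"{op} {version}"
        for op, version in _CLAUSE.findall(requirement)
        if version.endswith(".*") and op not in ("==", "!=")
    ]


if __name__ == "__main__":
    print(invalid_wildcards("numpy >= 1.16.*"))   # flagged
    print(invalid_wildcards("numpy == 1.16.*"))   # allowed, nothing flagged
```

Because the offending requirement lives in the HiCMatrix package metadata, installing numpy separately would not be expected to change the outcome.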

I apologize for the issues!

mingbao96 commented 4 months ago

Hello, @RomeroMatt

It looks like you're encountering some issues with installing the HiCMatrix package. I suggest trying to install HiCMatrix using Conda by running the following command: `conda install -c bioconda hicmatrix`.

After successfully installing HiCMatrix, you can then proceed with the installation of DiffDomain. Please let me know if this resolves your issue or if you encounter any further problems.

No worries!

RomeroMatt commented 4 months ago

Thank you for the help. I ended up removing the DiffDomain package and the environment, then re-installing both via conda. This time I removed the specifications for both DiffDomain and numpy in the environment_linux.yml file, and this seemed to work. However, I am only able to run it using .hic files, not .mcool or .cool files. I am able to use TAD lists from either Juicer or HiCExplorer, after adding a header to the HiCExplorer TAD list.
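
Since TAD lists from different callers differ in layout (for example, with or without a header line), a quick sanity check of the file before running the comparison may help. The sketch below is an illustration under stated assumptions, not diffDomain's own loader: it assumes a tab-separated file whose first three columns are chromosome, start, and end, and `find_bad_tad_rows` is a hypothetical name. It reports rows whose start or end cannot be parsed as an integer.

```python
import csv


def find_bad_tad_rows(path, sep="\t", has_header=False):
    """Return (line_number, row) pairs whose start/end are not integers.

    Assumes columns 1-3 are chromosome, start, end. This is an assumption
    made for the sketch, not a statement about diffDomain's expected format.
    """
    bad = []
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=sep)
        for lineno, row in enumerate(reader, start=1):
            if has_header and lineno == 1:
                continue  # skip the header line
            if not row:
                continue  # ignore blank lines
            try:
                int(row[1])
                int(row[2])
            except (IndexError, ValueError):
                # Too few columns, or a coordinate that is not an integer
                bad.append((lineno, row))
    return bad
```

For example, `find_bad_tad_rows("input/h9_merged_30_25kb_normKR.bed")` would list any problem rows in the TAD file from the original command; an empty list means every row has integer coordinates in columns 2 and 3.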

RomeroMatt commented 4 months ago

A separate question - is there a way to overlay TAD data using the visualization option in DiffDomain? Similar to what you have in figure G) on the home page? Thanks!

Tian-Dechao commented 4 months ago

The current version of the visualization option does not support this feature. One option is to feed the reorganized TAD list and .hic files to Juicebox to overlay the TADs on the heatmaps. Here is the link to the Juicebox tutorial