aertslab / pycisTopic

pycisTopic is a Python module to simultaneously identify cell states and cis-regulatory topics from single cell epigenomics data.

The data appears to lie in a lower-dimensional subspace of the space #185

Open Melody-cell opened 2 weeks ago

Melody-cell commented 2 weeks ago

Hi, when I run this QC command:

!/home/wangmengj/software/anaconda3/envs/scenicplus/bin/pycistopic qc \
    --fragments /home/wangmengj/workspace/project04_scenicplus/a02_data_from_article/data/fragments.tsv.gz \
    --regions /home/wangmengj/workspace/project04_scenicplus/a02_data_from_article/outs/consensus_peak_calling/consensus_regions.bed \
    --tss /home/wangmengj/workspace/project04_scenicplus/a02_data_from_article/outs/qc/tss.bed \
    --output /home/wangmengj/workspace/project04_scenicplus/a02_data_from_article/outs/qc/10hpf_1

It always fails with the following error:

Traceback (most recent call last):
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/site-packages/scipy/stats/_kde.py", line 226, in __init__
    self.set_bandwidth(bw_method=bw_method)
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/site-packages/scipy/stats/_kde.py", line 574, in set_bandwidth
    self._compute_covariance()
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/site-packages/scipy/stats/_kde.py", line 586, in _compute_covariance
    self._data_cho_cov = linalg.cholesky(self._data_covariance,
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/site-packages/scipy/linalg/_decomp_cholesky.py", line 88, in cholesky
    c, lower = _cholesky(a, lower=lower, overwrite_a=overwrite_a, clean=True,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/site-packages/scipy/linalg/_decomp_cholesky.py", line 36, in _cholesky
    raise LinAlgError("%d-th leading minor of the array is not positive "
numpy.linalg.LinAlgError: 2-th leading minor of the array is not positive definite

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/bin/pycistopic", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cli/pycistopic.py", line 26, in main
    args.func(args)
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cli/subcommand/qc.py", line 233, in run_qc
    qc(
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cli/subcommand/qc.py", line 144, in qc
    ) = compute_qc_stats(
        ^^^^^^^^^^^^^^^^^
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/qc.py", line 622, in compute_qc_stats
    pdf_values_for_duplication_ratio = compute_kde(
                                       ^^^^^^^^^^^^
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/qc.py", line 217, in compute_kde
    for pdf_result in executor.map(compute_kde_part, test_data_unique_split_arrays):
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/concurrent/futures/_base.py", line 619, in result_iterator
    yield _result_or_cancel(fs.pop())
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/concurrent/futures/_base.py", line 317, in _result_or_cancel
    return fut.result(timeout)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/qc.py", line 210, in compute_kde_part
    return gaussian_kde(training_data)(test_data_unique_split_array)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wangmengj/software/anaconda3/envs/scenicplus/lib/python3.11/site-packages/scipy/stats/_kde.py", line 235, in __init__
    raise linalg.LinAlgError(msg) from e
numpy.linalg.LinAlgError: The data appears to lie in a lower-dimensional subspace of the space in which it is expressed. This has resulted in a singular data covariance matrix, which cannot be treated using the algorithms implemented in `gaussian_kde`. Consider performing principle component analysis / dimensionality reduction and using `gaussian_kde` with the transformed data.

Does anyone know how I can solve this?

ghuls commented 2 weeks ago

Use the polars_1xx branch or apply the following commit to your version: https://github.com/aertslab/pycisTopic/commit/7d9a9ccd19a87925ffee0d33139411e77d462e5a

It normally seems to happen when you have a fragments file without counts, or when all the counts are the same (e.g. 1).
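
A quick way to check whether a fragments file falls into this case is sketched below. It assumes a tab-separated fragments file with the count in the fifth column (chrom, start, end, barcode, count); the function name `fragment_counts_are_degenerate` and the `max_lines` limit are hypothetical helpers for illustration, not part of pycisTopic.

```python
import gzip


def fragment_counts_are_degenerate(path, max_lines=1_000_000):
    """Return True if the count column is missing or holds only one distinct value.

    Assumes a tab-separated fragments file with the count in column 5.
    Only the first `max_lines` lines are inspected.
    """
    seen = set()
    # Fragments files are usually gzip/bgzip compressed; fall back to plain text.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i >= max_lines:
                break
            # Skip comment/header lines and empty lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 5:
                return True  # no count column at all
            seen.add(fields[4])
            if len(seen) > 1:
                return False  # at least two different count values
    return True
```

If this returns `True` for your fragments file, the file matches the situation described above.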

You can apply the patch above, or change one (or more) of the fragments in your fragments file to a different count value (e.g. 2 if all the others are 1).
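
The effect can be reproduced outside pycisTopic with a minimal sketch. It assumes the failing KDE is fitted on two values per barcode, one of which (here a stand-in for the duplication ratio) is identical for every barcode. The array names, sizes and the helper `kde_fit_fails` are made up for illustration and are not pycisTopic's actual code.

```python
import numpy as np
from scipy.stats import gaussian_kde


def kde_fit_fails(data):
    """Return True if gaussian_kde cannot be fitted because the covariance is singular."""
    try:
        gaussian_kde(data)
    except np.linalg.LinAlgError:
        return True
    return False


rng = np.random.default_rng(0)
n_barcodes = 200

# First dimension: some per-barcode value that varies (illustrative random data).
log_unique_fragments = rng.normal(loc=3.5, scale=0.5, size=n_barcodes)

# Second dimension: identical for every barcode, as happens when all counts are 1.
constant_ratio = np.zeros(n_barcodes)
degenerate = np.vstack([log_unique_fragments, constant_ratio])

# Same data, but one barcode gets a different value (the manual workaround).
perturbed_ratio = constant_ratio.copy()
perturbed_ratio[0] = 0.5
perturbed = np.vstack([log_unique_fragments, perturbed_ratio])

print("constant second dimension fails:", kde_fit_fails(degenerate))
print("one value changed fails:", kde_fit_fails(perturbed))
```

A dimension with zero variance makes the data covariance matrix singular, which is what `gaussian_kde` reports in the traceback. Changing a single value gives that dimension non-zero variance, which is why editing one count is enough as a workaround.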

Melody-cell commented 1 week ago

Thank you so much for your patient reply. I truly appreciate the time and effort you took to help solve my problems. The problem has been solved now using your advice.

Melody-cell commented 1 week ago

Then I will close this issue.

Melody-cell commented 2 days ago

@ghuls @SeppeDeWinter Sorry, could you please tell me how to use the polars_1xx branch or apply commit [7d9a9cc] to my version? I switched to polars_1xx like this, but it didn't work: (image)

Melody-cell commented 2 days ago

And then this appears: (image)