Erikado4 opened this issue 3 days ago
Hi @Erikado4, I'm going to need a bit more information to be able to help you.
Could you fill in the bug template below?
- **Describe the bug**: A clear and concise description of what the bug is. NB: for questions about pydeseq2 that are not related to a bug, please open a topic on the scverse ecosystem Discourse forum.
- **To Reproduce**: Provide snippets of code and steps on how to reproduce the behavior. Please also specify the version you are using.
- **Expected behavior**: A clear and concise description of what you expected to happen.
- **Screenshots**: If applicable, add screenshots to help explain your problem.
- **Desktop** (please complete the following information):
- **Additional context**: Add any other context about the problem here.
I used the same code as the tutorial ["Loading data and saving results with pandas and pickle"](https://pydeseq2.readthedocs.io/en/latest/auto_examples/plot_pandas_io_example.html#sphx-glr-auto-examples-plot-pandas-io-example-py), except that `design="~condition"` is no longer a recognized parameter, so I changed it to `design_factors="condition"`. Then the final `dds.deseq2()` call gets stuck indefinitely:
```python
import os
import pickle as pkl

import pandas as pd

from pydeseq2.dds import DeseqDataSet
from pydeseq2.default_inference import DefaultInference
from pydeseq2.ds import DeseqStats

DATA_PATH = "https://raw.githubusercontent.com/owkin/PyDESeq2/main/datasets/synthetic/"

counts_df = pd.read_csv(os.path.join(DATA_PATH, "test_counts.csv"), index_col=0)
print(counts_df)

# Transpose: pydeseq2 expects samples as rows and genes as columns.
counts_df = counts_df.T
print(counts_df)

metadata = pd.read_csv(os.path.join(DATA_PATH, "test_metadata.csv"), index_col=0)
print(metadata)

# Remove genes with fewer than 10 total read counts.
genes_to_keep = counts_df.columns[counts_df.sum(axis=0) >= 10]
counts_df = counts_df[genes_to_keep]

inference = DefaultInference(n_cpus=8)
dds = DeseqDataSet(
    counts=counts_df,
    metadata=metadata,
    design_factors="condition",
    refit_cooks=True,
    inference=inference,
)
dds.deseq2()  # hangs here indefinitely
```
- pydeseq2: 0.4.12
- Model: Precision 7960 Tower
- OS: Arch Linux x86_64
- Kernel: 6.11.5-arch1-1
- Shell: zsh
Update: I tried running each step individually from the step-by-step tutorial (again with the synthetic data), and it seems to be getting stuck on `dds.fit_genewise_dispersions()`.
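For reference, this is roughly how I ran the steps (a sketch following the step-by-step tutorial; method names as in the 0.4.x API):

```python
# Running the pipeline one step at a time to locate the hang.
dds.fit_size_factors()          # completes
dds.fit_genewise_dispersions()  # stuck here indefinitely
```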
Hi @Erikado4,
Thanks for providing the details.
I'm a bit lost here, because I can't reproduce the issue on my machine (macOS). I'm assuming this bug has something to do with the fact that you're using Arch Linux.

Given where the code is stuck, I think the problem is either the joblib parallelization or `fit_lin_mu`, which calls the linear regression from scikit-learn (in which case the problem could come from low-level linear algebra libraries like BLAS, but this is a wild guess).
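If you want to test the BLAS hypothesis, one quick experiment (a rough sketch, not a guaranteed fix) is to cap the native thread pools before anything is imported, since oversubscribed BLAS threads combined with joblib workers can deadlock on some Linux setups:

```python
import os

# Limit low-level linear algebra libraries to a single thread each.
# These must be set before numpy/scipy/scikit-learn are first imported.
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"

import pandas as pd  # noqa: E402
from pydeseq2.dds import DeseqDataSet  # noqa: E402
```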
Independently of that, could you try the code below (the same script, but with a different joblib backend) and tell me if anything different happens?
```python
import os
import pickle as pkl

import pandas as pd

from pydeseq2.dds import DeseqDataSet
from pydeseq2.default_inference import DefaultInference
from pydeseq2.ds import DeseqStats

DATA_PATH = "https://raw.githubusercontent.com/owkin/PyDESeq2/main/datasets/synthetic/"

counts_df = pd.read_csv(os.path.join(DATA_PATH, "test_counts.csv"), index_col=0)
print(counts_df)

counts_df = counts_df.T
print(counts_df)

metadata = pd.read_csv(os.path.join(DATA_PATH, "test_metadata.csv"), index_col=0)
print(metadata)

genes_to_keep = counts_df.columns[counts_df.sum(axis=0) >= 10]
counts_df = counts_df[genes_to_keep]

# Same script as above, but with joblib's threading backend
# instead of the default process-based one (loky).
inference = DefaultInference(n_cpus=8, backend="threading")
dds = DeseqDataSet(
    counts=counts_df,
    metadata=metadata,
    design_factors="condition",
    refit_cooks=True,
    inference=inference,
)
dds.deseq2()
```
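If that also hangs, it would be useful to rule out parallelization entirely by running single-process (sketch; only the inference line changes):

```python
# No joblib workers at all: everything runs in the main process.
inference = DefaultInference(n_cpus=1)
```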
It's the same thing with a different joblib backend (`inference = DefaultInference(n_cpus=8, backend="threading")`).
Thanks for trying to help out! I get this error now
Every time I create a DeseqDataSet from my own data (as counts/metadata dataframes, or straight from my AnnData object) or from the test data in the Getting Started docs and then run `dds.deseq2()`, the RAM shoots up and the kernel crashes.

The parameters for DeseqDataSet in the tutorials and in the current version do not match; see the sketch below.
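For anyone else hitting the mismatch, here is a rough sketch of the two call styles as I understand them from the docs and from this thread (not authoritative):

```python
# pydeseq2 0.4.x (e.g. the 0.4.12 used above) takes design_factors:
dds = DeseqDataSet(
    counts=counts_df,
    metadata=metadata,
    design_factors="condition",
)

# The current tutorials use a formula-style design argument instead:
dds = DeseqDataSet(
    counts=counts_df,
    metadata=metadata,
    design="~condition",
)
```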
Really would've liked to use this tool :(