Closed victorsanchezarevalo closed 1 month ago
Hi @victorsanchezarevalo,
decoupler follows the observations x features convention (commonly used in Python), rather than the features x observations convention (more typical in R). You can transpose your matrix to match this format, and it should work. Feel free to reach out if you have any further questions!
Hi,
I’m encountering an issue when running ULM analysis with decoupler. I have already transposed my expression matrix as recommended (observations x features), but I am still getting the following error related to the min_n=2 parameter:
Code used:
# Preparing matrix and transposing
mat = results_df[['stat']].T.rename(index={'stat': 'treatment.vs.control'}).T
# Transposing matrix so genes are in rows
print(f"New mat shape: {mat.shape}")
# Assigning gene names from subset_adata.var['gene_name']
mat.index = subset_adata.var['gene_name'].values
# Checking the first 10 gene names
print(mat.index[:10])
# Retrieving CollecTRI gene regulatory network
settings.setup(curl_timeout=1200)
os.system('rm -rf ~/.cache/omnipathdb/*')
os.system('rm -rf ~/.cache/pypath/*')
collectri = dc.get_collectri(organism='mouse', split_complexes=False)
# Running ULM analysis with min_n=2
tf_acts, tf_pvals = dc.run_ulm(mat=mat, net=collectri, verbose=True, min_n=2)
# Checking results
print(tf_acts.head())
print(tf_pvals.head())
Error message:
ValueError: No sources with more than min_n=2 targets. Make sure mat and net have shared target features or reduce the number assigned to min_n
I have verified that the matrix has been transposed and gene names have been correctly assigned, but the issue persists. It seems that no transcription factors have more than 2 shared targets, even though min_n=2 is a reasonable threshold in this context.
Any help or suggestions would be appreciated!
Thank you!
Could you show me the head of your input mat, i.e. the output of mat.head()?
mat.head()
Out[20]:
treatment.vs.control
Sox17 -0.252628
Gm15452 0.320179
Gm26983 0.747064
Gm6187 -0.106062
Gm6119 -0.746242
Hi @victorsanchezarevalo,
As your console output shows, you have one observation (one contrast) and multiple genes (n). So your matrix is in the wrong format, features x observations (n, 1), rather than the correct observations x features (1, n). Transpose it again and it should be fine.
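For reference, here is a minimal sketch of building a matrix with shape (1, n) from a one-column table of statistics. The toy results_df is a hypothetical stand-in for the one in the thread (only three of the genes shown above are included), so treat it as an illustration rather than the exact fix for the original data.

```python
import pandas as pd

# Hypothetical stand-in for results_df: one row per gene, one 'stat' column
results_df = pd.DataFrame(
    {"stat": [-0.252628, 0.320179, 0.747064]},
    index=["Sox17", "Gm15452", "Gm26983"],
)

# Keep 'stat' as a one-column frame, transpose it once so that genes become
# columns, and name the single row after the contrast
mat = results_df[["stat"]].T.rename(index={"stat": "treatment.vs.control"})

# One observation (the contrast) by n features (the genes)
print(mat.shape)
print(mat)
```

A matrix in this orientation can then be passed to the same call used earlier in the thread, dc.run_ulm(mat=mat, net=collectri, verbose=True, min_n=2).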
Thanks! Now works perfectly!
Describe the bug
I am encountering an error when using Decoupler's run_ulm function. The error states that no sources have more than min_n=2 targets, even though I have reduced the min_n parameter, and my dataset should have sufficient shared targets between the matrix (mat) and the network (collectri).
Error:
To Reproduce
Steps to reproduce the behavior:
1. Install decoupler in a clean Python environment.
2. Run the run_ulm function with the following setup:
   - Matrix (mat): Gene expression matrix with n_genes x n_samples.
   - Network (collectri): A list of transcription factors and their target genes.
3. Set min_n=2 to ensure that there are at least 2 targets per transcription factor.
4. filt_min_n fails to find sufficient shared targets between the matrix and network.
If needed, I can provide a subset of the data that triggers the error for testing purposes.
Expected behavior
I expected the run_ulm function to return transcription factor activities and p-values when using the provided gene expression matrix (mat) and regulatory network (collectri), as there should be sufficient shared target genes between the two.
System
Additional context
I have verified that my matrix contains valid gene names and is compatible with the regulatory network. Despite lowering min_n to 1, the issue persists. This error seems to indicate a mismatch between the features in the matrix and the network, but these have been checked for consistency.
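One way to check the overlap directly is to count, for each source, how many of its targets appear among the matrix columns. The sketch below uses toy data and assumes the network is a pandas DataFrame with source and target columns; the TF names and the min_n comparison ("at least min_n targets") are illustrative assumptions, not a copy of decoupler's internal filter.

```python
import pandas as pd

# Toy matrix in observations x features orientation (1 contrast, 3 genes)
mat = pd.DataFrame(
    [[-0.25, 0.32, 0.75]],
    index=["treatment.vs.control"],
    columns=["Sox17", "Gm15452", "Gm26983"],
)

# Hypothetical network; 'source' and 'target' column names are assumed
net = pd.DataFrame(
    {
        "source": ["TfA", "TfA", "TfB"],
        "target": ["Sox17", "Gm15452", "GeneNotInMat"],
        "weight": [1, -1, 1],
    }
)

# Keep only edges whose target is a feature (column) of the matrix
shared = net[net["target"].isin(mat.columns)]

# Number of shared targets per source
targets_per_source = shared.groupby("source")["target"].nunique()

# Sources that would survive a threshold of at least min_n shared targets
min_n = 2
kept = targets_per_source[targets_per_source >= min_n]

print(targets_per_source)
print(list(kept.index))
```

If the matrix is in the features x observations orientation, mat.columns holds contrast names instead of gene names, so shared is empty and no source passes the threshold regardless of min_n.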