LouisFaure / scFates

a scalable python suite for tree inference and advanced pseudotime analysis from scRNAseq data.
https://scfates.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Testing fit: KeyError: 'de_p' #11

Closed wbrett87 closed 7 months ago

wbrett87 commented 1 year ago

Hello,

First, thanks for making such a good library. I'm at the point with my data where I am testing the fit, but I keep getting the following error. Any ideas?


KeyError                                  Traceback (most recent call last)
Cell In[75], line 1
----> 1 scf.tl.test_fork(rna,root_milestone="Basal Cell",milestones=["Secretory Cell","Ciliated","Scgb3a1 Positive Secretory"],n_jobs=20,rescale=True)

File /opt/miniconda3/envs/scFates/lib/python3.8/site-packages/scFates/tools/bifurcation_tools.py:190, in test_fork(adata, root_milestone, milestones, features, rescale, layer, n_jobs, n_map, copy)
    188 stat = pd.concat(stat, axis=1).T
    189 stat.index = genes
--> 190 stat.columns = [dct_rev[c] for c in stat.columns[: len(milestones)]] + ["de_p"]
    192 fork_stat = fork_stat + [stat]
    194 topleave = fork_stat[m].iloc[:, :-1].idxmax(axis=1).apply(lambda mil: dct[mil])

File /opt/miniconda3/envs/scFates/lib/python3.8/site-packages/scFates/tools/bifurcation_tools.py:190, in <listcomp>(.0)
    188 stat = pd.concat(stat, axis=1).T
    189 stat.index = genes
--> 190 stat.columns = [dct_rev[c] for c in stat.columns[: len(milestones)]] + ["de_p"]
    192 fork_stat = fork_stat + [stat]
    194 topleave = fork_stat[m].iloc[:, :-1].idxmax(axis=1).apply(lambda mil: dct[mil])

KeyError: 'de_p'

LouisFaure commented 1 year ago

Hi, unfortunately I am not able to reproduce that error. The stat object should contain columns with amplitudes and p-values from the GAM fit test. The p-values are saved as 'de_p' in a specialised function (its result, res, is later stored as stat) here: https://github.com/LouisFaure/scFates/blob/49f69a757dc32e38a9c1e3e5bd0c7edcd47b3e1e/scFates/tools/bifurcation_tools.py#L329

Somehow that column is not created, which could be because the GAM fit did not occur.

Do you also see that error when reproducing the tutorial example?

Could you check whether mgcv is working by running this:

from rpy2.robjects.packages import importr
rmgcv = importr("mgcv")

wbrett87 commented 1 year ago

I do not get the error when running the tutorial example, so it must be something about my data. I loaded my 10x Multiome data into a mudata object and then assigned the RNA data to a separate variable called "rna". I then preprocessed the data following the basic scanpy tutorial. Maybe it has something to do with the mudata?

I'll check mgcv later tonight
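
For reference, the mgcv check suggested earlier in the thread can be wrapped so that a failure is reported instead of raising. This is only a minimal sketch: the helper name `mgcv_available` is hypothetical, and catching every exception is an assumption made to cover a missing rpy2, a missing R installation, or a missing mgcv package alike.

```python
def mgcv_available():
    """Return True if rpy2 can load the R package mgcv, False otherwise."""
    try:
        # Same two steps as the original check, inside a try block.
        from rpy2.robjects.packages import importr
        importr("mgcv")
    except Exception as err:  # rpy2 missing, R missing, or mgcv not installed
        print(f"mgcv check failed: {type(err).__name__}: {err}")
        return False
    return True


ok = mgcv_available()
print("mgcv available:", ok)
```

If this prints False, the printed exception type indicates which layer (rpy2, R, or mgcv) needs attention before the GAM fit can run.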

LouisFaure commented 1 year ago

I have never worked with the mudata object, so out of curiosity I ran a quick test using the muon tutorial, and I managed to run everything up to scf.tl.test_fork without hitting the error. That makes sense, since the extracted rna object is a regular anndata object.

Here is the code I used: https://gist.github.com/LouisFaure/bfaedbf7a99ff988bcd3193d748b4d5c (I am not sure if intermediate steps are reproducible though)

willey2020 commented 1 year ago

Hi @wbrett87 @LouisFaure, I have run into the exact same KeyError: 'de_p'.

I tried to find out what happened. When I test only 1 root + 2 tips, where the tips are direct neighbors of the root (bif > A + B), the function works. However, if I test 1 root + 3 tips (for example, adding a later tip on the same trajectory), it raises that error.

Could that explain the error? Thanks!
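
One possible reading of the traceback, sketched with plain pandas: the KeyError is raised inside the list comprehension, which would happen if stat came back with fewer amplitude columns than the number of requested milestones, so that the slice `stat.columns[: len(milestones)]` reaches the 'de_p' column. This is an assumption, not something confirmed in the thread; the contents of `dct_rev`, the column names, and the values below are all hypothetical.

```python
import pandas as pd

# Three milestones are requested ...
milestones = ["A", "B", "C"]
# ... and a hypothetical reverse mapping from internal labels to milestone names.
dct_rev = {"0": "A", "1": "B", "2": "C"}

# Hypothetical result with only two amplitude columns plus the p-value column.
stat = pd.DataFrame(
    [[0.1, 0.2, 0.01]],
    columns=["0", "1", "de_p"],
    index=["gene1"],
)

missing = None
try:
    # Same shape as the failing line: the first three columns now include "de_p".
    stat.columns = [dct_rev[c] for c in stat.columns[: len(milestones)]] + ["de_p"]
except KeyError as err:
    missing = err.args[0]

print("KeyError raised for:", missing)
```

With two milestones and two amplitude columns the same line runs cleanly, which would be consistent with the 1 root + 2 tips case working.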

LouisFaure commented 1 year ago

Hi @wbrett87, my guess would be a large difference in pseudotime, or non-overlapping pseudotime, between the branches, but I can only check that if I can get an anonymised version of the anndata object to reproduce the issue.
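
A rough way to check for non-overlapping pseudotime before sharing data is to compare the per-branch pseudotime ranges. The sketch below works on a plain DataFrame; the column names `branch` and `t`, the helper name `pseudotime_overlap`, and the toy values are assumptions for illustration, so substitute whichever columns of the object's obs table hold the branch assignment and the pseudotime.

```python
import pandas as pd


def pseudotime_overlap(obs, branch_key="branch", time_key="t"):
    """Return per-branch (min, max) pseudotime and the width of the
    interval shared by all branches (0.0 if they do not overlap)."""
    ranges = obs.groupby(branch_key)[time_key].agg(["min", "max"])
    lo = ranges["min"].max()  # latest start among branches
    hi = ranges["max"].min()  # earliest end among branches
    return ranges, max(0.0, hi - lo)


# Toy data: branch C starts after A and B have already ended.
obs = pd.DataFrame(
    {
        "branch": ["A", "A", "B", "B", "C", "C"],
        "t": [1.0, 3.0, 1.5, 4.0, 6.0, 9.0],
    }
)

ranges, shared = pseudotime_overlap(obs)
_, shared_ab = pseudotime_overlap(obs[obs["branch"] != "C"])
print(ranges)
print("shared window, all branches:", shared)
print("shared window, A and B only:", shared_ab)
```

A shared window of zero across the tested branches would point in the direction of the guess above, though it does not by itself prove that this is what triggers the error.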