lutrarutra / deconv

Bulk RNA-seq Cell Type Proportion Deconvolution

TypeError: rank_marker_genes() got an unexpected keyword argument 'copy' and KeyError: 'mu_expression_labels' #4

Open WanderingHedgie opened 1 month ago

WanderingHedgie commented 1 month ago

Hi !

I want to report that the check_fit() function doesn't work. Here is what I got when I called it:

TypeError                                 Traceback (most recent call last)
Cell In[11], [line 1]
----> [1] decon.check_fit()

File ~/Softwares/deconv/DeconV/DeconV.py:143, in DeconV.check_fit(self, path)
    [141] def check_fit(self, path=None):
    [142]   f, ax = plt.subplots(self.n_labels, self.n_labels, figsize=(20, 20), dpi=100)
--> [143]   res = tl.rank_marker_genes(self.adata, self.label_key, copy=True)
    [144]   for i in range(self.n_labels):
    [145]       for j in range(self.n_labels):

TypeError: rank_marker_genes() got an unexpected keyword argument 'copy'

I checked, and this function has no copy argument. copy is only used where scanpy's rank_genes_groups() is called in DeconV's tools.py:

def rank_marker_genes(adata, groupby, method="t-test", eps=None):
    rank_res = sc.tl.rank_genes_groups(
        adata, groupby=groupby, method=method, copy=True
    ).uns["rank_genes_groups"]

    adata.uns[f"rank_genes_{groupby}"] = {}

    for i, ref in enumerate(adata.obs[groupby].unique()):
        adata.uns["rank_genes_" + groupby][ref] = _rank_group(
            adata, rank_res, groupby, i, ref, eps
        )
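
For anyone who wants to check this kind of mismatch without reading the source, here is a minimal sketch using the standard-library inspect module. The helper name accepts_keyword is hypothetical, and the rank_marker_genes below is only a stand-in with the same signature as the one quoted above, not DeconV's real function.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        # Positional-only parameters cannot be passed by keyword.
        return params[name].kind is not inspect.Parameter.POSITIONAL_ONLY
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Stand-in with the same signature as the function quoted above (assumption:
# the body is irrelevant here, only the parameter list matters).
def rank_marker_genes(adata, groupby, method="t-test", eps=None):
    return None


print(accepts_keyword(rank_marker_genes, "copy"))    # False
print(accepts_keyword(rank_marker_genes, "method"))  # True
```

Running the same check against the installed tl.rank_marker_genes should show whether the local copy of DeconV matches what check_fit() expects.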

Unfortunately, removing the copy argument from the rank_marker_genes() call didn't fix the problem:

Warning: some p-values (2) were 0, scattering them around 232.3
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[12], line 1
----> 1 decon.check_fit()

File ~/Softwares/deconv/DeconV/DeconV.py:143, in DeconV.check_fit(self, path)
    141 def check_fit(self, path=None):
    142     f, ax = plt.subplots(self.n_labels, self.n_labels, figsize=(20, 20), dpi=100)
--> 143     res = tl.rank_marker_genes(self.adata, self.label_key)  # , copy=True
    144     for i in range(self.n_labels):
    145         for j in range(self.n_labels):

File ~/Softwares/deconv/DeconV/tools.py:143, in rank_marker_genes(adata, groupby, method, eps)
    140 adata.uns[f"rank_genes_{groupby}"] = {}
    142 for i, ref in enumerate(adata.obs[groupby].unique()):
--> 143     adata.uns["rank_genes_" + groupby][ref] = _rank_group(
    144         adata, rank_res, groupby, i, ref, eps
    145     )

File ~/Softwares/deconv/DeconV/tools.py:58, in _rank_group(adata, rank_res, groupby, idx, ref_name, logeps)
     54 group_idx = adata.obs[groupby].astype("str").astype("category").cat.categories.tolist().index(ref_name)
     56 min_logfc, max_logfc = np.quantile(df["logFC"], [0.05, 0.95])
---> 58 df["mu_expression"] = adata.varm[f"mu_expression_{groupby}"][:, group_idx]
     59 df["log_mu_expression"] = adata.varm[f"log_mu_expression_{groupby}"][:, group_idx]
     60 assert df["log_mu_expression"].isna().any() == False

File ~/Environments/CONDAS/miniforge3/envs/deconv/lib/python3.10/site-packages/anndata/_core/aligned_mapping.py:148, in AlignedActualMixin.__getitem__(self, key)
    147 def __getitem__(self, key: str) -> V:
--> 148     return self._data[key]

KeyError: 'mu_expression_labels'
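
A quick way to see which of the expected entries are absent before calling check_fit() is sketched below. The two key patterns come from the traceback above. The helper name missing_varm_keys is hypothetical, and a plain dict stands in for adata.varm, on the assumption that adata.varm supports the in operator like any mapping. Which step is supposed to create these entries is not something I could confirm.

```python
def missing_varm_keys(varm, groupby):
    """List the varm keys that _rank_group reads (per the traceback) but that are absent."""
    expected = [
        f"mu_expression_{groupby}",
        f"log_mu_expression_{groupby}",
    ]
    return [key for key in expected if key not in varm]


# A plain dict stands in for adata.varm in this sketch.
varm = {"log_mu_expression_labels": None}
print(missing_varm_keys(varm, "labels"))  # ['mu_expression_labels']
```

With a real object, the call would be missing_varm_keys(adata.varm, "labels").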

Can you help me, please?

lutrarutra commented 1 month ago

Hey, thanks for reaching out. Could you please check whether it works when you remove lines 58-61 from tools.py? Also remove copy=True from the tl.rank_marker_genes call.

WanderingHedgie commented 1 month ago

Hi !

I tested as you suggested, with lines 58-61 removed and, as before, with copy=True removed from the tl.rank_marker_genes call in the check_fit function. Here is what I got:

Warning: some p-values (2) were 0, scattering them around 232.3
Warning: some p-values (12) were 0, scattering them around 320.5
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[21], line 1
----> 1 decon.check_fit()

File ~/Softwares/deconv/DeconV/DeconV.py:146, in DeconV.check_fit(self, path)
    144 for i in range(self.n_labels):
    145     for j in range(self.n_labels):
--> 146         gene = res[f"{self.labels[i]} vs. rest"].sort_values("gene_score", ascending=False).index[0]
    147         ax[i,0].set_ylabel(gene)
    148         ax[self.n_labels-1,j].set_xlabel(self.labels[j])

TypeError: 'NoneType' object is not subscriptable

WanderingHedgie commented 1 month ago

I should clarify that I used the provided pbmc3k example data and the counts layer for reference fitting. I had the same issue with my own data, but testing is faster with the example.

Maybe there are too many zeros in the reference.

I tested with these parameters:

decon = dv.DeconV(
    adata, cell_type_key="labels",  # cell_type_key is the column key in adata.obs that holds the cell type annotations 
    dropout_type="separate",        # separate, shared, or None
    model_type="nb",                # Gamma, Beta, nb, lognormal, or static    
    device=device
)

decon.fit_reference(num_epochs=2000, lr=0.1, lrd=0.999, layer="X", fp_hack=True)

Here is what I got when I tried with the adata.X matrix and the changes you requested:

Warning: some p-values (2) were 0, scattering them around 232.3
Warning: some p-values (12) were 0, scattering them around 320.5
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[13], line 1
----> 1 decon.check_fit()

File ~/Softwares/deconv/DeconV/DeconV.py:146, in DeconV.check_fit(self, path)
    144 for i in range(self.n_labels):
    145     for j in range(self.n_labels):
--> 146         gene = res[f"{self.labels[i]} vs. rest"].sort_values("gene_score", ascending=False).index[0]
    147         ax[i,0].set_ylabel(gene)
    148         ax[self.n_labels-1,j].set_xlabel(self.labels[j])

TypeError: 'NoneType' object is not subscriptable

lutrarutra commented 1 month ago

I believe it's a bug I missed. I will fix it over the weekend and get back to you.

WanderingHedgie commented 1 month ago

Ok, thank you

WanderingHedgie commented 1 month ago

I also find the loss pretty high (always on the order of e+03): [screenshot]

lutrarutra commented 1 month ago

I pushed an update to fix the bug. In fact, the bug was already fixed; I had just forgotten to push it.

Regarding the loss being high, that's normal. There are a lot of samples (n_genes x n_cells). As long as it goes down during training, it doesn't matter.
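
To illustrate the scale argument, here is a toy sketch. It assumes the reported loss is a sum over every (gene, cell) entry, and it uses a unit-variance Gaussian negative log-likelihood purely as a stand-in; it is not DeconV's actual objective, and the sizes n_genes and n_cells are made up.

```python
import math
import random


def summed_gaussian_nll(values, mu=0.0):
    """Unit-variance Gaussian negative log-likelihood, summed over all entries."""
    const = 0.5 * math.log(2 * math.pi)
    return sum(const + 0.5 * (x - mu) ** 2 for x in values)


random.seed(0)
n_genes, n_cells = 200, 50  # made-up sizes for the illustration
values = [random.gauss(0.0, 1.0) for _ in range(n_genes * n_cells)]

total = summed_gaussian_nll(values)
per_entry = total / len(values)

# The summed loss is large only because it has n_genes * n_cells terms;
# the per-entry value stays small even for a well-fitting model.
print(f"total={total:.1f}  per_entry={per_entry:.3f}")
```

Under that assumption, the absolute value mostly reflects the number of entries, so the trend over epochs is the more informative signal.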

WanderingHedgie commented 1 month ago

Hi! OK, thank you for the explanation!

I tested again with the same parameters as in the screenshot, and here is what I got:

Warning: some p-values (2) were 0, scattering them around 232.3
Warning: some p-values (12) were 0, scattering them around 320.5
DE: added results to 'adata.uns['de']['labels']'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[11], line 1
----> 1 decon.check_fit()

File ~/Softwares/deconv/DeconV/DeconV.py:146, in DeconV.check_fit(self, path)
    144 for i in range(self.n_labels):
    145     for j in range(self.n_labels):
--> 146         gene = res[f"{self.labels[i]} vs. rest"].sort_values("gene_score", ascending=False).index[0]
    147         ax[i,0].set_ylabel(gene)
    148         ax[self.n_labels-1,j].set_xlabel(self.labels[j])

TypeError: 'NoneType' object is not subscriptable

lutrarutra commented 1 month ago

I think this is because you removed copy=True from the tl.rank_marker_genes call. Add it back (or re-clone the repository without your edits) and test again.
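
To show why a missing copy=True can surface as "'NoneType' object is not subscriptable", here is a toy sketch of the convention scanpy follows, where a function writes its results in place and returns None unless copy=True is passed. The function rank_groups and the dict-based store are hypothetical stand-ins, not the real DeconV code.

```python
def rank_groups(store, groupby, copy=False):
    """Toy stand-in: return the results if copy=True, else write in place and return None."""
    results = {
        f"{label} vs. rest": {"top_gene": f"gene_{i}"}
        for i, label in enumerate(store["labels"])
    }
    if copy:
        return results
    store[f"rank_genes_{groupby}"] = results  # in-place write, nothing returned
    return None


store = {"labels": ["B", "T"]}

res = rank_groups(store, "labels")             # copy omitted
print(res is None)                             # True: res["B vs. rest"] would raise TypeError

res = rank_groups(store, "labels", copy=True)  # copy requested
print(res["B vs. rest"]["top_gene"])           # gene_0
```

In the first call the results still exist, but only inside the store, which would match the "DE: added results to ..." message appearing right before the error.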