Teichlab / celltypist

A tool for semi-automatic cell type classification
https://www.celltypist.org/
MIT License

error while plotting predictions: cannot find keys #22

Closed ersgupta closed 2 years ago

ersgupta commented 2 years ago

Hi,

I am trying to plot the predictions as a dotplot. It works for one dataset but does not seem to work for another. I have checked to make sure the respective columns exist in predictions.predicted_labels.

celltypist.dotplot(predictions, use_as_reference='seurat_clusters', use_as_prediction='majority_voting')

Error:

KeyError                                  Traceback (most recent call last)
----> 7 celltypist.dotplot(predictions, use_as_reference='seurat_clusters', use_as_prediction="majority_voting")
.../scanpy_py37/lib/python3.7/site-packages/celltypist/plot.py in dotplot(predictions, use_as_reference, use_as_prediction, prediction_order, reference_order, filter_prediction, cmap, vmin, vmax, colorbar_title, dot_min, dot_max, smallest_dot, size_title, swap_axes, title, figsize, show, save, ax, return_fig, **kwds)
    156     _adata.obs['_pred'] = dot_size_df.index
    157     #DotPlot
--> 158     dp = sc.pl.DotPlot(_adata, dot_size_df.columns, '_pred', title = title, figsize = figsize, dot_color_df = dot_color_df, dot_size_df = dot_size_df, ax = ax, vmin = vmin, vmax = vmax, **kwds)
    159     if swap_axes:
    160         dp.swap_axes()

.../scanpy_py37/lib/python3.7/site-packages/scanpy/plotting/_dotplot.py in __init__(self, adata, var_names, groupby, use_raw, log, num_categories, categories_order, title, figsize, gene_symbols, var_group_positions, var_group_labels, var_group_rotation, layer, expression_cutoff, mean_only_expressed, standard_scale, dot_color_df, dot_size_df, ax, vmin, vmax, vcenter, norm, **kwds)
    151             vcenter=vcenter,
    152             norm=norm,
--> 153             **kwds,
    154         )
    155 

.../scanpy_py37/lib/python3.7/site-packages/scanpy/plotting/_baseplot_class.py in __init__(self, adata, var_names, groupby, use_raw, log, num_categories, categories_order, title, figsize, gene_symbols, var_group_positions, var_group_labels, var_group_rotation, layer, ax, vmin, vmax, vcenter, norm, **kwds)
    117             num_categories,
    118             layer=layer,
--> 119             gene_symbols=gene_symbols,
    120         )
    121         if len(self.categories) > self.MAX_NUM_CATEGORIES:
.../scanpy_py37/lib/python3.7/site-packages/scanpy/plotting/_anndata.py in _prepare_dataframe(adata, var_names, groupby, use_raw, log, num_categories, layer, gene_symbols)
   1918     keys = list(groupby) + list(np.unique(var_names))
   1919     obs_tidy = get.obs_df(
-> 1920         adata, keys=keys, layer=layer, use_raw=use_raw, gene_symbols=gene_symbols
   1921     )
   1922     assert np.all(np.array(keys) == np.array(obs_tidy.columns))

.../scanpy_py37/lib/python3.7/site-packages/scanpy/get/get.py in obs_df(adata, keys, obsm_keys, layer, gene_symbols, use_raw)
    276         keys,
    277         alias_index=alias_index,
--> 278         use_raw=use_raw,
    279     )
    280 

.../scanpy_py37/lib/python3.7/site-packages/scanpy/get/get.py in _check_indices(dim_df, alt_index, dim, keys, alias_index, use_raw)
    166     if len(not_found) > 0:
    167         raise KeyError(
--> 168             f"Could not find keys '{not_found}' in columns of `adata.{dim}` or in"
    169             f" {alt_repr}.{alt_search_repr}."
    170         )

KeyError: "Could not find keys '['0', '1', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '2', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '3', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '4', '40', '41', '42', '43', '44', '45', '46', '47', '48', '5', '6', '7', '8', '9']' in columns of `adata.obs` or in adata.var_names."

Any help would be great.

Thanks Saurabh

ChuanXu1 commented 2 years ago

@ersgupta, I'm not sure what's causing this off the top of my head.

If you could try df1, df2 = celltypist.plot._get_fraction_prob_df(predictions, 'seurat_clusters', 'majority_voting') and show me what df1 and df2 look like, that would help me spot the source of the error.

ersgupta commented 2 years ago

@ChuanXu1, thanks. I ran the command with 'majority_voting' and with 'predicted_labels'. df1 and df2 are pandas DataFrames with matching dimensions in each case: 23 x 49 (majority_voting) and 29 x 49 (predicted_labels), 29 being the # clusters in the reference dataset and 49 in the query dataset.

df1
refer        32        5         28        23        2         0         8   \
pred                                                                          
0      0.658318  0.606526  0.568102  0.556452  0.539310  0.535056  0.526742   
1      0.016100  0.026104  0.022119  0.030242  0.035672  0.033028  0.041329   
10     0.000000  0.002687  0.002328  0.002016  0.002582  0.002434  0.008914   
11     0.000000  0.000768  0.001164  0.000000  0.000469  0.001043  0.004457   
12     0.059034  0.027639  0.013970  0.025202  0.015724  0.043806  0.066045   
13     0.000000  0.000000  0.000000  0.000000  0.000000  0.000116  0.000000   
14     0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000   
15     0.000000  0.000384  0.039581  0.000000  0.000000  0.000116  0.000000   
16     0.000000  0.000000  0.000000  0.000000  0.000235  0.000116  0.000000   
17     0.000000  0.000000  0.000000  0.001008  0.000000  0.000116  0.000810   
18     0.001789  0.000384  0.003492  0.000000  0.000000  0.000116  0.000000   
19     0.007156  0.002303  0.005821  0.001008  0.002112  0.001738  0.009319   
2      0.005367  0.004607  0.005821  0.008065  0.008683  0.004404  0.003647   
20     0.000000  0.000384  0.000000  0.000000  0.000000  0.000116  0.001621   
21     0.000000  0.000384  0.002328  0.002016  0.000469  0.000811  0.029173   
22     0.007156  0.001919  0.003492  0.002016  0.002816  0.004520  0.006483   
23     0.000000  0.000384  0.001164  0.001008  0.001173  0.001275  0.003241   
24     0.000000  0.000000  0.000000  0.000000  0.000000  0.000232  0.000405   
25     0.000000  0.000000  0.000000  0.000000  0.000000  0.000695  0.001621   
26     0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000   
27     0.000000  0.000000  0.000000  0.000000  0.000000  0.000116  0.000000   
28     0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000405   
3      0.028623  0.063340  0.050058  0.050403  0.075334  0.054815  0.047407   
4      0.042934  0.112092  0.144354  0.117944  0.106548  0.178932  0.136953   
5      0.033989  0.059885  0.065192  0.052419  0.046233  0.061073  0.043760   
6      0.000000  0.001536  0.000000  0.001008  0.000704  0.000348  0.001621   
7      0.000000  0.000000  0.000000  0.000000  0.000469  0.000116  0.000000   
8      0.001789  0.000384  0.003492  0.002016  0.004928  0.002086  0.008104   
9      0.137746  0.088292  0.067520  0.147177  0.156536  0.072778  0.057942   

refer        3         16        15  ...   42        30        22        38  \
pred                                 ...                                      
0      0.455960  0.446793  0.404225  ...  0.0  0.002503  0.051456  0.010101   
1      0.062247  0.029883  0.046479  ...  0.0  0.000000  0.012621  0.000000   
10     0.014317  0.001458  0.000704  ...  0.0  0.002503  0.025243  0.015152   
11     0.002801  0.001458  0.000000  ...  0.0  0.000000  0.000000  0.000000   
12     0.039216  0.034257  0.013380  ...  0.0  0.002503  0.001942  0.000000   
13     0.001245  0.000000  0.000000  ...  0.0  0.001252  0.000971  0.000000   
14     0.000000  0.000000  0.000000  ...  0.0  0.000000  0.000000  0.000000   
15     0.000622  0.000729  0.000000  ...  0.0  0.000000  0.000000  0.000000   
16     0.000622  0.000000  0.000000  ...  0.0  0.000000  0.000000  0.000000   
17     0.000000  0.000000  0.000000  ...  0.0  0.000000  0.054369  0.000000   
18     0.000000  0.000000  0.000000  ...  0.0  0.000000  0.000000  0.000000   
19     0.005291  0.006560  0.100704  ...  0.0  0.001252  0.000971  0.005051   
2      0.009337  0.007289  0.007042  ...  0.0  0.002503  0.308738  0.000000   
20     0.001556  0.000000  0.000000  ...  0.0  0.001252  0.001942  0.000000   
21     0.017429  0.002187  0.001408  ...  0.0  0.007509  0.001942  0.000000   
22     0.006536  0.024052  0.004225  ...  0.0  0.000000  0.003883  0.000000   
23     0.002490  0.000729  0.000000  ...  0.0  0.000000  0.000971  0.000000   
24     0.000311  0.000000  0.000000  ...  1.0  0.947434  0.000000  0.000000   
25     0.000311  0.000000  0.000000  ...  0.0  0.000000  0.000000  0.000000   
26     0.000000  0.000000  0.000000  ...  0.0  0.000000  0.484466  0.000000   
27     0.000000  0.000000  0.000000  ...  0.0  0.000000  0.000000  0.000000   
28     0.000000  0.000000  0.000000  ...  0.0  0.000000  0.000000  0.964646   
3      0.066293  0.042274  0.059859  ...  0.0  0.005006  0.010680  0.005051   
4      0.114535  0.214286  0.201408  ...  0.0  0.001252  0.012621  0.000000   
5      0.050731  0.062682  0.074648  ...  0.0  0.002503  0.005825  0.000000   
6      0.001245  0.000000  0.000000  ...  0.0  0.001252  0.000971  0.000000   
7      0.000622  0.000000  0.000000  ...  0.0  0.013767  0.000000  0.000000   
8      0.010271  0.003644  0.000000  ...  0.0  0.003755  0.007767  0.000000   
9      0.136010  0.121720  0.085915  ...  0.0  0.003755  0.012621  0.000000   

refer        6         36        21        34        9       20  
pred                                                             
0      0.132435  0.000000  0.037847  0.006494  0.010412  0.0192  
1      0.025377  0.000000  0.009251  0.000000  0.003037  0.0008  
10     0.010309  0.000000  0.015980  0.025974  0.035575  0.0304  
11     0.000793  0.000000  0.000841  0.087662  0.000000  0.0000  
12     0.097938  0.000000  0.004205  0.006494  0.000000  0.0008  
13     0.000000  0.000000  0.010934  0.003247  0.000000  0.0000  
14     0.000000  0.000000  0.000000  0.003247  0.000000  0.0000  
15     0.000397  0.000000  0.000000  0.000000  0.000434  0.0000  
16     0.000397  0.000000  0.001682  0.055195  0.000000  0.0000  
17     0.000000  0.000000  0.000000  0.000000  0.000000  0.0000  
18     0.011102  0.000000  0.000000  0.103896  0.000000  0.0000  
19     0.000793  0.000000  0.000841  0.009740  0.001735  0.0016  
2      0.010706  0.003846  0.014298  0.003247  0.001735  0.0016  
20     0.000000  0.000000  0.013457  0.012987  0.000000  0.0000  
21     0.001983  0.000000  0.025231  0.074675  0.000000  0.0000  
22     0.020222  0.000000  0.000841  0.003247  0.000000  0.0000  
23     0.000397  0.000000  0.000000  0.000000  0.000000  0.0000  
24     0.000000  0.003846  0.127839  0.000000  0.000000  0.0000  
25     0.000000  0.000000  0.000000  0.009740  0.000000  0.0000  
26     0.000000  0.000000  0.000000  0.000000  0.000000  0.0000  
27     0.000000  0.000000  0.000000  0.000000  0.000000  0.0000  
28     0.000397  0.000000  0.000000  0.000000  0.000000  0.0000  
3      0.052736  0.000000  0.013457  0.003247  0.003037  0.0024  
4      0.519429  0.000000  0.005887  0.012987  0.003905  0.0064  
5      0.082474  0.000000  0.003364  0.000000  0.000434  0.0008  
6      0.000397  0.992308  0.000841  0.000000  0.000434  0.0000  
7      0.000397  0.000000  0.708158  0.577922  0.000000  0.0000  
8      0.004362  0.000000  0.002523  0.000000  0.937093  0.9312  
9      0.026963  0.000000  0.002523  0.000000  0.002169  0.0048

df2
refer        32        5         28        23        2         0         8   \
pred                                                                          
0      0.965809  0.948292  0.944483  0.947334  0.922767  0.948084  0.898341   
1      0.886550  0.860238  0.845232  0.849214  0.831936  0.880587  0.829372   
10     0.000000  0.786719  0.839367  0.359204  0.843823  0.691223  0.861046   
11     0.000000  0.968777  0.969953  0.000000  0.766236  0.899411  0.683387   
12     0.986399  0.928306  0.892552  0.871506  0.899762  0.936753  0.925805   
13     0.000000  0.000000  0.000000  0.000000  0.000000  0.987605  0.000000   
14     0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000   
15     0.000000  0.232390  0.876529  0.000000  0.000000  0.897506  0.000000   
16     0.000000  0.000000  0.000000  0.000000  0.355944  0.124632  0.000000   
17     0.000000  0.000000  0.000000  0.960677  0.000000  0.985546  0.774046   
18     0.820819  0.961904  0.728207  0.000000  0.000000  0.500547  0.000000   
19     0.932710  0.860029  0.924264  0.614199  0.778727  0.885887  0.784612   
2      0.848432  0.601666  0.631763  0.765619  0.650762  0.776042  0.546454   
20     0.000000  0.198825  0.000000  0.000000  0.000000  0.382079  0.917699   
21     0.000000  0.982235  0.892606  0.760580  0.960442  0.830323  0.829415   
22     0.946244  0.806765  0.742423  0.956148  0.883609  0.898583  0.869285   
23     0.000000  0.982128  0.750404  0.996796  0.875415  0.843705  0.798186   
24     0.000000  0.000000  0.000000  0.000000  0.000000  0.977217  0.940842   
25     0.000000  0.000000  0.000000  0.000000  0.000000  0.783579  0.802243   
26     0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000   
27     0.000000  0.000000  0.000000  0.000000  0.000000  0.863327  0.000000   
28     0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.993291   
3      0.910973  0.826976  0.818633  0.747797  0.764451  0.835207  0.763416   
4      0.923150  0.914572  0.937952  0.901560  0.871839  0.915423  0.879200   
5      0.949060  0.884110  0.850637  0.885552  0.837089  0.885396  0.812224   
6      0.000000  0.959709  0.000000  0.991092  0.559784  0.916381  0.572463   
7      0.000000  0.000000  0.000000  0.000000  0.525894  0.842679  0.000000   
8      0.921094  0.822523  0.954555  0.992354  0.794097  0.793627  0.894249   
9      0.958550  0.941142  0.923955  0.947293  0.948092  0.941160  0.930160   

refer        3         16        15  ...       42        30        22  \
pred                                 ...                                
0      0.902354  0.933878  0.941323  ...  0.00000  0.999953  0.903707   
1      0.866204  0.882260  0.834244  ...  0.00000  0.000000  0.893350   
10     0.863571  0.709540  0.468696  ...  0.00000  0.980372  0.954141   
11     0.728036  0.944513  0.000000  ...  0.00000  0.000000  0.000000   
12     0.920262  0.931447  0.852006  ...  0.00000  0.999853  0.545751   
13     0.746761  0.000000  0.000000  ...  0.00000  1.000000  0.705615   
14     0.000000  0.000000  0.000000  ...  0.00000  0.000000  0.000000   
15     0.810965  0.992943  0.000000  ...  0.00000  0.000000  0.000000   
16     0.509520  0.000000  0.000000  ...  0.00000  0.000000  0.000000   
17     0.000000  0.000000  0.000000  ...  0.00000  0.000000  0.959697   
18     0.000000  0.000000  0.000000  ...  0.00000  0.000000  0.000000   
19     0.718166  0.848843  0.810799  ...  0.00000  1.000000  0.999774   
2      0.653567  0.541287  0.797159  ...  0.00000  0.649194  0.876038   
20     0.692546  0.000000  0.000000  ...  0.00000  0.981132  0.907484   
21     0.753680  0.926417  0.703139  ...  0.00000  0.647112  0.860861   
22     0.922892  0.956881  0.864125  ...  0.00000  0.000000  0.962811   
23     0.689861  0.773163  0.000000  ...  0.00000  0.000000  0.801867   
24     0.945073  0.000000  0.000000  ...  0.99946  0.970967  0.000000   
25     0.999784  0.000000  0.000000  ...  0.00000  0.000000  0.000000   
26     0.000000  0.000000  0.000000  ...  0.00000  0.000000  0.951527   
27     0.000000  0.000000  0.000000  ...  0.00000  0.000000  0.000000   
28     0.000000  0.000000  0.000000  ...  0.00000  0.000000  0.000000   
3      0.791107  0.849579  0.794963  ...  0.00000  0.763061  0.703529   
4      0.854674  0.925256  0.900698  ...  0.00000  0.592713  0.801051   
5      0.808756  0.878435  0.886966  ...  0.00000  0.682470  0.933008   
6      0.673831  0.000000  0.000000  ...  0.00000  0.976354  0.997900   
7      0.532620  0.000000  0.000000  ...  0.00000  0.922624  0.000000   
8      0.835278  0.882204  0.000000  ...  0.00000  0.995094  0.932741   
9      0.941952  0.970117  0.915104  ...  0.00000  0.782747  0.985773   

refer        38        6         36        21        34        9         20  
pred                                                                         
0      0.978322  0.913048  0.000000  0.648222  0.471485  0.791584  0.936011  
1      0.000000  0.927497  0.000000  0.603467  0.000000  0.761626  0.994911  
10     0.986820  0.937392  0.000000  0.525245  0.692694  0.895098  0.921507  
11     0.000000  0.926083  0.000000  0.099093  0.856720  0.000000  0.000000  
12     0.000000  0.966130  0.000000  0.971357  0.850548  0.000000  0.292844  
13     0.000000  0.000000  0.000000  0.837263  1.000000  0.000000  0.000000  
14     0.000000  0.000000  0.000000  0.000000  0.819100  0.000000  0.000000  
15     0.000000  0.999997  0.000000  0.000000  0.000000  0.076491  0.000000  
16     0.000000  0.224512  0.000000  0.974763  0.998662  0.000000  0.000000  
17     0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  
18     0.000000  0.867637  0.000000  0.000000  0.999613  0.000000  0.000000  
19     1.000000  0.904442  0.000000  0.997380  0.948926  0.984928  0.701797  
2      0.000000  0.755984  0.741267  0.507369  0.515607  0.633443  0.446354  
20     0.000000  0.000000  0.000000  0.934429  0.996754  0.000000  0.000000  
21     0.000000  0.862564  0.000000  0.484563  0.811229  0.000000  0.000000  
22     0.000000  0.897069  0.000000  0.161369  0.876750  0.000000  0.000000  
23     0.000000  0.899411  0.000000  0.000000  0.000000  0.000000  0.000000  
24     0.000000  0.000000  0.867081  0.840746  0.000000  0.000000  0.000000  
25     0.000000  0.000000  0.000000  0.000000  0.963699  0.000000  0.000000  
26     0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  
27     0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  
28     0.968427  0.557817  0.000000  0.000000  0.000000  0.000000  0.000000  
3      0.545563  0.857892  0.000000  0.739810  0.624453  0.595503  0.698857  
4      0.000000  0.961181  0.000000  0.756561  0.974402  0.740595  0.778212  
5      0.000000  0.919500  0.000000  0.952483  0.000000  0.134070  0.749154  
6      0.000000  0.429840  0.957648  0.998860  0.000000  0.196596  0.000000  
7      0.000000  0.956970  0.000000  0.913621  0.960916  0.000000  0.000000  
8      0.000000  0.897939  0.000000  0.388946  0.000000  0.986432  0.990842  
9      0.000000  0.929880  0.000000  0.740042  0.000000  0.904171  0.919951
ChuanXu1 commented 2 years ago

@ersgupta, in an AnnData, cluster labels are usually assumed to be strings ("1" rather than 1), but your seurat_clusters column is of type int. You can convert it with adata.obs['seurat_clusters'] = adata.obs['seurat_clusters'].astype(str) or adata.obs['seurat_clusters'] = adata.obs['seurat_clusters'].astype('category') before using celltypist.annotate to get the predictions.
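A minimal sketch of the dtype mismatch and the suggested fix, using a bare pandas DataFrame to stand in for adata.obs (the column values here are illustrative, not taken from the datasets above):

```python
import pandas as pd

# adata.obs is a pandas DataFrame; an int-typed cluster column
# reproduces the lookup failure: the plotting code searches for
# string keys like '0', '1', ... which never match int values.
obs = pd.DataFrame({"seurat_clusters": [0, 1, 2, 1, 0]})
assert "0" not in set(obs["seurat_clusters"])  # string key misses int values

# Convert to string (then optionally to category) before calling
# celltypist.annotate, as suggested above.
obs["seurat_clusters"] = obs["seurat_clusters"].astype(str).astype("category")

assert "0" in set(obs["seurat_clusters"])      # string keys now resolve
assert obs["seurat_clusters"].dtype.name == "category"
```

The same .astype(str) / .astype('category') calls apply directly to adata.obs['seurat_clusters'], since adata.obs is itself a pandas DataFrame.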

ChuanXu1 commented 2 years ago

I have committed a change relating to this (dbfc3d13c3da8c47bade3a24bb640f947753abce). Plotting with int-type cluster labels should now be possible in the next version of CellTypist.

ersgupta commented 2 years ago

Thanks a lot. It worked just fine.