tanaylab / metacells

Metacells - Single-cell RNA Sequencing Analysis
MIT License

Key error "clean genes" #54

Closed: SalimMegat closed this issue 1 year ago

SalimMegat commented 1 year ago

Hi, sorry, but I am still having an issue with picking the right cells. Here is the error I get when I run the command:

    mc.pl.analyze_clean_cells(
        full,
        properly_sampled_min_cell_total=PROPERLY_SAMPLED_MIN_CELL_TOTAL,
        properly_sampled_max_cell_total=PROPERLY_SAMPLED_MAX_CELL_TOTAL,
        properly_sampled_max_excluded_genes_fraction=PROPERLY_SAMPLED_MAX_EXCLUDED_GENES_FRACTION,
    )

    KeyError                                  Traceback (most recent call last)
    File :1

    File ~/.conda/envs/metacells/lib/python3.8/site-packages/metacells/utilities/logging.py:373, in logged.<locals>.wrap.<locals>.wrapper(*args, **kwargs)
        368 if log_value is not None:
        369 logger().log(
        370 param_level, "%swith %s: %s", INDENT_SPACES[: 2 * INDENT_LEVEL], name, log_value
        371 )
    --> 373 return function(*args, **kwargs)
        375 finally:
        376 if logger().isEnabledFor(step_level):

    File ~/.conda/envs/metacells/lib/python3.8/site-packages/metacells/pipeline/clean.py:189, in analyze_clean_cells(adata, what, properly_sampled_min_cell_total, properly_sampled_max_cell_total, properly_sampled_max_excluded_genes_fraction)
        187 excluded_adata: Optional[AnnData] = None
        188 if properly_sampled_max_excluded_genes_fraction is not None:
    --> 189 excluded_genes = tl.filter_data(adata, name="dirty_genes", top_level=False, var_masks=["~clean_gene"])
        190 if excluded_genes is not None:
        191 excluded_adata = excluded_genes[0]

    File ~/.conda/envs/metacells/lib/python3.8/site-packages/metacells/utilities/logging.py:373, in logged.<locals>.wrap.<locals>.wrapper(*args, **kwargs)
        368 if log_value is not None:
        369 logger().log(
        370 param_level, "%swith %s: %s", INDENT_SPACES[: 2 * INDENT_LEVEL], name, log_value
        371 )
    --> 373 return function(*args, **kwargs)
        375 finally:
        376 if logger().isEnabledFor(step_level):

    File ~/.conda/envs/metacells/lib/python3.8/site-packages/metacells/tools/filter.py:94, in filter_data(adata, obs_masks, var_masks, mask_obs, mask_var, invert_obs, invert_var, track_obs, track_var, name, top_level)
         92 ut.set_o_data(adata, mask_var, var_mask)
         93 else:
    ---> 94 mask = combine_masks(adata, var_masks, invert=invert_var, to=mask_var)
         95 if mask is None:
         96 assert mask_var is not None

    File ~/.conda/envs/metacells/lib/python3.8/site-packages/metacells/utilities/logging.py:373, in logged.<locals>.wrap.<locals>.wrapper(*args, **kwargs)
        368 if log_value is not None:
        369 logger().log(
        370 param_level, "%swith %s: %s", INDENT_SPACES[: 2 * INDENT_LEVEL], name, log_value
        371 )
    --> 373 return function(*args, **kwargs)
        375 finally:
        376 if logger().isEnabledFor(step_level):

    File ~/.conda/envs/metacells/lib/python3.8/site-packages/metacells/tools/mask.py:89, in combine_masks(adata, masks, invert, to)
         87 else:
         88 if must_exist:
    ---> 89 raise KeyError(f"unknown mask data: {mask_name}")
         90 continue
         92 if mask.dtype != "bool":

    KeyError: 'unknown mask data: clean_gene'

Here is the output of mc.pl.analyze_clean_genes:

    set als.var[properly_sampled_gene]: 17888 true (99.97%) out of 17893 bools
    set als.var[excluded_gene]: 13 true (0.07265%) out of 17893 bools
    set als.var[noisy_lonely_gene]: 0 true (0%) out of 17893 bools

Also, I am trying to compute the UMIs of the excluded genes in each cell with the function listed in the notebook, but it does not seem to exist in this version of the package?

Sorry for the harassment,

Many thanks,

Salim.

orenbenkiki commented 1 year ago

You can only analyze clean cells after creating a clean_gene mask, because part of the computation takes into account the fraction of each cell's total UMIs that comes from the genes outside this mask (the excluded, non-clean genes).
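To illustrate why the mask has to exist first, here is a minimal sketch in plain Python. It is not the metacells implementation: the UMI matrix is a toy list of lists, `var_masks` is a hypothetical dictionary standing in for the per-gene annotations, and the helper name `excluded_fraction_per_cell` is made up for this example. It only mirrors the behavior visible in the traceback, where the lookup of `clean_gene` fails with a KeyError when the mask was never created.

```python
from typing import Dict, List


def excluded_fraction_per_cell(
    umis: List[List[int]],
    var_masks: Dict[str, List[bool]],
    mask_name: str = "clean_gene",
) -> List[float]:
    """Fraction of each cell's UMIs that come from genes NOT in the mask.

    Hypothetical helper for illustration only; rows are cells, columns are genes.
    """
    if mask_name not in var_masks:
        # Same failure mode as in the traceback: the mask was never created.
        raise KeyError(f"unknown mask data: {mask_name}")

    clean = var_masks[mask_name]
    fractions = []
    for cell in umis:
        total = sum(cell)
        excluded = sum(count for count, is_clean in zip(cell, clean) if not is_clean)
        fractions.append(excluded / total if total > 0 else 0.0)
    return fractions


# Toy data: 2 cells x 3 genes; the last gene is treated as excluded.
umis = [[8, 1, 1], [2, 2, 4]]

# Only an analysis mask exists, as in the reported output, so the lookup fails.
masks = {"excluded_gene": [False, False, True]}
try:
    excluded_fraction_per_cell(umis, masks)
except KeyError as error:
    print("KeyError:", error)

# Assumption for this toy example only: every gene that is not excluded is clean.
masks["clean_gene"] = [not flag for flag in masks["excluded_gene"]]
print(excluded_fraction_per_cell(umis, masks))
```

In the sketch the second call succeeds only because the `clean_gene` entry was added before it ran, which is the ordering the reply above describes.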

Edit: It seems you are running v0.8? You should switch to 0.9, and there are vignettes describing the process for it.
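A quick way to confirm which version is installed before following a vignette is to query the package metadata. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8); the helper names `installed_version` and `release_tuple` are hypothetical and chosen for this example, and the comparison handles only simple dotted numeric versions.

```python
from importlib import metadata
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def release_tuple(version: str) -> Tuple[int, ...]:
    """Turn the leading numeric components of '0.8.0' into (0, 8, 0) for simple comparisons."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # Stop at the first non-numeric component, e.g. '0rc1'.
        parts.append(int(piece))
    return tuple(parts)


version = installed_version("metacells")
if version is None:
    print("metacells is not installed in this environment")
elif release_tuple(version) < (0, 9):
    print(f"metacells {version} is older than 0.9; consider upgrading")
else:
    print(f"metacells {version}")
```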

SalimMegat commented 1 year ago

My bad, I was using 0.8.0. So sorry, and thank you for the quick response!

Best,

Salim.