tanaylab / metacells

Metacells - Single-cell RNA Sequencing Analysis
MIT License

Assertion error #53

Closed SalimMegat closed 1 year ago

SalimMegat commented 1 year ago

Hi,

I am trying to clean up the data, but I keep getting an assertion error and I do not understand why.

Thanks for the help,

Best,

Salim.

set als.var[properly_sampled_gene]: 17888 true (99.97%) out of 17893 bools
set als.var[excluded_gene]: 13 true (0.07265%) out of 17893 bools

AssertionError                            Traceback (most recent call last)
File :1

File ~/.conda/envs/metacells/lib/python3.8/site-packages/metacells/utilities/logging.py:373, in logged.<locals>.wrap.<locals>.wrapper(*args, **kwargs)
    368     if log_value is not None:
    369         logger().log(
    370             param_level, "%swith %s: %s", INDENT_SPACES[: 2 * INDENT_LEVEL], name, log_value
    371         )
--> 373     return function(*args, **kwargs)
    375 finally:
    376     if logger().isEnabledFor(step_level):

File ~/.conda/envs/metacells/lib/python3.8/site-packages/metacells/pipeline/clean.py:101, in analyze_clean_genes(adata, what, properly_sampled_min_gene_total, noisy_lonely_max_sampled_cells, noisy_lonely_downsample_min_samples, noisy_lonely_downsample_min_cell_quantile, noisy_lonely_downsample_max_cell_quantile, noisy_lonely_min_gene_total, noisy_lonely_min_gene_normalized_variance, noisy_lonely_max_gene_similarity, excluded_gene_names, excluded_gene_patterns, random_seed)
     98 else:
     99     excluded_genes_mask = None
--> 101 tl.find_noisy_lonely_genes(
    102     adata,
    103     what,
    104     excluded_genes_mask=excluded_genes_mask,
    105     max_sampled_cells=noisy_lonely_max_sampled_cells,
    106     downsample_min_samples=noisy_lonely_downsample_min_samples,
    107     downsample_min_cell_quantile=noisy_lonely_downsample_min_cell_quantile,
    108     downsample_max_cell_quantile=noisy_lonely_downsample_max_cell_quantile,
    109     min_gene_total=noisy_lonely_min_gene_total,
    110     min_gene_normalized_variance=noisy_lonely_min_gene_normalized_variance,
    111     max_gene_similarity=noisy_lonely_max_gene_similarity,
    112     random_seed=random_seed,
    113 )

File ~/.conda/envs/metacells/lib/python3.8/site-packages/metacells/utilities/logging.py:373, in logged.<locals>.wrap.<locals>.wrapper(*args, **kwargs)
    368     if log_value is not None:
    369         logger().log(
    370             param_level, "%swith %s: %s", INDENT_SPACES[: 2 * INDENT_LEVEL], name, log_value
    371         )
--> 373     return function(*args, **kwargs)
    375 finally:
    376     if logger().isEnabledFor(step_level):

File ~/.conda/envs/metacells/lib/python3.8/site-packages/metacells/tools/noisy_lonely.py:119, in find_noisy_lonely_genes(adata, what, excluded_genes_mask, max_sampled_cells, downsample_min_samples, downsample_min_cell_quantile, downsample_max_cell_quantile, min_gene_total, min_gene_normalized_variance, max_gene_similarity, random_seed, inplace)
    116 else:
    117     i_data = s_data
--> 119 downsample_cells(
    120     i_data,
    121     what,
    122     downsample_min_samples=downsample_min_samples,
    123     downsample_min_cell_quantile=downsample_min_cell_quantile,
    124     downsample_max_cell_quantile=downsample_max_cell_quantile,
    125     random_seed=random_seed,
    126 )
    128 find_high_total_genes(i_data, "downsampled", min_gene_total=min_gene_total)
    130 results = filter_data(
    131     i_data, name="high_total", top_level=False, track_var=track_var, var_masks=["high_total_gene"]
    132 )

File ~/.conda/envs/metacells/lib/python3.8/site-packages/metacells/utilities/logging.py:373, in logged.<locals>.wrap.<locals>.wrapper(*args, **kwargs)
    368     if log_value is not None:
    369         logger().log(
    370             param_level, "%swith %s: %s", INDENT_SPACES[: 2 * INDENT_LEVEL], name, log_value
    371         )
--> 373     return function(*args, **kwargs)
    375 finally:
    376     if logger().isEnabledFor(step_level):

File ~/.conda/envs/metacells/lib/python3.8/site-packages/metacells/tools/downsample.py:97, in downsample_cells(adata, what, downsample_min_cell_quantile, downsample_min_samples, downsample_max_cell_quantile, random_seed, inplace)
     94 ut.log_calc("samples", samples)
     96 data = ut.get_vo_proper(adata, what, layout="row_major")
---> 97 assert ut.shaped_dtype(data) == "float32"
     98 downsampled = ut.downsample_matrix(data, per="row", samples=samples, random_seed=random_seed)
     99 if inplace:

AssertionError:

orenbenkiki commented 1 year ago

The current code insists that the data type of the matrix be float32; that is what the failing assertion in downsample_cells checks. This is weird, I know. We'll fix it when migrating to Daf.

Check the data type of the X of your data, and convert it to float32 (e.g. using the matrix's astype method). This should solve the problem.
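For anyone hitting the same assertion, here is a minimal sketch of that conversion. It assumes X is a NumPy array or a SciPy sparse matrix (both expose dtype and astype); the helper name as_float32 is made up for illustration and is not part of metacells, and the small counts array is only a stand-in for adata.X.

```python
import numpy as np


def as_float32(matrix):
    """Return the matrix as float32, converting only if it is not float32 already.

    Hypothetical helper for illustration; not part of the metacells API.
    """
    if matrix.dtype == np.float32:
        return matrix
    return matrix.astype(np.float32)


# Stand-in for adata.X: raw counts are often stored as integers or float64.
counts = np.array([[0, 3, 1], [2, 0, 5]], dtype=np.int64)
print(counts.dtype)  # int64

converted = as_float32(counts)
print(converted.dtype)  # float32

# With a real AnnData object the same idea would be (not run here):
#     adata.X = as_float32(adata.X)
```

Doing the conversion once, right after loading the data and before calling analyze_clean_genes, should keep the later downsampling step from tripping on the dtype check.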

SalimMegat commented 1 year ago

Yeah, I actually figured it was an issue with the data type... Many thanks!

Salim.