tanaylab / metacells

Metacells - Single-cell RNA Sequencing Analysis
MIT License

No explanation for AssertionError in mc.pl.divide_and_conquer_pipeline #42

Closed zvittorio closed 1 year ago

zvittorio commented 1 year ago

Dear metacells team,

First of all, thank you for bringing metacell to Python! I am trying to run the tool on a rather small dataset just to test it out. The AnnData object that I import has already undergone quality control and cell type annotation, so I did not perform any further cleaning of either cells or genes. I followed the tutorial, producing the "full" AnnData object and then "clean". "Clean" looks like this before running mc.pl.divide_and_conquer_pipeline:

AnnData object with n_obs × n_vars = 18080 × 4988
    obs: 'Study', 'Sample', 'Author_Annot', 'percent_mito', '_scvi_batch', '_scvi_labels', 'properly_sampled_cell', 'clean_cell', 'full_cell_index', 'cells_rare_gene_module', 'rare_cell', 'pre_cell_directs', 'cell_directs', 'pre_pile', 'pile', 'pre_candidate', 'candidate', 'pre_cell_deviant_votes', 'cell_deviant_votes', 'pre_dissolved', 'dissolved', 'pre_metacell', 'metacell', 'outlier'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'highly_variable_nbatches', 'highly_variable_intersection', 'properly_sampled_gene', 'excluded_gene', 'noisy_lonely_gene', 'clean_gene', 'full_gene_index', 'rare_gene', 'rare_gene_module', 'pre_high_total_gene', 'high_total_gene', 'pre_high_relative_variance_gene', 'high_relative_variance_gene', 'forbidden_gene', 'pre_feature_gene', 'feature_gene', 'pre_gene_deviant_votes', 'gene_deviant_votes', 'related_genes_module'
    uns: '__name__', 'pre_directs', 'directs'
    varp: 'related_genes_similarity'

Then, I run divide_and_conquer_pipeline in this way:

with mc.ut.progress_bar():
    mc.pl.divide_and_conquer_pipeline(clean,
                                      #forbidden_gene_names=forbidden_gene_names,
                                      target_metacell_size=1e4,
                                      random_seed=123)

as outlined in the tutorial. But I get an AssertionError with no message at all, after the computation of metacells reaches 53% of the process.

Compute metacells for rare gene modules...
Compute common metacells...
 53%|█████▎    [00:27]
--------------------------------------------------------------------------- 

AssertionError                            Traceback (most recent call last)
/home/vittorio/Desktop Loom/metacells_MiniMeta.ipynb Cell 23 in ()
      1 with mc.ut.progress_bar():
----> 2     mc.pl.divide_and_conquer_pipeline(clean,
      3                                       #forbidden_gene_names=forbidden_gene_names,
      4                                       target_metacell_size=1e4,
      5                                       random_seed=123)

File ~/.local/lib/python3.8/site-packages/metacells/utilities/logging.py:373, in logged.<locals>.wrap.<locals>.wrapper(*args, **kwargs)
    368             if log_value is not None:
    369                 logger().log(
    370                     param_level, "%swith %s: %s", INDENT_SPACES[: 2 * INDENT_LEVEL], name, log_value
    371                 )
--> 373     return function(*args, **kwargs)
    375 finally:
    376     if logger().isEnabledFor(step_level):

File ~/.local/lib/python3.8/site-packages/metacells/pipeline/divide_and_conquer.py:908, in divide_and_conquer_pipeline(adata, what, rare_max_gene_cell_fraction, rare_min_gene_maximum, rare_genes_similarity_method, rare_genes_cluster_method, rare_min_genes_of_modules, rare_min_cells_of_modules, rare_min_modules_size_factor, rare_min_module_correlation, rare_min_related_gene_fold_factor, rare_max_related_gene_increase_factor, rare_min_cell_module_total, rare_max_cells_factor_of_random_pile, rare_dissolve_min_robust_size_factor, rare_dissolve_min_convincing_size_factor, rare_dissolve_min_convincing_gene_fold_factor, feature_downsample_min_samples, feature_downsample_min_cell_quantile, feature_downsample_max_cell_quantile, feature_min_gene_total, feature_min_gene_top3, feature_min_gene_relative_variance, feature_gene_names, feature_gene_patterns, forbidden_gene_names, forbidden_gene_patterns, feature_correction, cells_similarity_value_normalization, cells_similarity_log_data, cells_similarity_method, groups_similarity_log_data, groups_similarity_method, min_target_pile_size, max_target_pile_size, target_metacells_in_pile, max_cell_size, max_cell_size_factor, cell_sizes, pile_min_split_size_factor, pile_min_robust_size_factor, pile_max_merge_size_factor, target_metacell_size, knn_k, min_knn_k, knn_balanced_ranks_factor, knn_incoming_degree_factor, knn_outgoing_degree_factor, min_seed_size_quantile, max_seed_size_quantile, candidates_cooldown_pass, candidates_cooldown_node, candidates_cooldown_phase, candidates_min_split_size_factor, candidates_max_merge_size_factor, candidates_min_metacell_cells, candidates_max_split_min_cut_strength, candidates_min_cut_seed_cells, must_complete_cover, final_max_outliers_levels, deviants_min_gene_fold_factor, deviants_abs_folds, deviants_max_gene_fraction, deviants_max_cell_fraction, dissolve_min_robust_size_factor, dissolve_min_convincing_size_factor, dissolve_min_convincing_gene_fold_factor, dissolve_min_metacell_cells, random_seed)
    905     logger.info("Compute common metacells...")
    906     logger.setLevel(log_level)
--> 908 is_divide_and_conquer = compute_divide_and_conquer_metacells(
    909     cdata,
    910     what,
    911     feature_downsample_min_samples=feature_downsample_min_samples,
...
    151 PROGRESS_END = PROGRESS_POSITION + PROGRESS_SIZE
--> 152 assert PROGRESS_END <= old_progress_end
    154 return old_progress_size, old_progress_end

AssertionError:

The error gives no direction on how to solve the problem. Thank you for any help you can provide,

Vittorio

UPDATE: I should have paid closer attention. The last traceback frame clearly shows that the problem has something to do with the progress bar. When I removed the with statement for the progress bar, the function ran to completion and wrote all the metacells metrics to adata.var. Still, it is odd that the progress bar function from the tutorial breaks. I will leave it up to the development team then :)
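
The blank `AssertionError:` at the end of the traceback is what Python prints when a bare `assert` fails without a message attached. Here is a minimal sketch of that behaviour, using made-up names and numbers (`check_slice_bare`, `check_slice_with_message` and `failure_text` are hypothetical helpers for illustration, not part of metacells):

```python
def check_slice_bare(end: int, old_end: int) -> None:
    # Bare assert: on failure the exception carries no message.
    assert end <= old_end


def check_slice_with_message(end: int, old_end: int) -> None:
    # Same check, but the failure explains itself.
    assert end <= old_end, f"progress end {end} exceeds enclosing end {old_end}"


def failure_text(check, end: int, old_end: int) -> str:
    """Return the message of the AssertionError raised by `check`.

    Returns an empty string if nothing was raised or the message was empty.
    """
    try:
        check(end, old_end)
    except AssertionError as error:
        return str(error)
    return ""


# Illustrative values only: 12 > 10, so both checks fail.
bare_message = failure_text(check_slice_bare, 12, 10)
explained_message = failure_text(check_slice_with_message, 12, 10)
print(repr(bare_message))
print(repr(explained_message))
```

With a bare assertion, the only clue is the source line shown in the last traceback frame, which is why reading that frame was what pointed to the progress bar here.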

Session information:

-----
anndata     0.8.0
scanpy      1.9.1
-----
PIL                 7.0.0
apport_python_hook  NA
asttokens           NA
backcall            0.2.0
beta_ufunc          NA
binom_ufunc         NA
cairo               1.16.2
cffi                1.15.0
colorama            0.4.3
cvxpy               1.3.1
cycler              0.10.0
cython_runtime      NA
dateutil            2.8.2
debugpy             1.6.0
decorator           5.1.1
ecos                2.0.12
entrypoints         0.3
executing           0.8.3
google              NA
h5py                3.7.0
hypergeom_ufunc     NA
icu                 2.4.2
igraph              0.9.10
ipykernel           6.13.0
ipywidgets          8.0.2
jedi                0.18.1
joblib              1.1.0
kiwisolver          1.4.2
leidenalg           0.8.10
llvmlite            0.38.1
louvain             0.7.1
matplotlib          3.5.2
matplotlib_inline   NA
metacells           0.8.0
mpl_toolkits        NA
natsort             8.1.0
nbinom_ufunc        NA
numba               0.55.2
numpy               1.22.4
osqp                0.6.2.post8
packaging           20.3
pandas              1.4.2
parso               0.8.3
pexpect             4.6.0
pickleshare         0.7.5
pkg_resources       NA
prompt_toolkit      3.0.29
psutil              5.5.1
ptyprocess          0.6.0
pure_eval           0.2.2
pycparser           2.21
pydev_ipython       NA
pydevconsole        NA
pydevd              2.8.0
pydevd_file_utils   NA
pydevd_plugins      NA
pydevd_tracing      NA
pygments            2.12.0
pynndescent         0.5.7
pyparsing           2.4.6
pytz                2022.1
qdldl               NA
scipy               1.8.1
scs                 3.2.2
seaborn             0.11.2
session_info        1.0.0
setuptools          67.6.1
setuptools_scm      NA
sitecustomize       NA
six                 1.14.0
sklearn             1.1.1
stack_data          0.2.0
statsmodels         0.13.2
swig_runtime_data4  NA
texttable           1.6.4
threadpoolctl       3.1.0
tornado             6.1
tqdm                4.64.0
traitlets           5.2.1.post0
typing_extensions   NA
umap                0.5.3
wcwidth             0.2.5
yaml                6.0
zipp                NA
zmq                 23.0.0
-----
IPython             8.4.0
jupyter_client      7.3.1
jupyter_core        4.10.0
-----
Python 3.8.10 (default, Mar 13 2023, 10:26:41) [GCC 9.4.0]
Linux-5.15.0-67-generic-x86_64-with-glibc2.29
-----
Session information updated at 2023-03-31 16:16

orenbenkiki commented 1 year ago

I'm assuming you are running v0.8 (from pip install). There's an issue with the progress bar computations in v0.8. The simplest workaround is to run the computation without a progress bar.
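
One way to apply this workaround without editing the notebook cell each time is to make the progress bar optional. The sketch below is only an illustration: `optional_progress_bar` and `RecordingBar` are hypothetical names, not part of metacells, and the helper simply swaps in Python's standard `contextlib.nullcontext` when the bar is disabled.

```python
from contextlib import AbstractContextManager, nullcontext
from typing import Callable


def optional_progress_bar(
    enabled: bool, make_progress_bar: Callable[[], AbstractContextManager]
) -> AbstractContextManager:
    """Return the real progress bar context when enabled, else a do-nothing one."""
    return make_progress_bar() if enabled else nullcontext()


class RecordingBar:
    """Stand-in for a real progress bar, used only to demonstrate the helper."""

    def __init__(self) -> None:
        self.entered = False

    def __enter__(self) -> "RecordingBar":
        self.entered = True
        return self

    def __exit__(self, *exc_info) -> bool:
        return False  # never swallow exceptions


# Enabled: the factory is called and its context is entered.
with optional_progress_bar(True, RecordingBar) as enabled_bar:
    pass

# Disabled: nullcontext is used, so the factory is never called.
with optional_progress_bar(False, RecordingBar) as disabled_bar:
    pass

# Intended use with metacells v0.8 (not run here; needs the library and data):
#
#     with optional_progress_bar(False, mc.ut.progress_bar):
#         mc.pl.divide_and_conquer_pipeline(clean, random_seed=123)
```

Passing the factory (`mc.ut.progress_bar`) rather than calling it up front means that no progress bar object is created at all when the bar is disabled.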

We are working on v0.9; the head version is our pre-release. However, as described in the README of the project, v0.9 has breaking changes compared to v0.8, and while the head version seems to be working, it is still in testing. YMMV and all that.

zvittorio commented 1 year ago

Yes, I am running v0.8 from pip install, and just removing the progress bar from the code solved the problem. Thank you for the clarification, and looking forward to a stable v0.9!

Vittorio