Zafar-Lab / Margaret

Metric learning-based graph-partitioned trajectory inference from single-cell data
MIT License

AttributeError: module 'virtualenv.create.via_global_ref.builtin.cpython.mac_os' has no attribute 'CPython2macOsFramework' #36

Open Rohit-Satyam opened 1 year ago

Rohit-Satyam commented 1 year ago

While creating the environment, I get the following error. Can you please help?

pipenv install
Creating a virtualenv for this project...
Pipfile: /home/subudhak/Documents/zena_scrnaseq_singleR/extra_analysis_June2023/Margaret/Pipfile
Using /usr/bin/python3.8 (3.8.10) to create virtualenv...
⠏ Creating virtual environment...AttributeError: module 'virtualenv.create.via_global_ref.builtin.cpython.mac_os' has no attribute 'CPython2macOsFramework'

✘ Failed creating virtual environment
[pipenv.exceptions.VirtualenvCreationException]: 
Failed to create virtual environment.
Rohit-Satyam commented 1 year ago

Hi, I tried installing Margaret's dependencies individually, but there seem to be conflicts between the packages used by Margaret and the additional packages used in the tutorial. Truth be told, it has become a nightmare: it has been over 4 days and I still cannot figure out what should be done. Also, your Dockerfile is empty; please either populate it or remove it in the next update, that would be helpful.

Rohit-Satyam commented 1 year ago

Currently managing the dependencies like this:

mamba create -n margaret -c conda-forge -c bioconda -c hcc -c grst c-blosc2 click cython dca gprofiler-official hnswlib jupyter notebook leidenalg louvain matplotlib  mygene msgpack-python numpy numexpr pandas phate phenograph pybind11 pygam python-igraph pytorch scanpy scikit-learn scipy sh termcolor tqdm umap-learn -y

mamba activate margaret 
pip install palantir  # Note: this downgrades pandas from 2.0.3 to 1.5.8
pip install magic-impute  # Install this one with pip; otherwise the high-priority conda-forge channel pulls in the wrong magic package

Edit 1: I think the import from utils.util import run_pca here should be changed to from margaret.utils.util import run_pca when the notebook is run from the git-cloned Margaret directory, because it is not obvious to first-time users that run_pca comes from Margaret's utils package.

Edit 2: If you use the method above with the latest Python 3.11, make the following change in the util.py script: on line 191, change np.int to np.int_ (the np.int alias was removed in NumPy 1.24). Also watch out here
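For anyone hitting the same np.int error, here is a minimal sketch of the replacement. The array below is made up for illustration and is not the actual code on line 191 of util.py; either the builtin int or np.int_ should work as a drop-in for the removed alias.

```python
import numpy as np

# Hypothetical float labels standing in for whatever util.py casts on line 191.
labels = np.array([0.0, 1.0, 2.0])

# np.int was only an alias for the builtin int, so both of these
# replacements select NumPy's default integer dtype.
as_builtin = labels.astype(int)
as_numpy = labels.astype(np.int_)

print(as_builtin.dtype, as_numpy.dtype)
```

On NumPy 1.24 or newer, any remaining labels.astype(np.int) call raises an AttributeError, so it is worth searching the cloned repository for other uses of np.int.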

Edit 3: Is X_met_embedding the same as metric_embedding? While following your tutorial, when I try to plot the connectivity graph, I get the following error:

KeyError                                  Traceback (most recent call last)
Cell In[21], line 1
----> 1 plot_connectivity_graph(data.obsm['X_met_embedding'], communities, un_connectivity, mode='undirected', offset=0.2, cmap='Blues', node_size=750)

File ~/miniconda3/envs/margaret/lib/python3.11/site-packages/anndata/_core/aligned_mapping.py:178, in AlignedActualMixin.__getitem__(self, key)
    177 def __getitem__(self, key: str) -> V:
--> 178     return self._data[key]

KeyError: 'X_met_embedding'
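A small sketch of how to check which embedding keys actually exist before plotting. The helper first_available_key is hypothetical (not part of Margaret or anndata), and the dict below is only a stand-in for data.obsm, which supports the same membership test. Whether metric_embedding really holds the same data as X_met_embedding is an assumption that the maintainers would need to confirm.

```python
from typing import Iterable, Mapping, Optional


def first_available_key(mapping: Mapping, candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate key present in `mapping`, or None if none is."""
    for key in candidates:
        if key in mapping:
            return key
    return None


# Stand-in for data.obsm; with a real AnnData object, pass data.obsm instead.
obsm = {"X_pca": [[0.0, 1.0]], "metric_embedding": [[0.5, 0.2]]}

print(sorted(obsm.keys()))  # list what is actually stored
key = first_available_key(obsm, ["X_met_embedding", "metric_embedding"])
print(key)
```

With the real object, the key found this way could be passed as data.obsm[key] to plot_connectivity_graph in place of the hard-coded 'X_met_embedding'.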

Edit 4: What is the purpose of this PCA calculation code chunk?

import magic

# Apply MAGIC for PCA data denoising
magic_op = magic.MAGIC(random_state=random_seed, solver='approximate', n_pca=n_comps)
X_magic = magic_op.fit_transform(data.X, genes='pca_only')
data.obsm['X_magic_pca'] = X_magic

When using magic_key='X_magic_pca' in the plot_connectivity_graph_with_gene_expressions function (see Trajectory Visualization), I get an error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[43], line 4
      1 # from utils.plot import plot_connectivity_graph_with_gene_expressions
----> 4 plot_connectivity_graph_with_gene_expressions(
      5     data,
      6     un_connectivity,
      7     'THBD',
      8     magic_key='X_magic_pca',
      9     font_color='white',
     10     cmap='viridis',
     11     offset=0.2,
     12     node_size=750,
     13     font_size=12,
     14     comm_key='metric_clusters',
     15     save_path='connectivity_CSF1R.png',
     16     save_kwargs={
     17         'dpi': 300,
     18         'bbox_inches': 'tight',
     19         'transparent': True
     20     }
     21 )

Cell In[36], line 34, in plot_connectivity_graph_with_gene_expressions(ad, cluster_connectivities, gene, embedding_key, comm_key, magic_key, mode, cmap, figsize, node_size, font_color, title, save_path, save_kwargs, offset, **kwargs)
     31     raise Exception(f"Key {comm_key} not found in {ad}")
     33 try:
---> 34     X_imputed = pd.DataFrame(
     35         ad.obsm[magic_key], index=ad.obs_names, columns=ad.var_names
     36     )
     37 except KeyError:
     38     print("MAGIC imputed data not found. Using raw counts instead")

File ~/miniconda3/envs/margaret/lib/python3.11/site-packages/pandas/core/frame.py:722, in DataFrame.__init__(self, data, index, columns, dtype, copy)
    712         mgr = dict_to_mgr(
    713             # error: Item "ndarray" of "Union[ndarray, Series, Index]" has no
    714             # attribute "name"
   (...)
    719             typ=manager,
    720         )
    721     else:
--> 722         mgr = ndarray_to_mgr(
    723             data,
    724             index,
    725             columns,
    726             dtype=dtype,
    727             copy=copy,
    728             typ=manager,
    729         )
    731 # For data is list-like, or Iterable (will consume into list)
    732 elif is_list_like(data):

File ~/miniconda3/envs/margaret/lib/python3.11/site-packages/pandas/core/internals/construction.py:349, in ndarray_to_mgr(values, index, columns, dtype, copy, typ)
    344 # _prep_ndarraylike ensures that values.ndim == 2 at this point
    345 index, columns = _get_axes(
    346     values.shape[0], values.shape[1], index=index, columns=columns
    347 )
--> 349 _check_values_indices_shape_match(values, index, columns)
    351 if typ == "array":
    353     if issubclass(values.dtype.type, str):

File ~/miniconda3/envs/margaret/lib/python3.11/site-packages/pandas/core/internals/construction.py:420, in _check_values_indices_shape_match(values, index, columns)
    418 passed = values.shape
    419 implied = (len(index), len(columns))
--> 420 raise ValueError(f"Shape of passed values is {passed}, indices imply {implied}")

ValueError: Shape of passed values is (5780, 300), indices imply (5780, 14651)

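The plotting function builds a DataFrame with one column per gene (14651), but genes='pca_only' returns one column per principal component (300), hence the shape mismatch. I ended up using genes='all_genes' rather than genes='pca_only':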

import magic

magic_op = magic.MAGIC(random_state=0, solver='approximate')
X_magic = magic_op.fit_transform(data.X, genes='all_genes')
data.obsm['X_magic'] = X_magic
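To catch this kind of mismatch before calling the plotting function, a shape check along these lines may help. It is only a sketch: check_imputed_shape is a hypothetical helper, and the small zero arrays stand in for the real MAGIC outputs (5780 x 14651 for all_genes versus 5780 x 300 for pca_only).

```python
import numpy as np


def check_imputed_shape(X, n_obs, n_vars):
    """Raise ValueError unless X has one row per cell and one column per gene."""
    shape = tuple(X.shape)
    if shape != (n_obs, n_vars):
        raise ValueError(
            f"Imputed matrix has shape {shape}, expected {(n_obs, n_vars)}; "
            "was MAGIC run with genes='pca_only'?"
        )
    return X


# Toy dimensions standing in for data.n_obs, data.n_vars and n_pca.
n_obs, n_vars, n_pca = 6, 10, 3
X_all_genes = np.zeros((n_obs, n_vars))  # shape produced by genes='all_genes'
X_pca_only = np.zeros((n_obs, n_pca))    # shape produced by genes='pca_only'

check_imputed_shape(X_all_genes, n_obs, n_vars)  # passes silently

try:
    check_imputed_shape(X_pca_only, n_obs, n_vars)
    mismatch_detected = False
except ValueError as err:
    mismatch_detected = True
    print(err)
```

With the real data, the call would be check_imputed_shape(data.obsm['X_magic'], data.n_obs, data.n_vars) right after the MAGIC step.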