Open XYZuo opened 3 years ago
Hi,
Thanks for the feedback. Unfortunately I was not able to reproduce the error when playing around with example data. I am happy to take a closer look if you can share with me the file 'labels.tsv'.
But in your case, you can actually skip the step st.add_cell_labels(). That step is equivalent to:
adata.obs['label'] = adata.obs['cell_types'].copy()
Let me know if this works for you.
Thank you for your help! I skipped the step st.add_cell_labels() and it works. But the image I got was not consistent with the cell types I annotated. I want the HSC group to be in the starting position. Could I set the root node myself? I guess 'init_nodes_pos' in st.seed_elastic_principal_graph could do this, but I don't know how to set it.
Yes, you can. Pseudotime relative to every node is computed once the tree structure is learnt, and it is stored in adata.obs.
So you can simply replace 'S4' with the root node you desire. E.g., in your case, you can replace S4_pseudotime with S5_pseudotime to use the HSC cells as the root. (I'm not 100% sure about the colors here, but the HSCs all seem to gather around the S5 node.)
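To see which root nodes are available, it may help to list the pseudotime columns. The snippet below is a minimal sketch that works on a plain list of column names; the helper name pseudotime_roots and the example columns are made up for illustration, and with a real object you would pass list(adata.obs.columns) instead.

```python
def pseudotime_roots(obs_columns, suffix="_pseudotime"):
    """Return the node names that have a '<node>_pseudotime' column.

    Hypothetical helper: it only inspects column names and does not call STREAM.
    """
    return [name[: -len(suffix)] for name in obs_columns if name.endswith(suffix)]


# Example column names (assumed, not taken from the real dataset).
columns = ["label", "branch_id_alias", "S0_pseudotime", "S4_pseudotime", "S5_pseudotime"]

roots = pseudotime_roots(columns)
print(roots)
```

Any of the listed node names could then be tried as the root argument of the plotting functions, for example root='S5'.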
Thanks for your advice! It worked. But I find that I can't add my annotation colors unless I follow the STREAM tutorial and add the colors with 'st.add_cell_colors'. My color annotations are stored in adata.obs.label_color, which match the cluster labels in adata.obs.label. How can I use my annotation colors when plotting the stream?
For now I guess it has to be done in a hacky way.
You can add your own colors with:
adata.uns['label_color'] = pd.Series(data=adata.obs['label_color'].tolist(),index=adata.obs['label'].tolist()).to_dict()
But this is something that will certainly be addressed in our stream v2.
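The one-liner above builds a dictionary that maps each label to a color. Because the labels repeat once per cell, a label that carries two different colors would silently keep only the last one. The sketch below shows an equivalent mapping in plain Python with an explicit consistency check; the function name build_label_colors and the sample values are assumptions for illustration, not part of STREAM.

```python
from collections import defaultdict


def build_label_colors(labels, colors):
    """Map each label to its single color; raise if a label has several colors.

    Hypothetical helper: 'labels' and 'colors' are per-cell sequences of equal length.
    """
    seen = defaultdict(set)
    for label, color in zip(labels, colors):
        seen[label].add(color)
    conflicts = {label: sorted(cs) for label, cs in seen.items() if len(cs) > 1}
    if conflicts:
        raise ValueError(f"labels with more than one color: {conflicts}")
    return {label: next(iter(cs)) for label, cs in seen.items()}


# Made-up per-cell annotations standing in for adata.obs['label'] and adata.obs['label_color'].
labels = ["HSC", "MEP", "HSC", "GMP"]
colors = ["#1f77b4", "#ff7f0e", "#1f77b4", "#2ca02c"]

label_color = build_label_colors(labels, colors)
print(label_color)
```

With a real object, the inputs would be adata.obs['label'].tolist() and adata.obs['label_color'].tolist(), and the result could be assigned to adata.uns['label_color'] as in the one-liner.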
Thank you so much. Unfortunately it gave an error message after I ran st.plot_stream_sc:
I also hit an error similar to issue 115 (https://github.com/pinellolab/STREAM/issues/115) after running st.plot_stream(adata,root='S5',color=['label'],save_fig=True, fig_format='pdf'):
Traceback (most recent call last):
File "
  File "/home/cchu5/miniconda3/envs/zxy-stream/lib/python3.7/site-packages/pandas/core/indexing.py", line 1769, in _setitem_with_indexer_split_path
    key = tuple([key] + [slice(None)] * (len(labels.levels) - 1))
  File "/home/cchu5/miniconda3/envs/zxy-stream/lib/python3.7/site-packages/pandas/core/indexing.py", line 1830, in _setitem_with_indexer_2d_value
    ):
ValueError: Must have equal len keys and value when setting with an ndarray
I tried downgrading to pandas==1.0 and other versions, but it didn't help.
Sorry to encroach upon your time. I am also looking forward to the release of stream v2.
I am sorry that you have to go through these tricky steps to use stream.
Unfortunately I am not sure how to address this issue as I have not run into it or been able to reproduce it myself.
If you can share with me a script and a dummy dataset to reproduce the error, I am more than happy to take a closer look.
Hi, thank you for your patience! Strangely, after I added label_color to the Seurat object and then transferred it to the loom file, the error about colors disappeared. But the second error, after running st.plot_stream, still exists. I put a test loom file here: https://github.com/ZxyChopcat/STREAMtest/blob/master/STREAMtest.zip And this is my script:
import stream as st
st.version
import pandas as pd
import numpy as np
import anndata as ad
import matplotlib
matplotlib.use('pdf')
import matplotlib.pyplot as plt
adata = ad.read_loom("/zxy/STREAM/itp.data1.2.STREAM.loom", sparse=True, cleanup=False, X_name='spliced', obs_names='CellID', var_names='Gene', dtype='float32')
st.set_workdir(adata,'/data/tmp_data/zxy/STREAM')
adata.var_names_make_unique()
adata.obsm['top_pcs'] = adata.obsm['pca_cell_embeddings']
adata.obsm['X_dr'] = adata.obsm['umap_cell_embeddings']
adata.obsm['X_vis_umap'] = adata.obsm['umap_cell_embeddings'][:,:2]
adata.uns['label_color'] = pd.Series(data=adata.obs['label_color'].tolist(),index=adata.obs['label'].tolist()).to_dict()
st.plot_visualization_2D(adata,method='umap',n_neighbors=50,color=['label'],use_precomputed=True,save_fig=True, fig_name='visualization_2D.pdf')
st.seed_elastic_principal_graph(adata,n_clusters=10,use_vis=True)
st.elastic_principal_graph(adata,epg_alpha=0.01,epg_mu=0.05,epg_lambda=0.05,save_fig=True, fig_name='ElPiGraph_analysis.pdf')
st.plot_dimension_reduction(adata,color=['label'],n_components=2,show_graph=True,show_text=False,save_fig=True, fig_name='dimension_reduction.pdf')
st.plot_branches(adata,show_text=True,save_fig=True, fig_name='branches.pdf')
st.plot_flat_tree(adata,color=['label','branch_id_alias','S5_pseudotime'],dist_scale=0.5,show_graph=True,show_text=True,save_fig=True,fig_name='flat_tree.pdf')
st.plot_stream_sc(adata,root='S5',color=['label','GATA1'],dist_scale=0.5,show_graph=True,show_text=False,save_fig=True, fig_format='pdf',fig_size=(14,9))
st.plot_stream(adata,root='S5',color=['label','GATA1'],save_fig=True, fig_format='pdf')
hmmm, that is very strange.
I just tested your script and I was able to run it without any errors.
I am attaching the notebook I was using here. test_stream.html.zip
So it is likely that there is an error in my environment. I created the conda environment by 'create -n stream python=3.7 stream=1.0 jupyter'. And here is my pip list:
Package Version
anndata 0.7.3
argcomplete 1.12.3
argon2-cffi 20.1.0
async-generator 1.10
attrs 21.2.0
backcall 0.2.0
bleach 4.0.0
Bottleneck 1.3.2
cached-property 1.5.2
certifi 2021.5.30
cffi 1.14.6
click 8.0.2
cycler 0.10.0
debugpy 1.4.1
decorator 5.1.0
defusedxml 0.7.1
entrypoints 0.3
fonttools 4.25.0
gunicorn 20.1.0
h5py 3.2.1
importlib-metadata 4.8.1
ipykernel 6.2.0
ipython 7.27.0
ipython-genutils 0.2.0
ipywidgets 7.6.4
jedi 0.18.0
Jinja2 3.0.1
joblib 1.0.1
jsonschema 3.2.0
jupyter 1.0.0
jupyter-client 7.0.1
jupyter-console 6.4.0
jupyter-core 4.7.1
jupyterlab-pygments 0.1.2
jupyterlab-widgets 1.0.0
kiwisolver 1.3.1
llvmlite 0.36.0
loompy 3.0.6
MarkupSafe 2.0.1
matplotlib 3.2.2
matplotlib-inline 0.1.2
mistune 0.8.4
mkl-fft 1.3.0
mkl-random 1.2.2
mkl-service 2.4.0
munkres 1.1.4
natsort 7.1.1
nbclient 0.5.3
nbconvert 6.1.0
nbformat 5.1.3
nest-asyncio 1.5.1
networkx 2.1
notebook 6.4.3
numba 0.53.1
numexpr 2.7.3
numpy 1.17.5
numpy-groupies 0.9.14
olefile 0.46
packaging 21.0
pandas 1.0.5
pandocfilters 1.4.3
parso 0.8.2
patsy 0.5.1
pexpect 4.8.0
pickleshare 0.7.5
Pillow 8.3.1
pip 21.2.2
plotly 5.1.0
prometheus-client 0.11.0
prompt-toolkit 3.0.20
ptyprocess 0.7.0
pycparser 2.20
Pygments 2.10.0
pynndescent 0.5.4
pyparsing 2.4.7
pyrsistent 0.17.3
python-dateutil 2.8.2
python-slugify 5.0.2
pytz 2021.1
pyzmq 22.2.1
qtconsole 5.1.1
QtPy 1.10.0
rpy2 2.9.4
scikit-learn 0.24.2
scipy 1.7.1
seaborn 0.11.2
Send2Trash 1.8.0
setuptools 58.0.4
Shapely 1.7.1
simplegeneric 0.8.1
six 1.15.0
statsmodels 0.12.2
stream 1.0
tenacity 8.0.1
terminado 0.9.4
testpath 0.5.0
text-unidecode 1.3
threadpoolctl 2.2.0
tornado 6.1
traitlets 5.1.0
typing-extensions 3.10.0.2
tzlocal 2.1
umap-learn 0.5.1
Unidecode 1.2.0
wcwidth 0.2.5
webencodings 0.5.1
wheel 0.37.0
widgetsnbextension 3.5.1
zipp 3.5.0
Can you find anything wrong?
Hi, I'm using a loom file saved from a Seurat object. My adata looks like this:
AnnData object with n_obs × n_vars = 56109 × 22965
    obs: 'ClusterID', 'ClusterName', 'DF_classification', 'RNA_snn_res_1_5', 'cell_types', 'gender', 'group', 'nCount_RNA', 'nFeature_RNA', 'orig_ident', 'percent_hsp', 'percent_mt', 'percent_rb', 'seurat_clusters', 'label_color'
    var: 'Selected', 'vst_mean', 'vst_variable', 'vst_variance', 'vst_variance_expected', 'vst_variance_standardized'
    uns: 'label_color', 'workdir'
    obsm: 'harmony_cell_embeddings', 'pca_cell_embeddings', 'umap_cell_embeddings'
    varm: 'harmony_feature_loadings_projected', 'pca_feature_loadings'
    layers: 'norm_data', 'scale_data'
I extracted my cell labels like this: adata.obs['cell_types'].to_csv('labels.tsv',sep='\t',header=0)
But when I try to add them to my object like this: st.add_cell_labels(adata, file_name = 'labels.tsv')
it raises an error: ValueError: Length mismatch: Expected axis has 56110 elements, new values have 56109 elements
I checked my adata, and it seems to have 56109 cells with no problem:
Could you please help me? I can't figure it out.
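One way to narrow down a length mismatch like the one above is to count what is actually in the label file and compare it with adata.n_obs. The snippet below is only a diagnostic sketch using the standard library; it does not identify the cause of this particular error. The helper name summarize_label_file and the three-row demo file are assumptions for illustration.

```python
import csv
import os
import tempfile


def summarize_label_file(path, sep="\t"):
    """Return (physical line count, parsed record count, set of fields-per-record).

    Hypothetical helper: a difference between the first two numbers, or more than
    one value in the set, hints at blank lines, embedded newlines or stray separators.
    """
    with open(path, newline="") as handle:
        n_lines = sum(1 for _ in handle)
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter=sep))
    widths = {len(row) for row in rows}
    return n_lines, len(rows), widths


# Tiny made-up file standing in for 'labels.tsv' (cell ID, then label).
tmp = tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False, newline="")
tmp.write("cell_1\tHSC\ncell_2\tMEP\ncell_3\tHSC\n")
tmp.close()

n_lines, n_records, widths = summarize_label_file(tmp.name)
os.unlink(tmp.name)
print(n_lines, n_records, widths)
```

On the real file, the record count could be compared against adata.n_obs to see which side the extra element comes from.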