Open HelloWorldLTY opened 11 months ago
Hi, I found a bug in the tutorial for running imputation on the MERFISH dataset:
Moreover, I wonder why we need to set different arrays for running this step:
Thanks a lot.
Hi, thanks for reporting this bug. We recently made some updates to the ConcatCells function, and the tutorial notebook has now been updated accordingly. Please use git pull to update your local repository.
The reason we set different arrays for this step is the GPU memory limitation: we recommend handling 1,000 spots at a time, e.g., "0,1000" means the 0th to 1,000th spot.
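For illustration, the batching can be driven by a small shell loop like the one shown later in this thread; a minimal sketch, where the flag used to pass the range (--cell_range) is only a placeholder and the real argument name is given in the tutorial:
# process ~1000 spots per call to stay within GPU memory; the ranges cover all 5551 spots of the example data
arr=("0,1000" "1000,2000" "2000,3000" "3000,4000" "4000,5000" "5000,5551")
for i in "${arr[@]}"; do
    python ./src/Decomposition.py --cell_range "$i"   # placeholder flag; add the tutorial's other arguments here
done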
Thanks, Jiashun
Hi, thanks for your answers. I have a further question about the annotation step prior to this instruction:
2023-12-22 00:00:33,125 INFO worker.py:1518 -- Started a local Ray instance.
Traceback (most recent call last):
File "/gpfs/gibbs/pi/zhao/tl688/SpatialScope/./src/Cell_Type_Identification.py", line 305, in <module>
CTI.CellTypeIdentification(nu = args.nu, n_neighbo = args.n_neighbo, hs_ST = args.hs_ST, VisiumCellsPlot = args.VisiumCellsPlot, UMI_min_sigma = args.UMI_min_sigma)
File "/gpfs/gibbs/pi/zhao/tl688/SpatialScope/./src/Cell_Type_Identification.py", line 167, in CellTypeIdentification
self.WarmStart(hs_ST=hs_ST, UMI_min_sigma = UMI_min_sigma)
File "/gpfs/gibbs/pi/zhao/tl688/SpatialScope/./src/Cell_Type_Identification.py", line 93, in WarmStart
self.LoadLikelihoodTable()
File "/gpfs/gibbs/pi/zhao/tl688/SpatialScope/./src/Cell_Type_Identification.py", line 141, in LoadLikelihoodTable
Q1[str(i + 10)] = np.reshape(np.array(lines[i].split(' ')).astype(np.float), (2536, 103)).T
File "/gpfs/gibbs/project/zhao/tl688/conda_envs/SpatialScope/lib/python3.9/site-packages/numpy/__init__.py", line 324, in __getattr__
raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'float'.
`np.float` was a deprecated alias for the builtin `float`. To avoid this error in existing code, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
It seems that the packages used in the GitHub repo do not match the versions used in your experiments.
Moreover, I am curious about the device requirements for running this model for imputation. I only have one GPU with 40 GB of memory, so is it possible for me to run your model for imputation? It seems that I would need multiple GPUs for imputation. Thanks.
Hi, thanks for reporting this bug. We had previously ignored the DeprecationWarning for np.float; we have now updated np.float to np.float64 in the code.
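If pulling the updated code is not convenient, the same one-line change can be applied locally; a sketch using the file path from the traceback above:
# replace the deprecated np.float alias with np.float64 in the affected script (in-place edit with GNU sed)
sed -i 's/astype(np\.float)/astype(np.float64)/g' src/Cell_Type_Identification.py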
The minimum GPU requirement for SpatialScope is 2080 Ti (12GB). It's okay to use only one GPU; multiple GPUs were intended to speed up the imputation process. However, limited by GPU memory, we recommend imputing 1000 cells at a time when 40 GB of memory is available.
Thanks, Jiashun
Hi, thanks for your answer. I have run into another problem:
2023-12-24 16:17:46,625 : INFO : fitBulk: decomposing bulk
2023-12-24 16:17:47,345 : INFO : chooseSigma: using initial Q_mat with sigma = 1.0
Traceback (most recent call last):
File "/gpfs/gibbs/pi/zhao/tl688/SpatialScope/./src/Cell_Type_Identification.py", line 305, in <module>
CTI.CellTypeIdentification(nu = args.nu, n_neighbo = args.n_neighbo, hs_ST = args.hs_ST, VisiumCellsPlot = args.VisiumCellsPlot, UMI_min_sigma = args.UMI_min_sigma)
File "/gpfs/gibbs/pi/zhao/tl688/SpatialScope/./src/Cell_Type_Identification.py", line 167, in CellTypeIdentification
self.WarmStart(hs_ST=hs_ST, UMI_min_sigma = UMI_min_sigma)
File "/gpfs/gibbs/pi/zhao/tl688/SpatialScope/./src/Cell_Type_Identification.py", line 121, in WarmStart
myRCTD = run_RCTD(myRCTD, self.Q_mat_all, self.X_vals_loc, loggings = self.loggings)
File "/gpfs/gibbs/pi/zhao/tl688/SpatialScope/src/utils_pyRCTD.py", line 34, in run_RCTD
RCTD = choose_sigma_c(RCTD, Q_mat_all, X_vals_loc, loggings = loggings)
File "/gpfs/gibbs/pi/zhao/tl688/SpatialScope/src/utils_pyRCTD.py", line 430, in choose_sigma_c
results = decompose_batch(np.array(puck['nUMI'].loc[fit_ind]).squeeze(), RCTD['cell_type_info']['renorm']['cell_type_means'], beads, RCTD['internal_vars']['gene_list_reg'], constrain = False, max_cores = RCTD['config']['max_cores'], loggings = loggings,likelihood_vars = likelihood_vars)
File "/gpfs/gibbs/pi/zhao/tl688/SpatialScope/src/utils_pyRCTD.py", line 234, in decompose_batch
weights = ray.get([decompose_full_ray.remote(arg) for arg in inp_args])
File "/gpfs/gibbs/pi/zhao/tl688/SpatialScope/src/utils_pyRCTD.py", line 234, in <listcomp>
weights = ray.get([decompose_full_ray.remote(arg) for arg in inp_args])
File "/gpfs/gibbs/project/zhao/tl688/conda_envs/SpatialScope/lib/python3.9/site-packages/ray/remote_function.py", line 121, in _remote_proxy
return self._remote(args=args, kwargs=kwargs, **self._default_options)
File "/gpfs/gibbs/project/zhao/tl688/conda_envs/SpatialScope/lib/python3.9/site-packages/ray/util/tracing/tracing_helper.py", line 307, in _invocation_remote_span
return method(self, args, kwargs, *_args, **_kwargs)
File "/gpfs/gibbs/project/zhao/tl688/conda_envs/SpatialScope/lib/python3.9/site-packages/ray/remote_function.py", line 393, in _remote
return invocation(args, kwargs)
File "/gpfs/gibbs/project/zhao/tl688/conda_envs/SpatialScope/lib/python3.9/site-packages/ray/remote_function.py", line 369, in invocation
object_refs = worker.core_worker.submit_task(
File "python/ray/_raylet.pyx", line 1536, in ray._raylet.CoreWorker.submit_task
File "python/ray/_raylet.pyx", line 1540, in ray._raylet.CoreWorker.submit_task
File "python/ray/_raylet.pyx", line 385, in ray._raylet.prepare_args_and_increment_put_refs
File "python/ray/_raylet.pyx", line 376, in ray._raylet.prepare_args_and_increment_put_refs
File "python/ray/_raylet.pyx", line 418, in ray._raylet.prepare_args_internal
File "/gpfs/gibbs/project/zhao/tl688/conda_envs/SpatialScope/lib/python3.9/site-packages/ray/_private/worker.py", line 536, in get_serialization_context
context_map[job_id] = serialization.SerializationContext(self)
File "/gpfs/gibbs/project/zhao/tl688/conda_envs/SpatialScope/lib/python3.9/site-packages/ray/_private/serialization.py", line 124, in __init__
serialization_addons.apply(self)
File "/gpfs/gibbs/project/zhao/tl688/conda_envs/SpatialScope/lib/python3.9/site-packages/ray/util/serialization_addons.py", line 56, in apply
register_pydantic_serializer(serialization_context)
File "/gpfs/gibbs/project/zhao/tl688/conda_envs/SpatialScope/lib/python3.9/site-packages/ray/util/serialization_addons.py", line 19, in register_pydantic_serializer
pydantic.fields.ModelField,
AttributeError: module 'pydantic.fields' has no attribute 'ModelField'
It seems that the pydantic version installed during setup does not match the one in your experimental environment. I wonder if I can find out which pydantic version you used.
Moreover, if I already have cell types for both the scRNA-seq and spatial data, can I skip this step (cell-type identification) and go straight to imputation? Thanks.
Hi,
As many reported issues are related to the environment, we have provided a docker image (docker pull xiaojs95/spatialscope) to avoid installation problems. See the project homepage for more details if needed.
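For example, a typical way to use the image would be (a sketch, assuming the NVIDIA container toolkit is installed and the image ships a bash shell; the mount path is illustrative):
docker pull xiaojs95/spatialscope
# start an interactive container with GPU access and the current directory mounted
docker run --gpus all -it -v "$PWD":/workspace xiaojs95/spatialscope /bin/bash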
If cell types for both scRNA-seq and spatial data are available, cell-type identification can be skipped, as long as the cell types are matched between the scRNA-seq and spatial data.
Thanks, Jiashun
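Regarding the pydantic version asked about above: pydantic.fields.ModelField exists only in pydantic 1.x and was removed in pydantic 2.0, so the Ray release in this environment needs pydantic pinned below 2. A possible fix (the exact version used upstream is not stated in this thread):
# downgrade to a pydantic 1.x release that still provides pydantic.fields.ModelField
pip install "pydantic<2"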
Thanks a lot. I will try it and get back to you. It seems that you have provided the pydantic version there.
Moreover, it seems that the Dockerfile contains the same installation steps as the environment.yml file provided in your repo, so I think it makes no difference that I initially chose to install from environment.yml:
RUN apt-get update && apt-get install -y git rsync
# Clone the repository from GitHub
RUN git clone https://github.com/YangLabHKUST/SpatialScope.git
RUN cd SpatialScope
WORKDIR /home/SpatialScope
# Create and activate the Conda environment
RUN conda env create -f environment.yml # This is the exact step I run.
Thanks, after updating pydantic, I resolved my problems with running annotation.
However, there seems to be another problem at the imputation step:
+ arr=("0,1000" "1000,2000" "2000,3000" "3000,4000" "4000,5000" "5000,5551")
+ declare -a arr
+ for i in "${arr[@]}"
+ python ./src/Decomposition.py
Traceback (most recent call last):
File "/gpfs/gibbs/pi/zhao/tl688/SpatialScope/./src/Decomposition.py", line 444, in <module>
DECOM = GeneExpDecomposition(config)
File "/gpfs/gibbs/pi/zhao/tl688/SpatialScope/./src/Decomposition.py", line 29, in __init__
self.out_dir = os.path.join(self.config.data.out_dir, self.config.data.tissue)
File "/gpfs/gibbs/project/zhao/tl688/conda_envs/SpatialScope/lib/python3.9/posixpath.py", line 76, in join
a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType
Here is the error. I copied the commands directly from the tutorial, and the output path shown above does exist. Could you please help me? Thanks.
This is due to a markdown display problem: the multi-line command lost its line continuations. Add a '\' at the end of each line except the last so that the later arguments are actually passed to the script; see the example below.
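Concretely, the command has to stay on one logical line, with every wrapped line except the last ending in a backslash; a sketch using the flags that appear later in this thread:
# each wrapped line ends with '\' so the shell passes all arguments to Decomposition.py
python ./src/Decomposition.py \
    --tissue merfish \
    --out_dir ./output \
    --SC_Data ./Ckpts_scRefs/MOp/Ref_snRNA_mop_qc3_2Kgenes.h5ad
# further tutorial arguments would follow in the same way, with a trailing '\' added to the --SC_Data line as well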
Thanks a lot. After fixing this, I ran into a new error:
(SpatialScope)[tl688@r208u22n01.mccleary SpatialScope]$ bash imputation.sh
+ arr=("0,1000" "1000,2000" "2000,3000" "3000,4000" "4000,5000" "5000,5551")
+ declare -a arr
+ for i in "${arr[@]}"
+ python src/Decomposition.py --tissue merfish --out_dir ./output --SC_Data ./Ckpts_scRefs/MOp/Ref_snRNA_mop_qc3_2Kgenes.h5ad
2024-01-25 00:43:53,013 : INFO : load scRNA-seq reference: ./Ckpts_scRefs/MOp/Ref_snRNA_mop_qc3_2Kgenes.h5ad
Traceback (most recent call last):
File "/gpfs/gibbs/project/zhao/tl688/conda_envs/SpatialScope/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3621, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 136, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 163, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'cell_type'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/gpfs/gibbs/pi/zhao/tl688/SpatialScope/src/Decomposition.py", line 449, in <module>
DECOM.decomposition()
File "/gpfs/gibbs/pi/zhao/tl688/SpatialScope/src/Decomposition.py", line 366, in decomposition
self.LoadScData()
File "/gpfs/gibbs/pi/zhao/tl688/SpatialScope/src/Decomposition.py", line 49, in LoadScData
cell_type_array = np.array(self.sc_data_process_marker.obs[self.config.data.cell_class_column])
File "/gpfs/gibbs/project/zhao/tl688/conda_envs/SpatialScope/lib/python3.9/site-packages/pandas/core/frame.py", line 3505, in __getitem__
indexer = self.columns.get_loc(key)
File "/gpfs/gibbs/project/zhao/tl688/conda_envs/SpatialScope/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3623, in get_loc
raise KeyError(key) from err
KeyError: 'cell_type'
I think there is a mismatch between cell_class_column and the key used in your marker reference. I used the most up-to-date code. Thanks.
The arguments after --SC_Data were missing from your command; the remaining flags from the tutorial need to be passed as well.
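For instance, the full call would look roughly like this (a sketch; --cell_class_column and the column value are assumptions based on config.data.cell_class_column in the traceback, so take the exact flags and values from the tutorial):
python src/Decomposition.py --tissue merfish --out_dir ./output \
    --SC_Data ./Ckpts_scRefs/MOp/Ref_snRNA_mop_qc3_2Kgenes.h5ad \
    --cell_class_column cell_type_label   # hypothetical flag and column name; use the tutorial's values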
Thanks a lot, now the training process worked for me.