digitalcytometry / cytospace

CytoSPACE: Optimal mapping of scRNA-seq data to spatial transcriptomics data

run_cytospace Issue after successful install #77

Closed dmiyagi closed 1 year ago

dmiyagi commented 1 year ago

Hi, I am having an issue with "run_cytospace". I didn't get any install errors and followed the instructions. I am currently in the cytospace environment, but am getting this error (log below):

Where should run_cytospace be located? Maybe I can try to find the folder.

Processing /path/tools/cytospace
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: cytospace
  Building wheel for cytospace (setup.py) ... done
  Created wheel for cytospace: filename=cytospace-1.0.6a0-py3-none-any.whl size=24713 sha256=61219de73924882b941e626a7653d9fdab29a2349c72c72d3b5724fc52bb65f0
  Stored in directory: /tmp/pip-ephem-wheel-cache-472_1xb4/wheels/db/13/00/86ea339a996a0120483fe8b0284ca01c270c745e7854dd35bb
Successfully built cytospace
Installing collected packages: cytospace
Successfully installed cytospace-1.0.6a0
(cytospace)[dfm42@r206u22n08.mccleary cytospace]$ ls
build  cytospace  cytospace.egg-info  environment.yml  images  LICENSE  README.md  setup.py  uncertainty_quantification.R
(cytospace)[dfm42@r206u22n08.mccleary cytospace]$ cytospace \
>     --scRNA-path /path/scRNA_discovery.txt  \
>     --cell-type-path /path/scRNA_discovery_annotation_cytospace.txt \
>     --spaceranger-path /path/spatial/YALE-NS-0848
Traceback (most recent call last):
  File "/home/dfm42/.local/bin/cytospace", line 5, in <module>
    from cytospace.cytospace import run_cytospace
ModuleNotFoundError: No module named 'cytospace'

I have the following folders after following other instructions: build cytospace cytospace.egg-info environment.yml images LICENSE README.md setup.py uncertainty_quantification.R

WubingZhang commented 1 year ago

Hi,

Based on the error info, cytospace is installed in /home/dfm42/.local/bin/ instead of the conda environment. To solve this issue, you can try one of the following:

  1. Re-install cytospace using the full path to the environment's pip. The command looks like /path/to/miniconda/envs/cytospace/bin/pip install cytospace.
  2. Or check which pip is in use before installation via which pip or which pip3. If that pip is not from the 'cytospace' conda environment, add the environment's bin directory to $PATH via export PATH=/path/to/envs/cytospace/bin:$PATH, then repeat the installation (pip install cytospace).

Let me know if you run into further problems.
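One way to confirm where the package and its launcher resolve from is a short Python check run with the environment activated. This is a minimal sketch, not part of CytoSPACE: describe_install is a hypothetical helper name, and it assumes the importable module and the console script are both called cytospace, as the traceback above suggests.

```python
import importlib.util
import shutil
import sys


def describe_install(module_name, script_name):
    """Report which interpreter is running and where a module and script resolve.

    Hypothetical diagnostic helper; the names passed in are assumptions.
    """
    # find_spec returns None when a top-level module cannot be imported.
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "env_prefix": sys.prefix,
        "module_location": spec.origin if spec is not None else None,
        # shutil.which returns None when the script is not on PATH.
        "script_location": shutil.which(script_name),
    }
```

Calling describe_install("cytospace", "cytospace") and printing the result shows at a glance whether the script lives under ~/.local/bin while the interpreter lives in the conda environment, which is the mismatch described above.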

Best, Wubing

dmiyagi commented 1 year ago

Hi @WubingZhang thanks for your quick reply!

I tried what you suggested, first with which pip, which yielded:

/gpfs/gibbs/project/path/to/conda_envs/cytospace/bin/pip

as it should. Then I tried the reinstall:

/gpfs/gibbs/project/path/to/conda_envs/cytospace/bin/pip install cytospace

but got this error:

ERROR: Could not find a version that satisfies the requirement cytospace (from versions: none)
ERROR: No matching distribution found for cytospace

I then tried (while in the cytospace folder and cytospace env activated): /gpfs/gibbs/project/path/to/conda_envs/cytospace/bin/pip install --user .

which seemed to install again:

Processing /gpfs/gibbs/project/path/to/tools/cytospace
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: cytospace
  Building wheel for cytospace (setup.py) ... done
  Created wheel for cytospace: filename=cytospace-1.0.6a0-py3-none-any.whl size=24714 sha256=cbb8a9b98ceb9eb8e97aa8ee0790759493f2540e97ee553756ddb860ef85df38
  Stored in directory: /tmp/pip-ephem-wheel-cache-_htluqbu/wheels/2a/5f/c9/4a73f085b10d9a6a4aef454af40d68ecb956168a98b9b18c56
Successfully built cytospace
Installing collected packages: cytospace
Successfully installed cytospace-1.0.6a0

I also previously tried to install it into /gpfs/gibbs/project/path/to/conda_envs/cytospace/bin, because I noticed it was not there alongside the other packages. I was able to get cytospace to appear there, but it still didn't work when I ran "cytospace" with the environment activated. I can try it again though.

If it's useful to know, I'm using an HPC shared with others at my university so maybe that's why it's installing weird?

It should be noted that my cytospace conda env and the folder I downloaded cytospace to are in two different locations.

WubingZhang commented 1 year ago

Hi,

The problem is probably related to the default search path for Python modules. Could you try changing the default Python path and test again?

export PYTHONPATH=/path/to/new/directory:$PYTHONPATH
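To see what that export does, the following minimal sketch starts a child interpreter with a directory prepended to PYTHONPATH and returns the child's module search path. child_sys_path is a hypothetical helper name used only for illustration; the sketch relies only on the standard library.

```python
import json
import os
import subprocess
import sys


def child_sys_path(extra_dir):
    """Return sys.path as seen by a child Python started with extra_dir on PYTHONPATH."""
    env = dict(os.environ)
    existing = env.get("PYTHONPATH")
    # Mirror `export PYTHONPATH=/some/dir:$PYTHONPATH` without a trailing separator.
    env["PYTHONPATH"] = extra_dir if not existing else extra_dir + os.pathsep + existing
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Directories listed in PYTHONPATH are searched before the interpreter's own site-packages, so the variable can make a package importable, but it can also shadow a copy installed inside the environment.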

If this does not help, you might get in touch with the server manager for help.

Best, Wubing

dmiyagi commented 1 year ago

Thank you! That seemed to work and I am able to get past the first step, but now it can't find Rscript, which I know is in my conda_env. Should the cytospace folder that I point PYTHONPATH at be installed in the same location as the cytospace conda_env?

Read and validate data ...
Estimating cell type fractions
Traceback (most recent call last):
  File "/home/pathto/.local/bin/cytospace", line 8, in <module>
    sys.exit(run_cytospace())
  File "/home/pathto/.local/lib/python3.8/site-packages/cytospace/cytospace.py", line 690, in run_cytospace
    main_cytospace(**arguments)
  File "/home/pathto/.local/lib/python3.8/site-packages/cytospace/cytospace.py", line 522, in main_cytospace
    read_data(scRNA_path, cell_type_path,
  File "/home/pathto/.local/lib/python3.8/site-packages/cytospace/cytospace.py", line 35, in read_data
    cell_type_fraction_estimation_path = estimate_cell_type_fractions(scRNA_path, cell_type_path, st_path, output_path, output_prefix)
  File "/home/pathto/.local/lib/python3.8/site-packages/cytospace/common/common.py", line 119, in estimate_cell_type_fractions
    out = subprocess.run(run_args, check=True)
  File "/gpfs/gibbs/project/path/to/conda_envs/cytospace/lib/python3.8/subprocess.py", line 493, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/gpfs/gibbs/project/path/to/conda_envs/cytospace/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/gpfs/gibbs/project/path/to/conda_envs/cytospace/lib/python3.8/subprocess.py", line 1720, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'Rscript'

WubingZhang commented 1 year ago

Please ensure all dependencies are installed in the same location, and add the corresponding path to PATH:

export PATH="/path/to/R/bin:$PATH"
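The traceback above shows subprocess failing because it could not find an executable named Rscript on PATH. A minimal sketch of checking for an executable up front, so the failure message is clearer, is given below; require_executable is a hypothetical helper and not part of CytoSPACE.

```python
import shutil


def require_executable(name):
    """Return the resolved path of an executable, or raise with a readable hint."""
    # shutil.which searches the directories in PATH, as the shell would.
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(
            f"'{name}' was not found on PATH; add its directory to PATH "
            "before starting the program"
        )
    return path
```

Calling require_executable("Rscript") in the same shell that will launch cytospace shows which Rscript, if any, would be picked up.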

Best

dmiyagi commented 1 year ago

Perfect, that did the trick!

I am getting one more error further downstream:

Predicting cell labels
Read and validate data ...
Estimating cell type fractions
Traceback (most recent call last):
  File "/gpfs/gibbs/project/path/to/conda_envs/cytospace/lib/python3.8/site-packages/cytospace/cytospace.py", line 75, in read_data
    scRNA_data = scRNA_data[cell_type_data.index]
  File "/gpfs/gibbs/project/path/to/conda_envs/cytospace/lib/python3.8/site-packages/pandas/core/frame.py", line 3511, in __getitem__
    indexer = self.columns._get_indexer_strict(key, "columns")[1]
  File "/gpfs/gibbs/project/path/to/conda_envs/cytospace/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 5782, in _get_indexer_strict
    self._raise_if_missing(keyarr, indexer, axis_name)
  File "/gpfs/gibbs/project/path/to/conda_envs/cytospace/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 5842, in _raise_if_missing
    raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Index(['CELL_0', 'CELL_5', 'CELL_6', 'CELL_12', 'CELL_14', 'CELL_16',\n 'CELL_17', 'CELL_18', 'CELL_19', 'CELL_20',\n ...\n 'CELL_220022', 'CELL_220025', 'CELL_220026', 'C$

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/gpfs/gibbs/project/path/to/conda_envs/cytospace/bin/cytospace", line 8, in <module>
    sys.exit(run_cytospace())
  File "/gpfs/gibbs/project/path/to/conda_envs/cytospace/lib/python3.8/site-packages/cytospace/cytospace.py", line 690, in run_cytospace
    main_cytospace(**arguments)
  File "/gpfs/gibbs/project/path/to/conda_envs/cytospace/lib/python3.8/site-packages/cytospace/cytospace.py", line 522, in main_cytospace
    read_data(scRNA_path, cell_type_path,
  File "/gpfs/gibbs/project/path/to/conda_envs/cytospace/lib/python3.8/site-packages/cytospace/cytospace.py", line 83, in read_data
    raise IndexError(f"The ST data: {st_path} and coordinates data: {coordinates_path} have to "
IndexError: The ST data: /gpfs/gibbs/project/path/to/spatial/ExpressionMatrix_cytospace.txt and coordinates data: /gpfs/gibbs/project/path/to/spatial/tissue_positions_cytospace.txt have to have the same spot IDs for columns and rows, respectively, and scRNA data: /gpfs/gibbs/project/path/to/scRNA/scRNA_discovery.txt and cell type data: /gpfs/gibbs/project/path/to/scRNA/scRNA_discovery_annotation_cytospace.txt have to have the same cell IDs for columns and rows, respectively.

I think it is originating from the change that occurs here:

'2023-07-16 20:45:22 Load scRNA data Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')'

Can you tell me whether that line means it is changing the scRNA column names so that the barcodes all use "-"? If so, that would explain why this error is occurring. BUT I did try making that change myself, and it now seems to recognize my barcodes:

KeyError: "None of [Index(['CELL_AAACCCAAGAGGGTCT-1-MSC5-BTI', 'CELL_AAACCCACAAGTATCC-1-MSC5-BTI',\n 'CELL_AAACCCACAATGGCAG-1-MSC5-BTI', 'CELL_AAACCCATCGGCACTG-1-MSC5-BTI',\n 'CELL_AAACGAAAGATTTGCC-1-MSC5-BTI', 'CELL_AAACGAAAGGGCCCTT-1-MSC5-BTI',\n 'CELL_AAACGAACACGACTAT-1-MSC5-BTI', 'CELL_AAACGAAGTAGACGTG-1-MSC5-BTI',\n 'CELL_AAACGCTAGAGATCGC-1-MSC5-BTI', 'CELL_AAACGCTAGCCTGTCG-1-MSC5-BTI',\n ...\n 'CELL_TTTGTCAAGAGCTTCT-1-POLR2A-T523',\n 'CELL_TTTGTCAAGGACACCA-1-POLR2A-T523',\n 'CELL_TTTGTCAAGGCTAGCA-1-POLR2A-T523',\n 'CELL_TTTGTCACATGAAGTA-1-POLR2A-T523',\n 'CELL_TTTGTCAGTAACGACG-1-POLR2A-T523',\n 'CELL_TTTGTCAGTAAGAGAG-1-POLR2A-T523',\n 'CELL_TTTGTCAGTCATGCAT-1-POLR2A-T523',\n 'CELL_TTTGTCAGTGGGTATG-1-POLR2A-T523',\n 'CELL_TTTGTCATCGCCAAAT-1-POLR2A-T523',\n 'CELL_TTTGTCATCGTTTATC-1-POLR2A-T523'],\n dtype='object', length=100000)] are in the [columns]"

but it still has an error. Is it supposed to be adding the "CELL" prefix?
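The KeyError above says that none of the cell IDs from the annotation file were found among the scRNA column names. A quick way to narrow this down before running the tool is to compare the two ID lists directly. The sketch below is a minimal illustration, not CytoSPACE's own validation: compare_ids is a hypothetical helper that takes two plain lists of IDs, however they were read from the files.

```python
def compare_ids(expression_ids, annotation_ids):
    """Summarize the overlap between two collections of cell IDs."""
    expr = set(expression_ids)
    annot = set(annotation_ids)
    # Expression IDs with underscores rewritten as dashes, to detect IDs
    # that differ from an annotation ID only by that separator.
    normalized_expr = {cell_id.replace("_", "-") for cell_id in expr}
    separator_only = {
        cell_id
        for cell_id in annot - expr
        if cell_id.replace("_", "-") in normalized_expr
    }
    return {
        "shared": len(expr & annot),
        "only_in_expression": len(expr - annot),
        "only_in_annotation": len(annot - expr),
        "differ_only_by_separator": len(separator_only),
    }
```

A result with zero shared IDs and a large differ_only_by_separator count points at the underscore versus dash difference; zero in both suggests the IDs differ in some other way, such as an added prefix or a misread header row.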

dmiyagi commented 1 year ago

Update! I reinstalled using your advice above regarding paths, and changed my single-cell input by correcting the barcodes from "_" to "-" (they had been changed for use with ecotyper). I also noticed my single-cell input file didn't have an index column header. Just a friendly suggestion: maybe the error message could be made more specific?
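A missing index column header usually shows up as a header line with one field fewer than the data rows. The minimal sketch below checks for that, assuming a tab-delimited text file; header_field_counts is a hypothetical helper, not something CytoSPACE provides.

```python
import csv


def header_field_counts(lines, delimiter="\t"):
    """Return (fields in header, fields in first data row) for delimited text.

    `lines` can be an open text file or any iterable of strings.
    """
    rows = csv.reader(lines, delimiter=delimiter)
    try:
        header = next(rows)
        first_row = next(rows)
    except StopIteration:
        raise ValueError("need a header line and at least one data row") from None
    return len(header), len(first_row)
```

With a real file, pass the handle from open(path, newline="") as lines. If the two counts differ by one, the first column most likely has no header.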

I would love to give a few suggestions for integration with ecotyper, including using the same input files (i.e. allowing the annotations file to contain other metadata as long as the cellID code is "CellType"), which would be a nice addition. Thank you for your help!

WubingZhang commented 1 year ago

Thank you for your suggestions. We will make the revisions and update in the next version. Thanks!