Open altunbulakli opened 4 months ago
Hello @altunbulakli,
Indeed, Sopa should run Baysor, but in your case it seems it didn't. Can you show me the full log?
You should have a text file called TC_070.zarr/.sopa_cache/patches_file_baysor. Can you show me what it looks like? It should contain a list of patch indices. Maybe it is empty for some reason.
Hey Quentin,
Thank you very much for your help! The patches_file_baysor looks like this, Indeed it is empty.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 21 22 23 24
Actually, it looks as expected (it is just a list of patch IDs). Each ID corresponds to the name of a directory inside TC_070.zarr/.sopa_cache/baysor_boundaries. Do you have such directories?
Also, have you tried the toy example? Do you have the same issue?
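The check described above (one directory per patch ID) can be scripted. The following is a minimal Python sketch, assuming the patches file holds whitespace-separated IDs as shown earlier in the thread; the helper name missing_patch_dirs is hypothetical and not part of sopa.

```python
from pathlib import Path


def missing_patch_dirs(sdata_path: str, method: str = "baysor") -> list[str]:
    """Return the patch IDs that have no matching directory in the cache.

    Assumes the layout discussed in this thread:
      <sdata_path>/.sopa_cache/patches_file_<method>   (list of patch IDs)
      <sdata_path>/.sopa_cache/<method>_boundaries/<id>/
    """
    cache = Path(sdata_path) / ".sopa_cache"
    patches_file = cache / f"patches_file_{method}"
    boundaries_dir = cache / f"{method}_boundaries"

    # split() accepts IDs separated by spaces or by newlines
    patch_ids = patches_file.read_text().split()
    return [pid for pid in patch_ids if not (boundaries_dir / pid).is_dir()]
```

Calling missing_patch_dirs("TC_070.zarr") returns an empty list when every listed patch has its directory.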
Sorry for the late reply. I wanted to make sure that Baysor was installed correctly and Cellpose was working outside of the Snakemake pipeline:
However, the Baysor-only config, the Cellpose-only config, and the toy example command all give a similar error. Here is the toy example log:
(sopa) C:\Users\altun\Documents\sopa\workflow>snakemake --config sdata_path=tuto.zarr --configfile=config/toy/uniform_cellpose.yaml --cores 1 --use-conda
Building DAG of jobs...
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job count
aggregate 1
all 1
annotate 1
explorer 1
image_write 1
patchify_cellpose 1
report 1
resolve_cellpose 1
to_spatialdata 1
total 9
Select jobs to execute...
[Thu May 30 11:52:45 2024] rule to_spatialdata: output: tuto.zarr/.zgroup jobid: 5 reason: Missing output files: tuto.zarr/.zgroup resources: tmpdir=C:\Users\altun\AppData\Local\Temp, mem_mb=128000, mem_mib=122071
Activating conda environment: sopa
[INFO] (sopa.utils.data) Image of size ((4, 2048, 2048)) with 400 cells and 100 transcripts per cell
[INFO] (sopa.io.standardize) Writing the following spatialdata object to tuto.zarr:
SpatialData object with:
├── Images
│ └── 'image': SpatialImage[cyx] (4, 2048, 2048)
├── Points
│ └── 'transcripts': DataFrame with shape: (
[Thu May 30 11:52:55 2024] rule image_write: input: tuto.zarr/.zgroup output: tuto.explorer/morphology.ome.tif jobid: 7 reason: Missing output files: tuto.explorer/morphology.ome.tif; Input files updated by another job: tuto.zarr/.zgroup resources: tmpdir=C:\Users\altun\AppData\Local\Temp, mem_mb=64000, mem_mib=61036, partition=longq
Activating conda environment: sopa [WARNING] (sopa._sdata) sdata object has no cellpose boundaries and no baysor boundaries. Consider running segmentation first. [INFO] (sopa.io.explorer.images) Writing multiscale image with procedure=semi-lazy (load in memory when possible) [INFO] (sopa.io.explorer.images) (Loading image of shape (4, 2048, 2048)) in memory [INFO] (sopa.io.explorer.images) > Image of shape (4, 2048, 2048) [INFO] (sopa.io.explorer.images) > Image of shape (4, 1024, 1024) [INFO] (sopa.io.explorer.images) > Image of shape (4, 512, 512) [INFO] (sopa.io.explorer.images) > Image of shape (4, 256, 256) [INFO] (sopa.io.explorer.images) > Image of shape (4, 128, 128) [INFO] (sopa.io.explorer.images) > Image of shape (4, 64, 64) [INFO] (sopa.io.explorer.converter) Saved files in the following directory: tuto.explorer [INFO] (sopa.io.explorer.converter) You can open the experiment with 'open tuto.explorer\experiment.xenium' [Thu May 30 11:53:01 2024] Finished job 7. 2 of 9 steps (22%) done Select jobs to execute...
[Thu May 30 11:53:01 2024] checkpoint patchify_cellpose: input: tuto.zarr/.zgroup output: tuto.zarr/.sopa_cache/patches_file_image, tuto.zarr/.sopa_cache/patches jobid: 4 reason: Missing output files: tuto.zarr/.sopa_cache/patches_file_image; Input files updated by another job: tuto.zarr/.zgroup resources: tmpdir=C:\Users\altun\AppData\Local\Temp DAG of jobs will be updated after completion.
Activating conda environment: sopa
[INFO] (sopa.patches.patches) 4 patches were saved in sdata['sopa_patches']
Touching output file tuto.zarr/.sopa_cache/patches.
[Thu May 30 11:53:07 2024]
Finished job 4.
3 of 9 steps (33%) done
MissingInputException in rule resolve_cellpose in file C:\Users\altun\Documents\sopa\workflow\Snakefile, line 113:
Missing input files for rule resolve_cellpose:
output: tuto.zarr/.sopa_cache/cellpose_boundaries_done
affected files:
tuto.zarr.sopa_cache\cellpose_boundaries\3.parquet
tuto.zarr.sopa_cache\cellpose_boundaries\1.parquet
tuto.zarr.sopa_cache\cellpose_boundaries\2.parquet
tuto.zarr.sopa_cache\cellpose_boundaries\0.parquet
And here are my logs for a Baysor only config file snakemake run:
(sopa) C:\Users\altun\Documents\sopa\workflow>snakemake --config data_path=C:/Users/altun/Desktop/Xenium_analysis_May_2024/xenium_run/TC_070 --configfile=config/xenium/baysor_new.yaml --cores 1 --use-conda
SpatialData object path set to default: C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr
To change this behavior, provide --config sdata_path=...
when running the snakemake pipeline
Building DAG of jobs...
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job count
aggregate 1
all 1
explorer 1
image_write 1
patchify_baysor 1
report 1
resolve_baysor 1
to_spatialdata 1
total 8
Select jobs to execute...
[Thu May 30 11:57:56 2024] rule to_spatialdata: input: C:/Users/altun/Desktop/Xenium_analysis_May_2024/xenium_run/TC_070 output: C:/Users/altun/Desktop/Xenium_analysis_May_2024/xenium_run/TC_070.zarr/.zgroup jobid: 4 reason: Missing output files: C:/Users/altun/Desktop/Xenium_analysis_May_2024/xenium_run/TC_070.zarr/.zgroup resources: tmpdir=C:\Users\altun\AppData\Local\Temp, mem_mb=128000, mem_mib=122071
Activating conda environment: sopa
[INFO] (sopa.io.standardize) Writing the following spatialdata object to C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr:
SpatialData object with:
├── Images
│ └── 'morphology_focus': MultiscaleSpatialImage[cyx] (1, 20598, 22888), (1, 10299, 11444), (1, 5149, 5722), (1, 2574, 2861), (1, 1287, 1430)
└── Points
└── 'transcripts': DataFrame with shape: (
[Thu May 30 11:58:49 2024] rule image_write: input: C:/Users/altun/Desktop/Xenium_analysis_May_2024/xenium_run/TC_070.zarr/.zgroup output: C:/Users/altun/Desktop/Xenium_analysis_May_2024/xenium_run/TC_070.explorer/morphology.ome.tif jobid: 6 reason: Missing output files: C:/Users/altun/Desktop/Xenium_analysis_May_2024/xenium_run/TC_070.explorer/morphology.ome.tif; Input files updated by another job: C:/Users/altun/Desktop/Xenium_analysis_May_2024/xenium_run/TC_070.zarr/.zgroup resources: tmpdir=C:\Users\altun\AppData\Local\Temp, mem_mb=64000, mem_mib=61036, partition=longq
Activating conda environment: sopa [WARNING] (sopa._sdata) sdata object has no cellpose boundaries and no baysor boundaries. Consider running segmentation first. [INFO] (sopa.io.explorer.images) Writing multiscale image with procedure=semi-lazy (load in memory when possible) [INFO] (sopa.io.explorer.images) (Loading image of shape (1, 20598, 22888)) in memory [INFO] (sopa.io.explorer.images) > Image of shape (1, 20598, 22888) [INFO] (sopa.io.explorer.images) > Image of shape (1, 10299, 11444) [INFO] (sopa.io.explorer.images) > Image of shape (1, 5149, 5722) [INFO] (sopa.io.explorer.images) > Image of shape (1, 2574, 2861) [INFO] (sopa.io.explorer.images) > Image of shape (1, 1287, 1430) [INFO] (sopa.io.explorer.images) > Image of shape (1, 643, 715) [INFO] (sopa.io.explorer.converter) Saved files in the following directory: C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.explorer [INFO] (sopa.io.explorer.converter) You can open the experiment with 'open C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.explorer\experiment.xenium' [Thu May 30 11:59:12 2024] Finished job 6. 2 of 8 steps (25%) done Select jobs to execute...
[Thu May 30 11:59:12 2024] checkpoint patchify_baysor: input: C:/Users/altun/Desktop/Xenium_analysis_May_2024/xenium_run/TC_070.zarr/.zgroup output: C:/Users/altun/Desktop/Xenium_analysis_May_2024/xenium_run/TC_070.zarr/.sopa_cache/patches_file_baysor, C:/Users/altun/Desktop/Xenium_analysis_May_2024/xenium_run/TC_070.zarr/.sopa_cache/baysor_boundaries jobid: 3 reason: Missing output files: C:/Users/altun/Desktop/Xenium_analysis_May_2024/xenium_run/TC_070.zarr/.sopa_cache/patches_file_baysor; Input files updated by another job: C:/Users/altun/Desktop/Xenium_analysis_May_2024/xenium_run/TC_070.zarr/.zgroup resources: tmpdir=C:\Users\altun\AppData\Local\Temp DAG of jobs will be updated after completion.
Activating conda environment: sopa [INFO] (sopa.patches.patches) Writing sub-CSV for baysor [########################################] | 100% Completed | 56.31 s [INFO] (sopa.patches.patches) Patches saved in directory C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries [INFO] (sopa.patches.patches) Patch 0 has < 4000 transcripts. Baysor will not be run on it. [INFO] (sopa.patches.patches) Patch 20 has < 4000 transcripts. Baysor will not be run on it. [Thu May 30 12:00:16 2024] Finished job 3. 3 of 8 steps (38%) done MissingInputException in rule resolve_baysor in file C:\Users\altun\Documents\sopa\workflow\Snakefile, line 125: Missing input files for rule resolve_baysor: output: C:/Users/altun/Desktop/Xenium_analysis_May_2024/xenium_run/TC_070.zarr/.sopa_cache/baysor_boundaries_done, C:/Users/altun/Desktop/Xenium_analysis_May_2024/xenium_run/TC_070.zarr/.sopa_cache/table affected files: C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\21\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\17\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\4\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\24\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\16\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\18\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\11\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\23\segmentation_counts.loom 
C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\1\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\13\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\18\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\16\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\24\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\14\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\13\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\11\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\19\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\10\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\7\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\15\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\15\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\1\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\8\segmentation_counts.loom 
C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\17\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\5\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\8\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\23\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\22\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\12\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\3\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\4\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\19\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\2\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\12\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\9\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\9\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\6\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\3\segmentation_polygons.json 
C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\6\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\5\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\10\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\21\segmentation_counts.loom C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\22\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\2\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\7\segmentation_polygons.json C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries\14\segmentation_polygons.json
At the end of this run, we have a C:\Users\altun\Desktop\Xenium_analysis_May_2024\xenium_run\TC_070.zarr.sopa_cache\baysor_boundaries folder with the 0 to 24 patches.
Here are the contents of one of the folders: transcripts.csv
Weirdly enough, it seems that snakemake uses an incorrect path. Indeed, in your logs I see tuto.zarr.sopa_cache\cellpose_boundaries\3.parquet, but a \ is missing: it should be tuto.zarr\.sopa_cache\cellpose_boundaries\3.parquet.
I don't understand why this happens, since all the other paths look correct, for instance tuto.zarr/.sopa_cache/patches.
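To make the difference between the two paths concrete, here is a small Python sketch that builds the expected Windows path with pathlib and compares it with the one from the log. The helper name expected_patch_file is hypothetical; this only illustrates the comparison and is not how the Snakefile builds its paths.

```python
from pathlib import PureWindowsPath


def expected_patch_file(sdata_path: str, index: int) -> PureWindowsPath:
    """Build <sdata_path>\\.sopa_cache\\cellpose_boundaries\\<index>.parquet."""
    return (
        PureWindowsPath(sdata_path)
        / ".sopa_cache"
        / "cellpose_boundaries"
        / f"{index}.parquet"
    )


# Path as it appears in the MissingInputException message
logged = PureWindowsPath(r"tuto.zarr.sopa_cache\cellpose_boundaries\3.parquet")
expected = expected_patch_file("tuto.zarr", 3)

print(expected)            # tuto.zarr\.sopa_cache\cellpose_boundaries\3.parquet
print(logged == expected)  # False: the logged path has one component fewer
print(logged.parts[0])     # tuto.zarr.sopa_cache
```

PureWindowsPath is used so the comparison behaves the same on any operating system.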
Just to make sure this is related to Snakemake, can you try to use the CLI as in this tutorial?
Yes! I was able to follow the CLI tutorial, and it seems to segment the cells, resolve them, and aggregate them successfully, apart from a small error when writing report.html. I can also visualize the tutorial in Xenium Explorer.
So it seems that the Snakemake pipeline I have (I haven't modified the Snakefile so far) has an error. One step closer :)
(sopa) C:\Users\altun\Documents\sopa\workflow>sopa read . --sdata-path tuto.zarr --technology uniform
[INFO] (sopa.utils.data) Image of size ((4, 2048, 2048)) with 400 cells and 100 transcripts per cell
[INFO] (sopa.io.standardize) Writing the following spatialdata object to tuto.zarr:
SpatialData object with:
├── Images
│ └── 'image': SpatialImage[cyx] (4, 2048, 2048)
├── Points
│ └── 'transcripts': DataFrame with shape: (
(sopa) C:\Users\altun\Documents\sopa\workflow>sopa patchify image tuto.zarr --patch-width-pixel 1500 --patch-overlap-pixel 50 [INFO] (sopa.patches.patches) 4 patches were saved in sdata['sopa_patches']
(sopa) C:\Users\altun\Documents\sopa\workflow>sopa segmentation cellpose tuto.zarr --channels DAPI --diameter 35 --min-area 2000 --patch-index 0 [INFO] (sopa.segmentation.shapes) Percentage of non-geometrized cells: 0.93% (usually due to segmentation artefacts)
(sopa) C:\Users\altun\Documents\sopa\workflow>sopa segmentation cellpose tuto.zarr --channels DAPI --diameter 35 --min-area 2000 --patch-index 1 [INFO] (sopa.segmentation.shapes) Percentage of non-geometrized cells: 0.00% (usually due to segmentation artefacts)
(sopa) C:\Users\altun\Documents\sopa\workflow>sopa segmentation cellpose tuto.zarr --channels DAPI --diameter 35 --min-area 2000 --patch-index 2 [INFO] (sopa.segmentation.shapes) Percentage of non-geometrized cells: 2.30% (usually due to segmentation artefacts)
(sopa) C:\Users\altun\Documents\sopa\workflow>sopa segmentation cellpose tuto.zarr --channels DAPI --diameter 35 --min-area 2000 --patch-index 3 [INFO] (sopa.segmentation.shapes) Percentage of non-geometrized cells: 0.00% (usually due to segmentation artefacts)
(sopa) C:\Users\altun\Documents\sopa\workflow>sopa resolve cellpose tuto.zarr Reading patches: 100%|███████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 82.79it/s] [INFO] (sopa.segmentation.stainings) Found 390 total cells Resolving conflicts: 100%|███████████████████████████████████████████████████████████| 92/92 [00:00<00:00, 4842.21it/s] [INFO] (sopa.segmentation.stainings) Added 370 cell boundaries in sdata['cellpose_boundaries']
(sopa) C:\Users\altun\Documents\sopa\workflow>sopa aggregate tuto.zarr --gene-column genes --average-intensities --min-transcripts 10
[INFO] (sopa.segmentation.aggregate) Aggregating transcripts over 370 cells
[########################################] | 100% Completed | 103.88 ms
[INFO] (sopa.segmentation.aggregate) Filtering 0 cells
C:\ProgramData\miniconda3\envs\sopa\lib\site-packages\spatialdata_core_elements.py:92: UserWarning: Key cellpose_boundaries
already exists. Overwriting it.
self._check_key(key, self.keys(), self._shared_keys)
[INFO] (sopa.segmentation.aggregate) Averaging channels intensity over 370 cells with expansion 0.0
[########################################] | 100% Completed | 211.07 ms
C:\Users\altun\Documents\sopa\sopa\segmentation\aggregate.py:268: ImplicitModificationWarning: Setting element .obsm['intensities']
of view, initializing view as actual.
self.table.obsm[SopaKeys.INTENSITIES_OBSM] = pd.DataFrame(
C:\ProgramData\miniconda3\envs\sopa\lib\site-packages\spatialdata_core_elements.py:92: UserWarning: Key cellpose_boundaries
already exists. Overwriting it.
self._check_key(key, self.keys(), self._shared_keys)
(sopa) C:\Users\altun\Documents\sopa\workflow>sopa report tuto.zarr report.html
[INFO] (sopa.io.report.generate) Writing general_section
[INFO] (sopa.io.report.generate) Writing cell_section
[INFO] (sopa.io.report.generate) Writing channel_section
[INFO] (sopa.io.report.generate) Writing transcripts_section
[INFO] (sopa.io.report.generate) Writing representation_section
[INFO] (sopa.io.report.generate) Computing UMAP on 370 cells
[INFO] (sopa.io.report.generate) Writing report to report.html
Traceback (most recent call last)
C:\Users\altun\Documents\sopa\sopa\cli\app.py:212 in report
  210 sdata = read_zarr_standardized(sdata_path)
> 212 write_report(path, sdata)
locals:
  path = 'report.html'
  read_zarr_standardized = <function read_zarr_standardized at 0x000001BC80AC77F0>
  sdata = SpatialData object with:
    ├── Images
    │ └── 'image': SpatialImage[cyx] (4, 2048, 2048)
    ├── Points
    │ └── 'transcripts': DataFrame with shape: (40000, 4) (3D points)
    ├── Shapes
    │ ├── 'cellpose_boundaries': GeoDataFrame shape: (370, 1) (2D shapes)
    │ ├── 'cells': GeoDataFrame shape: (400, 1) (2D shapes)
    │ └── 'sopa_patches': GeoDataFrame shape: (4, 1) (2D shapes)
    └── Tables
      └── 'table': AnnData (370, 5)
    with coordinate systems:
    ▸ 'global', with elements: image (Images), transcripts (Points), cellpose_boundaries (Shapes), cells (Shapes), sopa_patches (Shapes)
    ▸ 'microns', with elements: transcripts (Points)
  sdata_path = 'tuto.zarr'
  write_report = <function write_report at 0x000001BC852F2A70>
C:\Users\altun\Documents\sopa\sopa\io\report\generate.py:47 in write_report
  44 sections = SectionBuilder(sdata).compute_sections()
  46 log.info(f"Writing report to {path}")
> 47 Root(sections).write(path)
  50 def _kdeplot_vmax_quantile(values: np.ndarray, quantile: float = 0.95):
locals:
  path = 'report.html'
  sdata = SpatialData object with:
    ├── Images
    │ └── 'image': SpatialImage[cyx] (4, 2048, 2048)
    ├── Points
    │ └── 'transcripts': DataFrame with shape: (40000, 4) (3D points)
    ├── Shapes
    │ ├── 'cellpose_boundaries': GeoDataFrame shape: (370, 1) (2D shapes)
    │ ├── 'cells': GeoDataFrame shape: (400, 1) (2D shapes)
    │ └── 'sopa_patches': GeoDataFrame shape: (4, 1) (2D shapes)
    └── Tables
      └── 'table': AnnData (370, 5)
    with coordinate systems:
    ▸ 'global', with elements: image (Images), transcripts (Points), cellpose_boundaries (Shapes), cells (Shapes), sopa_patches (Shapes)
    ▸ 'microns', with elements: transcripts (Points)
  sections = [
    <sopa.io.report.engine.Section object at 0x000001BC857897E0>,
    <sopa.io.report.engine.Section object at 0x000001BC879D76A0>,
    <sopa.io.report.engine.Section object at 0x000001BC8A53E170>,
    <sopa.io.report.engine.Section object at 0x000001BC8A8573A0>,
    <sopa.io.report.engine.Section object at 0x000001BC8EE92740>
  ]
C:\Users\altun\Documents\sopa\sopa\io\report\engine.py:264 in write
  261 self.sanity_check()
  263 with open(path, "w") as f:
> 264 f.write(str(self))
  266 def str(self) -> str:
  267 return f"""
locals:
  f = <_io.TextIOWrapper name='report.html' mode='w' encoding='cp1252'>
  path = 'report.html'
  self = <sopa.io.report.engine.Root object at 0x000001BC85623400>
C:\ProgramData\miniconda3\envs\sopa\lib\encodings\cp1252.py:19 in encode
  17 class IncrementalEncoder(codecs.IncrementalEncoder):
  18 def encode(self, input, final=False):
> 19 return codecs.charmap_encode(input,self.errors,encoding_table)[0]
  21 class IncrementalDecoder(codecs.IncrementalDecoder):
  22 def decode(self, input, final=False):
locals:
  final = False
  input = '\r\n <!DOCTYPE html>\r\n \r\n \r\n <meta charset="utf-8" />\r'+342740
  self = <encodings.cp1252.IncrementalEncoder object at 0x000001BC8EEAFE80>
UnicodeEncodeError: 'charmap' codec can't encode characters in position 209553-209555: character maps to <undefined>
(sopa) C:\Users\altun\Documents\sopa\workflow>sopa explorer write tuto.zarr --gene-column genes
[INFO] (sopa.io.explorer.table) Writing table with 5 columns
[INFO] (sopa.io.explorer.table) Writing 2 cell categories: region, slide
[INFO] (sopa.io.explorer.shapes) Writing 370 cell polygons
[INFO] (sopa.io.explorer.points) Writing 40000 transcripts
[INFO] (sopa.io.explorer.points) > Level 0: 40000 transcripts
[INFO] (sopa.io.explorer.points) > Level 1: 10000 transcripts
[INFO] (sopa.io.explorer.images) Writing multiscale image with procedure=semi-lazy (load in memory when possible)
[INFO] (sopa.io.explorer.images) (Loading image of shape (4, 2048, 2048)) in memory
[INFO] (sopa.io.explorer.images) > Image of shape (4, 2048, 2048)
[INFO] (sopa.io.explorer.images) > Image of shape (4, 1024, 1024)
[INFO] (sopa.io.explorer.images) > Image of shape (4, 512, 512)
[INFO] (sopa.io.explorer.images) > Image of shape (4, 256, 256)
[INFO] (sopa.io.explorer.images) > Image of shape (4, 128, 128)
[INFO] (sopa.io.explorer.images) > Image of shape (4, 64, 64)
[INFO] (sopa.io.explorer.converter) Saved files in the following directory: tuto.explorer
[INFO] (sopa.io.explorer.converter) You can open the experiment with 'open tuto.explorer\experiment.xenium'
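The per-patch segmentation calls in this log differ only in --patch-index, so they can be generated in a loop. A minimal Python sketch, assuming the sopa CLI is on PATH and that the number of patches (4 here) is read off the patchify output; the helper names cellpose_commands and run_all are hypothetical, while the flags are copied from the commands above.

```python
import subprocess


def cellpose_commands(sdata_path: str, n_patches: int) -> list[list[str]]:
    """Build one 'sopa segmentation cellpose' command per patch index."""
    base = [
        "sopa", "segmentation", "cellpose", sdata_path,
        "--channels", "DAPI",
        "--diameter", "35",
        "--min-area", "2000",
    ]
    return [base + ["--patch-index", str(i)] for i in range(n_patches)]


def run_all(sdata_path: str, n_patches: int) -> None:
    """Run the commands one after another, stopping at the first failure."""
    for cmd in cellpose_commands(sdata_path, n_patches):
        subprocess.run(cmd, check=True)
```

For the toy dataset this would be run_all("tuto.zarr", 4), followed by sopa resolve cellpose tuto.zarr as in the log.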
Good to hear that the CLI works.
I changed the way paths are handled in the snakemake pipeline. Maybe it will fix this "missing /" issue, but since I can't reproduce it, I'm not sure. This will be available in the next version of sopa!
Same for the report: I just changed the file encoding, so it should be fixed in the next release of sopa. I will let you know when it's released!
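For context on the encoding error: the traceback shows report.html being opened with encoding='cp1252', the Windows default, which cannot represent every Unicode character. The sketch below shows the general remedy of passing an explicit encoding; it is an illustration only, not necessarily the exact change made in sopa, and the helper names are hypothetical.

```python
def cp1252_can_encode(text: str) -> bool:
    """Tell whether text survives the Windows default 'cp1252' codec."""
    try:
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


def write_html(path: str, html: str) -> None:
    """Write html with an explicit UTF-8 encoding instead of the OS default."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)
```

A character such as "▸" (U+25B8) fails cp1252_can_encode but is written without error by write_html.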
I just released sopa==1.1.0. It should fix at least the encoding issue. Can you check?
Dear @quentinblampey and @altunbulakli,
I am receiving a similar error while trying to process a subset dataset.
Details:
I am trying to run sopa through snakemake locally on a virtual windows machine.
I have two mamba environments, one for snakemake, and one with sopa.
From the snakemake env I run snakemake --config data_path="C:\Users\jnimoca\Desktop\SOPA\sopa\workflow\data\991_subset.ome.tif" --configfile="C:\Users\jnimoca\Desktop\SOPA\sopa\workflow\ometif.yaml" --cores 10 --use-conda
It loads the image, but at checkpoint patchify_cellpose it fails with: MissingInputException in rule resolve_cellpose in file C:\Users\jnimoca\Desktop\SOPA\sopa\workflow\Snakefile, line 154: Missing input files for rule resolve_cellpose:
The Snakefile is unchanged from the sopa repo.
Happy to provide more details if needed :)
Hello @josenimo,
Can you check if the "missing" files are really missing? E.g., does this file exist?
C:\Users\jnimoca\Desktop\SOPA\sopa\workflow\data\991_subset.ome.zarr\.sopa_cache\cellpose_boundaries\10.parquet
If the file exists, then snakemake is not detecting it for some reason. I suspect it has something to do with how paths are handled on Windows.
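One way to answer this for all the reported files at once is a short Python sketch that takes the paths listed under "affected files" and reports whether each file, and its parent directory, exists. The helper name check_affected_files is hypothetical.

```python
from pathlib import Path


def check_affected_files(paths: list[str]) -> dict[str, dict[str, bool]]:
    """For each path, report whether the file and its parent directory exist.

    A missing parent directory suggests the rule producing the files never
    ran; an existing file suggests snakemake is looking at a different path.
    """
    report = {}
    for raw in paths:
        p = Path(raw)
        report[raw] = {"exists": p.exists(), "parent_exists": p.parent.is_dir()}
    return report
```

The paths can be pasted from the error message exactly as printed, one string per file.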
Two questions to check if my assumptions are correct:
Hey @quentinblampey , thank you for your quick responses.
Update:
Happy to provide more details to help solve Windows issues.
Ok, thanks for the details. I think it's an issue related to snakemake, which doesn't trigger the rules that create the inputs needed for the resolve_cellpose rule.
Can you tell me which versions of snakemake and Windows you have?
When I have access to a windows laptop (i.e. not before late august), I'll check this and try to solve this!
Meanwhile, the CLI and the API should work fine on windows :)
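The version details asked for above can be gathered in one go from inside the relevant environment. A minimal sketch using only the Python standard library; the function name environment_report and the default package list are assumptions made for illustration.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def environment_report(packages=("snakemake", "sopa")) -> dict[str, str]:
    """Collect OS, Python, and package versions for a bug report."""
    report = {
        "os": platform.platform(),
        "python": platform.python_version(),
    }
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in the active environment
            report[name] = "not installed"
    return report
```

When snakemake and sopa live in separate environments, as in one of the setups described in this thread, the function has to be run once in each.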
snakemake version 7.32.4
Windows Server 2022 Standard
All good, thank you for the info. I almost got it running on SLURM :)
@quentinblampey @josenimo I have a similar issue on Windows as well. .sopa_cache only contains patches and patches_file_image, and the run fails after:
Writing tiles: 100%|█████████████████████▉| 4796/4797 [07:28<00:00, 10.70it/s]
[INFO] (sopa.io.explorer.images) (Loading image of shape (3, 20713, 19763)) in memory
[INFO] (sopa.io.explorer.images) > Image of shape (3, 20713, 19763)
[INFO] (sopa.io.explorer.images) > Image of shape (3, 10356, 9881)
[INFO] (sopa.io.explorer.images) > Image of shape (3, 5178, 4940)
[INFO] (sopa.io.explorer.images) > Image of shape (3, 2589, 2470)
[INFO] (sopa.io.explorer.images) > Image of shape (3, 1294, 1235)
[INFO] (sopa.io.explorer.converter) Saved files in the following directory: Z:\Queries\Data\Batch5_region0\region_0.explorer
[INFO] (sopa.io.explorer.converter) You can open the experiment with 'open Z:\Queries\Data\Batch5_region0\region_0.explorer\experiment.xenium'
Let me know if anything comes of this! Would love to use snakemake to automate things a bit.
Sorry @marsdenl, I haven't seen this before. Can you share a bit more? You are running with snakemake, right? With default parameters?
Hey @josenimo. Yes, sorry for the very vague comment, and thanks for taking the time to answer. I've been trying to run it with snakemake, yes, with the following config: baysor_cellpose.yaml.txt
The output looks relatively fine for the steps completed:
SpatialData object path set to default: Z:\Queries\Data\Batch5_region0\region_0.zarr
To change this behavior, provide --config sdata_path=...
when running the snakemake pipeline
Building DAG of jobs...
Provided cores: 8
Rules claiming more threads will be scaled down.
Job stats:
job count
aggregate 1
all 1
explorer 1
image_write 1
patchify_baysor 1
patchify_cellpose 1
report 1
resolve_baysor 1
resolve_cellpose 1
to_spatialdata 1
total 10
Select jobs to execute...
[Mon Sep 16 13:28:41 2024] rule to_spatialdata: input: Z:/Queries/Data/Batch5_region0/region_0 output: Z:/Queries/Data/Batch5_region0/region_0.zarr/.zgroup jobid: 4 reason: Missing output files: Z:/Queries/Data/Batch5_region0/region_0.zarr/.zgroup resources: tmpdir=C:\Users\marsdenl\AppData\Local\Temp, mem_mb=128000, mem_mib=122071
Activating conda environment: sopa
C:\Users\marsdenl\AppData\Local\miniconda3\envs\sopa\lib\functools.py:926: UserWarning: The index of the dataframe is not monotonic increasing. It is recommended to sort the data to adjust the order of the index before calling .parse() to avoid possible problems due to unknown divisions
return method.get(obj, cls)(*args, **kwargs)
INFO The column "global_x" has now been renamed to "x"; the column "x" was already present in the dataframe, and
will be dropped.
INFO The column "global_y" has now been renamed to "y"; the column "y" was already present in the dataframe, and
will be dropped.
[INFO] (sopa.io.standardize) Writing the following spatialdata object to Z:\Queries\Data\Batch5_region0\region_0.zarr:
SpatialData object
├── Images
│ └── 'Batch5_region0_region_0_z3': DataTree[cyx] (3, 41426, 39526), (3, 20713, 19763), (3, 10356, 9881), (3, 5178, 4940), (3, 2589, 2470)
└── Points
└── 'Batch5_region0_region_0_transcripts': DataFrame with shape: (
[Mon Sep 16 13:34:29 2024] checkpoint patchify_cellpose: input: Z:/Queries/Data/Batch5_region0/region_0.zarr/.zgroup output: Z:/Queries/Data/Batch5_region0/region_0.zarr/.sopa_cache/patches_file_image, Z:/Queries/Data/Batch5_region0/region_0.zarr/.sopa_cache/patches jobid: 6 reason: Missing output files: Z:/Queries/Data/Batch5_region0/region_0.zarr/.sopa_cache/patches_file_image; Input files updated by another job: Z:/Queries/Data/Batch5_region0/region_0.zarr/.zgroup resources: tmpdir=C:\Users\marsdenl\AppData\Local\Temp DAG of jobs will be updated after completion.
Activating conda environment: sopa
[Mon Sep 16 13:34:29 2024]
rule image_write:
    input: Z:/Queries/Data/Batch5_region0/region_0.zarr/.zgroup
    output: Z:/Queries/Data/Batch5_region0/region_0.explorer/morphology.ome.tif
    jobid: 8
    reason: Missing output files: Z:/Queries/Data/Batch5_region0/region_0.explorer/morphology.ome.tif; Input files updated by another job: Z:/Queries/Data/Batch5_region0/region_0.zarr/.zgroup
    resources: tmpdir=C:\Users\marsdenl\AppData\Local\Temp, mem_mb=64000, mem_mib=61036, partition=longq
Activating conda environment: sopa
[WARNING] (sopa._sdata) sdata object has no valid segmentation boundary. Consider running Sopa segmentation first.
[INFO] (sopa.patches.patches) 56 patches were saved in sdata['sopa_patches']
[INFO] (sopa.io.explorer.images) Writing multiscale image with procedure=semi-lazy (load in memory when possible)
Writing tiles: 0%| | 0/4797 [00:00<?, ?it/s]
[INFO] (sopa.io.explorer.images) > Image of shape (3, 41426, 39526)
Touching output file Z:/Queries/Data/Batch5_region0/region_0.zarr/.sopa_cache/patches.
[Mon Sep 16 13:37:04 2024]
Finished job 6.
2 of 10 steps (20%) done
Writing tiles: 0%| | 1/4797 [00:00<22:50, 3.50it/s]
MissingInputException in rule resolve_cellpose in file C:\Users\marsdenl\sopa\workflow\Snakefile, line 154:
Missing input files for rule resolve_cellpose:
    output: Z:/Queries/Data/Batch5_region0/region_0.zarr/.sopa_cache/cellpose_boundaries_done
    affected files:
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\1.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\51.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\31.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\24.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\44.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\3.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\4.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\18.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\0.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\54.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\2.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\46.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\33.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\52.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\17.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\39.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\6.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\29.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\49.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\9.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\7.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\20.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\41.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\13.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\47.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\55.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\43.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\42.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\5.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\40.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\25.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\11.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\15.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\16.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\34.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\12.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\22.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\53.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\37.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\35.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\38.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\27.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\48.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\50.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\36.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\21.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\23.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\14.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\26.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\10.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\45.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\30.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\8.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\28.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\32.parquet
        Z:\Queries\Data\Batch5_region0\region_0.zarr.sopa_cache\cellpose_boundaries\19.parquet
Writing tiles: 100%|███████████████████████████████████████████████████████████▉| 4796/4797 [06:28<00:00, 12.35it/s]
[INFO] (sopa.io.explorer.images) (Loading image of shape (3, 20713, 19763)) in memory
[INFO] (sopa.io.explorer.images) > Image of shape (3, 20713, 19763)
[INFO] (sopa.io.explorer.images) > Image of shape (3, 10356, 9881)
[INFO] (sopa.io.explorer.images) > Image of shape (3, 5178, 4940)
[INFO] (sopa.io.explorer.images) > Image of shape (3, 2589, 2470)
[INFO] (sopa.io.explorer.images) > Image of shape (3, 1294, 1235)
[INFO] (sopa.io.explorer.converter) Saved files in the following directory: Z:\Queries\Data\Batch5_region0\region_0.explorer
[INFO] (sopa.io.explorer.converter) You can open the experiment with 'open Z:\Queries\Data\Batch5_region0\region_0.explorer\experiment.xenium'
But there is no folder with all the patches, no segmentation output per se, no table directory with cell x gene matrices, and the .explorer file is empty. Any idea why this might be the case?
Thank you for your help :)
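For anyone comparing their own run against the log above, here is a minimal Python sketch that lists which per-patch Cellpose outputs are missing from the cache. It is not part of Sopa. It assumes the layout suggested by this thread: a `.sopa_cache/patches_file_image` text file holding whitespace-separated patch indices, and one `<index>.parquet` file per patch under `.sopa_cache/cellpose_boundaries`. The function name `missing_patch_outputs` and its parameters are hypothetical.

```python
import sys
from pathlib import Path


def missing_patch_outputs(
    zarr_dir,
    patches_file="patches_file_image",
    boundaries_dir="cellpose_boundaries",
    suffix=".parquet",
):
    """Return the patch IDs whose expected output file does not exist.

    Assumed layout (inferred from the log, not from Sopa's documentation):
      <zarr_dir>/.sopa_cache/<patches_file>         -> "0 1 2 ..." (patch IDs)
      <zarr_dir>/.sopa_cache/<boundaries_dir>/<id><suffix>
    """
    cache = Path(zarr_dir) / ".sopa_cache"
    # Patch IDs are assumed to be separated by spaces and/or newlines.
    patch_ids = (cache / patches_file).read_text().split()
    return [
        pid
        for pid in patch_ids
        if not (cache / boundaries_dir / f"{pid}{suffix}").exists()
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python check_cache.py Z:/Queries/Data/Batch5_region0/region_0.zarr
    missing = missing_patch_outputs(sys.argv[1])
    print(f"{len(missing)} patch output(s) missing: {missing}")
```

If every ID is reported missing, the per-patch segmentation step most likely never ran, which would match the empty output folders described above.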
Hello all, sorry for taking so long. I couldn't find a Windows machine that I could use for a few hours or days to reproduce the issue. One way to move forward could be a short call with one of you so that we can debug it live. If you're interested, please send me an email at quentin.blampey@centralesupelec.fr!
Hello to the SOPA creators.
I am quite a beginner to Python and the Command Line Interface. I have attempted to run the ready-made Snakemake pipeline for Xenium data (with the Baysor cell segmentation feature).
However, I get the following warning and an error in the resolve_baysor section of the pipeline:
[WARNING] (sopa._sdata) sdata object has no cellpose boundaries and no baysor boundaries. Consider running segmentation first.
and the error:
Finished job 3.
3 of 8 steps (38%) done
MissingInputException in rule resolve_baysor in file C:\Windows\System32\sopa\workflow\Snakefile, line 125:
Missing input files for rule resolve_baysor:
    output: C:/xenium_run/TC_070.zarr/.sopa_cache/baysor_boundaries_done, C:/xenium_run/TC_070.zarr/.sopa_cache/table
    affected files:
I was expecting the Snakemake pipeline to run Baysor segmentation automatically. I also get a similar error when I try a Cellpose-only or a Baysor-only config file. What could be the solution here? I have installed both the cellpose package (as part of the sopa installation) and the Baysor package according to their instructions.
Thank you very much for your help :)
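Before digging into the pipeline itself, it can help to confirm that the Baysor command is reachable from the same shell that launches Snakemake. The sketch below uses only the Python standard library. It assumes the executable is called `baysor` and is looked up on `PATH`; Sopa may locate Baysor differently (for example through a configured path), so a negative result here is a hint rather than a diagnosis. The helper name `find_executable` is hypothetical.

```python
import shutil
from typing import Optional


def find_executable(name: str = "baysor") -> Optional[str]:
    """Return the full path of `name` if it is found on PATH, else None.

    Assumption: the Baysor command-line tool is installed under the
    name "baysor"; adjust `name` if your installation differs.
    """
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable()
    if path is None:
        print("baysor was not found on PATH in this environment")
    else:
        print(f"baysor found at: {path}")
```

Running it inside the activated `sopa` conda environment shows what that environment can see, which may differ from a plain terminal.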