dcjones / proseg

Probabilistic cell segmentation for in situ spatial transcriptomics

MERSCOPE Default Config #34

Open marsdenl opened 1 week ago

marsdenl commented 1 week ago

Hey @dcjones, thank you for this great tool. It works quite well on my MERSCOPE samples but I wanted to tweak it a bit for further optimisation. My first segmentation round was using the default --merscope preset. Would you mind listing the default options/arguments in the merscope preset so I know what settings to tweak?

Thank you!

Luc

dcjones commented 1 week ago

Hi Luc,

There's really no attempt at this point to tweak default parameters based on the platform. That may change in the future with more testing, but all --merscope does now is tell proseg what the input file looks like.

There are various things you can tweak, which I should try to document more. Probably the most impactful are:

  1. Transcript repositioning parameters:
    • --diffusion-probability: larger allows more transcripts to be repositioned
    • --diffusion-sigma-far: larger allows transcripts to be moved further
  2. Voxel size and sampling schedule
    • --initial-voxel-size: voxel size in micrometers
    • --schedule: number of iterations, or a comma separated list of counts where voxel size is halved between rounds
  3. Prior segmentation
    • --nuclear-reassignment-prob: lower to enforce conforming to the initial nuclear segmentation
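
As a rough illustration of how these options fit together, here is a minimal Python sketch that works out the voxel size implied for each round by --initial-voxel-size and --schedule, and assembles a command line. The flag names are the ones listed above; every numeric value, the transcripts file name, and the helper names voxel_sizes and build_proseg_command are hypothetical placeholders, not recommended settings. It also assumes the transcript table is passed as a positional argument.

```python
def voxel_sizes(initial_voxel_size, schedule):
    """Voxel size (micrometers) per round, assuming it is halved between rounds."""
    return [initial_voxel_size / (2 ** i) for i in range(len(schedule))]


def build_proseg_command(transcripts_path, *, diffusion_probability,
                         diffusion_sigma_far, initial_voxel_size,
                         schedule, nuclear_reassignment_prob):
    """Assemble an argument list for proseg (not executed here)."""
    return [
        "proseg", "--merscope",
        "--diffusion-probability", str(diffusion_probability),
        "--diffusion-sigma-far", str(diffusion_sigma_far),
        "--initial-voxel-size", str(initial_voxel_size),
        # --schedule takes a comma separated list of iteration counts
        "--schedule", ",".join(str(n) for n in schedule),
        "--nuclear-reassignment-prob", str(nuclear_reassignment_prob),
        transcripts_path,  # assumption: input file is a positional argument
    ]


if __name__ == "__main__":
    schedule = [100, 100, 200]           # hypothetical iteration counts
    print(voxel_sizes(4.0, schedule))    # voxel size used in each round
    cmd = build_proseg_command(
        "detected_transcripts.csv",      # hypothetical file name
        diffusion_probability=0.3,       # hypothetical value
        diffusion_sigma_far=5.0,         # hypothetical value
        initial_voxel_size=4.0,          # hypothetical value
        schedule=schedule,
        nuclear_reassignment_prob=0.1,   # hypothetical value
    )
    print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run once the values have been chosen for the dataset at hand.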
marsdenl commented 1 week ago

That's already super useful to know, thanks for the information :)

marsdenl commented 1 day ago

Hey @dcjones. Thanks again for listing the settings above. Do you expect them to change the size of the masks generated? I've been having the issue that a single cell (based on transcript density and DAPI stain) seems to be receiving multiple small masks instead of a single large one covering the entire diameter (see the two examples below):

[image: combined]

Do you have any recommendations for adjusting input settings to correct for that?

I was also wondering about the perimeter bound: what value range is expected, i.e. how high can we go?

Thanks for your help :)

Luc

dcjones commented 1 day ago

Hi Luc,

These polygons look a little peculiar to me. I'm not sure what you're using to plot these, but it's doing some sort of simplification or modification of the proseg polygons. Proseg is voxel based, so unmodified polygons should only have straight edges perpendicular to the axes. It's possible the polygons you are getting are already better than they appear here.

Also make sure you are plotting the consensus 2d polygons that proseg outputs, not a particular voxel layer.
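
One way to test whether a polygon file has been modified is to check that every edge is parallel to the x or y axis. Below is a minimal sketch using only the Python standard library. It assumes the file is a GeoJSON FeatureCollection of Polygon or MultiPolygon geometries with a "cell" property; the helper names and the tolerance are hypothetical.

```python
def ring_is_rectilinear(ring, tol=1e-9):
    """True if every edge of the ring is parallel to the x or y axis."""
    for p, q in zip(ring, ring[1:]):
        # An edge is diagonal when both x and y change along it.
        if abs(p[0] - q[0]) > tol and abs(p[1] - q[1]) > tol:
            return False
    return True


def iter_rings(geometry):
    """Yield every ring (exterior and holes) of a Polygon or MultiPolygon."""
    if geometry["type"] == "Polygon":
        polygons = [geometry["coordinates"]]
    elif geometry["type"] == "MultiPolygon":
        polygons = geometry["coordinates"]
    else:
        polygons = []
    for polygon in polygons:
        yield from polygon


def non_rectilinear_cells(feature_collection, tol=1e-9):
    """Return the 'cell' property of features having a diagonal edge."""
    flagged = []
    for feature in feature_collection["features"]:
        rings = iter_rings(feature["geometry"])
        if not all(ring_is_rectilinear(r, tol) for r in rings):
            flagged.append((feature.get("properties") or {}).get("cell"))
    return flagged


if __name__ == "__main__":
    # Tiny in-memory example: a square (axis-aligned) and a triangle (not).
    demo = {"type": "FeatureCollection", "features": [
        {"type": "Feature", "properties": {"cell": 0}, "geometry": {
            "type": "Polygon",
            "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]}},
        {"type": "Feature", "properties": {"cell": 1}, "geometry": {
            "type": "Polygon",
            "coordinates": [[[0, 0], [2, 0], [1, 1], [0, 0]]]}},
    ]}
    print(non_rectilinear_cells(demo))
```

To run it on real output, load the file with json.load and pass the resulting dict to non_rectilinear_cells. An empty result would suggest the polygons are unmodified voxel outlines; a long list would suggest some tool has simplified or smoothed them.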

marsdenl commented 21 hours ago

I used the Vizgen Visualiser, after converting the proseg .geojson to the .parquet file that Vizgen requires with the Vizgen VPT post-processing tool.

Perhaps some modification occurs during these steps, so I also plotted the polygons in R:

    library(sf)       # st_read, st_crs
    library(ggplot2)  # ggplot, geom_sf, geom_point

    geojson <- st_read("/Users/lucmarsden/PROSEG/Region0/cell-polygons.geojson")
    transcript_met <- read.csv("/Users/lucmarsden/PROSEG/Region0/transcript-metadata.csv")
    cell_meta <- read.csv("/Users/lucmarsden/PROSEG/Region0/cell-metadata.csv")
    st_crs(geojson) <- NA  # drop the CRS so coordinates are plotted as-is

    # Collect the IDs of the cells that belong to FOV 60
    cell_by_fov <- split(cell_meta$cell, cell_meta$fov)
    cells <- unlist(cell_by_fov['60'])

    filt_polygons = subset(geojson, cell %in% cells)
    filt_transcripts = subset(transcript_met, fov == 60)

    ggplot() +
      geom_sf(data = filt_polygons, color = "red") +
      geom_point(data = filt_transcripts, aes(x = x, y = y), size = 0.0001, alpha = 0.1) +
      theme_minimal()

[image: combined]

And Python:

    import geopandas as gpd
    import matplotlib.pyplot as plt
    import pandas as pd

    geojson_file = "/Users/lucmarsden/PROSEG/Region0/cell-polygons.geojson"
    gdf = gpd.read_file(geojson_file)
    gdf = gdf.set_crs(None, allow_override=True)  # drop the CRS

    transcript_file = "/Users/lucmarsden/PROSEG/Region0/transcript-metadata.csv"
    transcripts = pd.read_csv(transcript_file)

    cell_meta_file = "/Users/lucmarsden/PROSEG/Region0/cell-metadata.csv"
    cell_meta = pd.read_csv(cell_meta_file)

    target_fov = 60

    # Keep only the polygons and transcripts that belong to the target FOV
    cells_by_fov = cell_meta[cell_meta['fov'] == target_fov]['cell']
    filt_polygons = gdf[gdf['cell'].isin(cells_by_fov)]
    filt_transcripts = transcripts[transcripts['fov'] == target_fov]

    fig, ax = plt.subplots(figsize=(10, 10))
    filt_polygons.plot(ax=ax, color='lightblue', edgecolor='black')
    ax.scatter(filt_transcripts['x'], filt_transcripts['y'], color='red', s=1, alpha=0.5)
    ax.set_aspect('auto')
    plt.title(f"Polygons and Transcripts for FOV {target_fov}")
    plt.show()

[image: fov60]

Do they look peculiar still? I still get multiple small polygons per single cell...

Cheers