kharchenkolab / Baysor

Bayesian Segmentation of Spatial Transcriptomics Data
MIT License

Clarification on 'Warning: To many values of z. Using 2D polygons' #105

Open e-manduchi opened 7 months ago

e-manduchi commented 7 months ago

Hello, I'm running Baysor v0.6.2 to get a 3D segmentation of CosMx data, providing the cells identified by CosMx as the prior segmentation. I am getting the warning below, and as a result I only get joint polygons. Can you clarify what causes this? Thank you.

Info: Estimating boundary polygons
[22:40:52] Warning: To many values of z. Using 2D polygons
└ Baysor.Processing /home/viktor_petukhov/.julia/dev/Baysor/src/processing/data_processing/boundary_estimation.jl:143

VPetukhov commented 5 months ago

Hello, while estimating polygons, Baysor splits the data by z-stack. This requires the z variable to be discrete, so the pipeline checks whether length(unique(df.z)) is large. It's possible that your z values contain decimals, which leads to too many slices. To avoid that, you could round the z values.
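A minimal sketch of the idea behind this check and the suggested rounding, written in plain Python rather than Baysor's Julia. The sample z values and the helper name `count_z_levels` are made up for illustration and are not part of Baysor.

```python
def count_z_levels(z_values):
    """Number of distinct z values, analogous to length(unique(df.z))."""
    return len(set(z_values))


# Hypothetical z coordinates with decimals: each value becomes its own "slice".
z_raw = [0.02, 0.98, 1.01, 1.97, 2.03, 2.96, 3.04]

# Rounding collapses nearby values onto a small set of discrete levels.
z_rounded = [round(z) for z in z_raw]

print(count_z_levels(z_raw))      # 7 distinct values before rounding
print(count_z_levels(z_rounded))  # 4 distinct values after rounding
```

Running the same count on the z column of the actual input file is a quick way to see how many slices the data would be split into.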

e-manduchi commented 5 months ago

Thank you for your response. I had checked this previously, and the z values in my file are discrete: they are all integers from 0 to 8.

FalkoHof commented 4 months ago

Hello, I am running into the same issue when using Baysor on Xenium data. I have mapped the continuous z-axis values to their respective z-stack numbers, which I use as input; this gives 34 stacks in total.

I receive the following warning, though:

[16:37:02] Info: Run Rc3b19685b
[16:37:02] Info: (2024-02-28) Run Baysor v0.6.2
[16:37:02] Info: Loading data...
[16:37:10] Info: Loaded 3736159 transcripts
[16:37:18] Info: Estimating noise level
[16:38:09] Info: Done
[16:38:42] Info: Initializing algorithm. Scale: 2.6691342692757773, scale std: 0.729571709294326, initial #components: 2490772, #molecules: 3736159.
[16:39:48] Info: Using the following additional information about molecules: [:confidence, :prior_segmentation]
[16:39:48] Info: Using 3D coordinates
[19:59:19] Info: Processing complete.
[19:59:35] Info: Estimating local colors
[20:03:45] Info: Estimating boundary polygons
[20:08:28] Warning: To many values of z. Using 2D polygons
└ Baysor.Processing /home/viktor_petukhov/.julia/dev/Baysor/src/processing/data_processing/boundary_estimation.jl:143

I was wondering now about two things:

  1. Does this only affect the polygon output and is the segmentation still happening in 3D space?
  2. Does it make more sense to use the z-stack number as input, or would it be better to use the integer-rounded z location, so that z stays on the same distance scale (microns) as the other coordinates?
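To make the two encodings in question 2 concrete, here is a small Python sketch. The slice thickness, the sample coordinates, and both helper names are hypothetical; the thread does not settle which encoding Baysor handles better.

```python
def to_stack_index(z_um, z_min_um, slice_thickness_um):
    """Map a continuous z position (microns) to a 0-based z-stack index."""
    return int((z_um - z_min_um) // slice_thickness_um)


def to_rounded_microns(z_um):
    """Round a continuous z position to the nearest whole micron."""
    return round(z_um)


# Hypothetical z positions in microns and a hypothetical 0.5 micron slice thickness.
z_um = [0.0, 0.4, 0.6, 1.2, 1.7]

stack_idx = [to_stack_index(z, 0.0, 0.5) for z in z_um]
rounded_um = [to_rounded_microns(z) for z in z_um]

print(stack_idx)   # [0, 0, 1, 2, 3]  unit: slices, not microns
print(rounded_um)  # [0, 0, 1, 1, 2]  unit: microns, same as x and y
```

Both encodings are discrete. The difference is the unit: stack indices match microns only when the slice thickness happens to be 1 micron, while rounded microns keep z on the same scale as x and y at the cost of merging slices that are less than a micron apart.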

I also had a quick look at the logical check that is performed before raising the warning, if length(z_vals) > (size(pos_data, 1) / 100), and it wasn't quite clear to me what it actually does and why it is performed. From this line, which is called a bit before, it seems that size(pos_data, 1) will be the number of dimensions of the data (2D or 3D)? So if I am not mistaken, the logical check compares the number of z-stacks against the number of dimensions divided by 100? Wouldn't that always default to false?
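The arithmetic of the quoted check can be worked through with the numbers from the log above. This is a Python sketch, not Baysor's code: the function name `too_many_z_values` is made up, and whether size(pos_data, 1) really is the number of dimensions is the assumption under discussion, not something confirmed in this thread.

```python
def too_many_z_values(n_z_values, n_reference):
    """Mirror of the quoted check: length(z_vals) > (size(pos_data, 1) / 100)."""
    return n_z_values > n_reference / 100


n_z = 34               # z-stacks reported in this comment
n_dims = 3             # assumption: size(pos_data, 1) is the number of dimensions
n_molecules = 3736159  # transcripts reported in the log above

# Compared against the number of dimensions: 34 > 0.03
print(too_many_z_values(n_z, n_dims))       # True

# Compared against the number of molecules: 34 > 37361.59
print(too_many_z_values(n_z, n_molecules))  # False
```

Under the number-of-dimensions reading, the threshold is 0.03, so the comparison comes out true for any non-empty set of z values, which would be consistent with the warning appearing even for 9, 13, or 34 integer z values. If the comparison were against the number of molecules instead, 34 stacks would fall far below the threshold.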

Best wishes & many thanks! Falko

Yifan-debug commented 1 week ago

Hi, I'm experiencing the same issue, and my z-stack ranges from 1 to 13. Please let me know if there are any updates on this issue.