kharchenkolab / Baysor

Bayesian Segmentation of Spatial Transcriptomics Data
https://kharchenkolab.github.io/Baysor/
MIT License

Input data format #23

Closed (fengweimin-maker closed 2 months ago)

fengweimin-maker commented 3 years ago

Hi, author. I'm trying to run Baysor with a TIFF image and a MOLECULES_CSV that contains the x and y positions and the gene assignment for each molecule. The command is as follows: baysor run -p -c ./Adult_data/adult.toml -o ./output_adult_dapi -p ./Adult_data/molecules_2.csv ./Adult_data/segmentation.tiff

But I don't know why it throws an error. Is there perhaps a non-symmetric matrix in my data? This is raw data, so why would it contain a non-symmetric matrix? [image]

I would appreciate it if you could reply soon! Thank you!

VPetukhov commented 2 years ago

Hi @fengweimin-maker, Baysor gets NaNs in the covariance matrix, so the problem is not about symmetry. That's suspicious, as I put a lot of effort into checking all possible causes for that... Any chance you can share a piece of the data?

fengweimin-maker commented 2 years ago

Hi @VPetukhov, I'm sorry, our data has not been submitted yet, but maybe it can be shared later. I will get in touch with you when our data is published.

VPetukhov commented 2 years ago

That's totally understandable. If you can share the full log and your config, I can check whether there are any configuration errors. Posting just the first several lines of the molecules_2.csv file could also help to find the problem.

fengweimin-maker commented 2 years ago

@VPetukhov Thank you very much for helping with this problem. Here are the config file, segmentation_log.log, and the first hundred rows of molecules_2.csv (molecules_2_100_rows.csv). Also, our TIFF image is a gray image, and I converted it into a binary image (used as the input TIFF) with the following script:

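import cv2
import numpy as np

# Read the raw image as a single-channel grayscale array.
image = cv2.imread('segmentation.raw.tif', cv2.IMREAD_GRAYSCALE)

def binarization(img, th=0.00000000001):
    # Pixels at or above the threshold become 1, all others become 0.
    # The result only separates signal from background; it does not label cells.
    return np.where(img >= th, 1, 0).astype(img.dtype)

img = binarization(image)
cv2.imwrite('image.tif', img)

VPetukhov commented 2 years ago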

@fengweimin-maker, the log shows scale std: 0.0, and that is most likely the cause of the problem. It is probably caused by the gray image, indeed. Baysor requires either an integer segmentation mask or a binary mask, so if you supply raw intensities, Baysor would think that the segments are extremely small, which would result in this behavior. In particular, you can see that the scale is 0.56 while your coordinates are integers, so the scale is less than one pixel. Does your binary image contain an actual cell segmentation mask, or is it only a mask that separates signal from background? In the second case it will not work either, and I suggest using a proper image segmentation method, some of which are listed in the readme.
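To illustrate the difference between a signal/background mask and an integer segmentation mask, here is a minimal sketch in pure Python that gives each 4-connected foreground region of a small 0/1 grid its own label. The function name `label_components` and the toy grid are hypothetical and only for illustration; on real images a library routine such as `scipy.ndimage.label` would be the usual choice.

```python
from collections import deque

def label_components(binary):
    """Assign a distinct positive integer to each 4-connected foreground region."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            # Skip background pixels and pixels that already belong to a region.
            if binary[r][c] == 0 or labels[r][c] != 0:
                continue
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            # Breadth-first flood fill over the 4-neighborhood.
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] != 0 and labels[ny][nx] == 0:
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels, next_label

# Hypothetical toy mask: 1 = signal, 0 = background.
toy_mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0],
]
labeled, n_regions = label_components(toy_mask)
```

Connected-component labeling does not split touching nuclei, so it is not a substitute for a proper segmentation method; it only shows what an integer mask looks like compared with a 0/1 mask.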

TODO points from my side:

fengweimin-maker commented 2 years ago

@VPetukhov Thank you very much for your advice. There are no coordinates in the gray image because it is a nucleic acid staining (ssDNA) image taken with a microscope and stitched together in ImageJ. Individual nuclei can be seen one by one in the gray image. But after I converted it into a binary image, only the separation of signal from background can be seen.

However, after registration our gray image has the same size as the coordinate range in molecules_2.csv (locations in the gray image match the coordinates in molecules_2.csv). So is scale std: 0.0 OK? What is the function of scale? Is the scale used to match the image to the molecule data, just like an H&E image is matched to 10X data coordinates?
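One quick way to sanity-check such a registration is to count how many molecule coordinates fall inside the image bounds. The sketch below is only an illustration: the column names `x` and `y`, the function name `fraction_inside`, and the toy CSV are assumptions, since the actual layout of molecules_2.csv is not shown in this thread.

```python
import csv
import io

def fraction_inside(csv_text, width, height, x_col="x", y_col="y"):
    """Fraction of molecules whose (x, y) lies within a width-by-height image."""
    reader = csv.DictReader(io.StringIO(csv_text))
    total = inside = 0
    for row in reader:
        x, y = float(row[x_col]), float(row[y_col])
        total += 1
        if 0 <= x < width and 0 <= y < height:
            inside += 1
    return inside / total if total else 0.0

# Hypothetical data: four molecules, one of them outside a 100x100 image.
example_csv = "x,y,gene\n10,20,GeneA\n150,30,GeneB\n40,90,GeneA\n5,5,GeneC\n"
frac = fraction_inside(example_csv, width=100, height=100)
```

A fraction well below 1 would suggest that the image and the coordinates are not in the same pixel space. A fraction of 1 only shows that the ranges are compatible, not that the registration is correct.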

Actually, the nuclei in our gray image were segmented by another published algorithm, but I think Baysor is the right way to segment single nuclei because it also considers gene-gene relations instead of only the image, so I need to try Baysor. I am confused about the scale std: how can I run Baysor with our gray image as the input file?

VPetukhov commented 2 years ago

The scale parameter is the expected radius of a cell (in pixels, in your case), and scale-std is how much cell sizes are allowed to deviate from it. I don't see any realistic scenario where setting scale-std=0.0 would make sense, as it implies that all cells have exactly the same size. How does Baysor work when you provide a TIFF with a nuclei segmentation mask? Does it determine the scale better?
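As a rough illustration of what scale and scale-std describe, the sketch below computes the mean and standard deviation of equivalent radii, sqrt(area / π), over the segments of a labeled mask. How Baysor actually derives these values from a prior segmentation is an assumption here, not something confirmed in this thread; `radius_stats` and the toy masks are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev

def radius_stats(labels):
    """Return (mean, std) of equivalent radii for the nonzero labels in a 2D grid."""
    # Area of each segment = number of pixels carrying its label.
    areas = Counter(v for row in labels for v in row if v != 0)
    # Radius of the circle that has the same area as the segment.
    radii = [math.sqrt(a / math.pi) for a in areas.values()]
    return mean(radii), pstdev(radii)

# Degenerate case: every segment is a single pixel.
single_pixel = [
    [1, 0, 2],
    [0, 3, 0],
]
tiny_scale, tiny_std = radius_stats(single_pixel)

# Segments of different sizes (areas 4, 1 and 2 pixels).
mixed = [
    [1, 1, 0, 2],
    [1, 1, 0, 0],
    [0, 0, 3, 3],
]
mixed_scale, mixed_std = radius_stats(mixed)
```

A one-pixel segment has an equivalent radius of sqrt(1/π) ≈ 0.56, and identically sized segments have zero spread. If the scale is estimated in a similar way, that would be consistent with a scale of 0.56 and a scale std of 0.0 when the supplied image is interpreted as many one-pixel segments.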

fengweimin-maker commented 2 years ago

It is impossible for all cells to have exactly the same size. Maybe I did not explain this clearly. I mean that our gray TIFF matches the coordinates in molecules_2.csv after registration.

I don't understand why the scale for our TIFF (the mask that separates signal from background) is so small. I can try running Baysor with a nuclei segmentation mask, but I also want to use only the signal-separation mask, because I want to compare the two.

Thank you!