rajewsky-lab / novosparc

BSD 3-Clause "New" or "Revised" License

How to generate the space information for the reference marker genes #38

Closed MinjieHu closed 3 years ago

MinjieHu commented 4 years ago

Hi, novosparc group,

Thanks for developing such a great package!

I am currently working with a new organism; we have already done the scRNA-seq and developed a whole-mount in situ hybridization protocol, so I want to apply the novoSpaRc analysis to this new organism. I have gone through the tutorial and documentation, but still can't figure out how to generate the spatial information for the reference genes from the in situ results. Could you give me some instructions or point me to where I can find this information?

Thanks for the help!

Minjie

nukappa commented 4 years ago

Hi Minjie,

I assume that you already have FISH images for a number of spatially informative markers, and that their overall shapes are quite similar. The more you have the better, but novoSpaRc can also deal with a small number to begin with. The first step would be to digitize those images. I don't know what they look like, but you can try to read them digitally and create a reference atlas from them. That means that, for every image, you need to annotate where the expression is ON and where it is OFF. Binary is okay for a start; later you can introduce gradients if you need them. The digitized shape will also be your target space, onto which you will map your cells.

To summarize:

  1. Choose one of the images that you think nicely represents the shape of the organism. That will be your target space. Read it in Python (or another tool) and create your target space, which is nothing more than a set of points with coordinates.
  2. Go through your images and digitize them one by one. The idea is to have the same number of locations as the shape in step (1), plus a column denoting whether the gene is expressed or not (0 or 1). You can do this crudely (as in the zebrafish embryo paper, Satija et al. 2015, with 64 bins for the whole embryo) or aim for higher resolution; this also depends on the images.
  3. If you want more precision, you can introduce expression gradients either manually (for example 0, 1, 2) or automatically when reading the image.
  4. After the steps above, you have constructed your atlas and can map the single-cell transcriptomes onto the target space with novoSpaRc. Try experimenting with different numbers of markers, locations and cells.
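
Steps 1 and 2 could be sketched roughly as follows. This is only an illustration with NumPy and synthetic images, not code from novoSpaRc: the helper names `bin_means` and `build_atlas`, the 8 x 8 grid, and both thresholds are assumptions made for the example. It also assumes the images are already registered to the same frame and scaled to the range 0..1.

```python
import numpy as np

def bin_means(image, n_bins):
    """Average a 2D image over an n_bins x n_bins grid of equal-sized blocks."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    bh, bw = h // n_bins, w // n_bins
    # Crop so the image divides evenly, then average within each block.
    cropped = image[:bh * n_bins, :bw * n_bins]
    return cropped.reshape(n_bins, bh, n_bins, bw).mean(axis=(1, 3))

def build_atlas(shape_mask, gene_images, n_bins=8, tissue_frac=0.5, on_threshold=0.5):
    """Return (locations, atlas, genes) for a binned target space.

    shape_mask  : 2D boolean array, True where the organism is (step 1).
    gene_images : dict of gene name -> 2D array with the same shape (step 2).
    """
    # Keep only bins that are mostly covered by tissue: these are the locations.
    tissue = bin_means(shape_mask, n_bins) >= tissue_frac
    rows, cols = np.nonzero(tissue)
    locations = np.column_stack([cols, rows]).astype(float)  # (x, y) per location
    genes = sorted(gene_images)
    # One 0/1 column per gene: ON if the mean signal in the bin passes the threshold.
    atlas = np.column_stack([
        (bin_means(gene_images[g], n_bins)[rows, cols] >= on_threshold).astype(int)
        for g in genes
    ])
    return locations, atlas, genes

# Synthetic stand-ins for real images: a round "embryo" and two marker patterns.
yy, xx = np.mgrid[0:64, 0:64]
shape_mask = (xx - 31.5) ** 2 + (yy - 31.5) ** 2 <= 30 ** 2
gene_images = {
    "geneA": shape_mask & (xx < 32),  # expressed in the left half
    "geneB": shape_mask & (yy < 32),  # expressed in the top half
}
locations, atlas, genes = build_atlas(shape_mask, gene_images, n_bins=8)
print(locations.shape, atlas.shape, genes)
```

`locations` then plays the role of the target space and `atlas` has one row per location and one column per marker. For step 3, the binary threshold could be replaced by several intensity levels, for example with `np.digitize`.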

Hope this helps, Nikos

MinjieHu commented 4 years ago

Hi Nikos,

Thanks for the quick response. It does help! However, as I have never dealt with images, I probably need more help.

  1. To get the target space, I guess I can use the reconstruct_tissue.py script. However, this script seems to depend entirely on a black-and-white image. What is the best practice for converting a FISH image into this kind of black-and-white image? Or is there a way to read the FISH image into a target space directly?
  2. Although most of the FISH images are quite similar, they still differ slightly. I am a bit confused about how to map these different FISH images onto the same target space. Is there an example of how to digitize images and map them onto the target space?

Minjie

MalteMederacke commented 4 years ago

Hi Minjie, I had (or have) similar troubles. I think novoSpaRc is based on the BDTNP Drosophila project; there was a website, but it has not been supported for a couple of months. Most of how they built their reference atlas is described in these two papers:

http://dx.doi.org/10.1016/j.cell.2008.01.053
http://genomebiology.biomedcentral.com/articles/10.1186/gb-2006-7-12-r123

This is a very sophisticated way to gather spatial information and requires quite some imaging and image analysis efforts.

I am currently trying to design a 3D digital model of my target and then manually transfer the spatial information I gain from ISH onto it, based on proportions. With that I basically reduce the shape variation. But if someone has a better idea, I am curious to know what it would be.
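
A proportion-based transfer of this kind could be sketched roughly like this in 2D. It is a crude illustration, not an established method: the helper names `to_unit_box` and `transfer_expression` are hypothetical, and the sketch assumes that rescaling each shape to its bounding box is enough to align specimens, which ignores rotation and any non-uniform shape variation.

```python
import numpy as np

def to_unit_box(points):
    """Rescale (n, 2) points so that their bounding box becomes [0, 1] x [0, 1]."""
    points = np.asarray(points, dtype=float)
    lo, hi = points.min(axis=0), points.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero for flat shapes
    return (points - lo) / span

def transfer_expression(target_xy, image_xy, image_on):
    """Give each target location the ON/OFF value of its nearest image point,
    after both point sets have been rescaled to the same unit box."""
    t = to_unit_box(target_xy)
    s = to_unit_box(image_xy)
    # Squared distances between every target location and every image point.
    d2 = ((t[:, None, :] - s[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    return np.asarray(image_on)[nearest].astype(int)

# Target space: a 10 x 10 grid of locations.
tx, ty = np.meshgrid(np.arange(10), np.arange(10))
target_xy = np.column_stack([tx.ravel(), ty.ravel()])

# One annotated ISH image with different proportions (30 x 20 points),
# with the gene ON in the first third along x.
ix, iy = np.meshgrid(np.arange(30), np.arange(20))
image_xy = np.column_stack([ix.ravel(), iy.ravel()])
image_on = image_xy[:, 0] < 10

on = transfer_expression(target_xy, image_xy, image_on)
print(on.reshape(10, 10))
```

In this toy case the first three columns of the target grid end up ON, so the "first third" domain is preserved even though the image and the target have different sizes.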

MinjieHu commented 4 years ago

Hi Malte,

Thanks so much for sharing this information. I will take a close look.

MalteMederacke commented 4 years ago

Hi,

I am wondering whether there is a way to build a reference atlas from ISH/FISH images and still allow it to be 'wrong' in a sense. My issue is that if I cover my target space with in situ information and the number of cells expressing these genes is not equal to the number of cells in my DGE, then I seemingly suppress cells that do not express these marker genes. That is, the number of bins/cells I tag as expression ON is greater than the number of cells that are actually positive in the DGE. I have strongly differentiable domains (similar to the gap genes of Drosophila), but some already further differentiated cells lie in between. If I cover my target with all these 'gap' genes, then these cells are not placed in a salt-and-pepper pattern as I would expect, but are more or less non-existent. Reducing alpha_linear to values like ~0.1 does not make a big difference. Is there a way to solve this?

nukappa commented 4 years ago

Hi @MinjieHu and @MalteMederacke

we're in the process of updating our tutorials and package to facilitate the construction of reference atlases, so more to come on this in the near future.

Regarding having FISH images and allowing the atlas to be "wrong", the different cell numbers shouldn't matter. For instance, if you have a target space (and a reference atlas) with, say, 10,000 locations and gene1 is expressed in 5,000 of them, but only 1,000 of the cells in your scRNA-seq dataset express this gene, novoSpaRc will still map them onto all 5,000 locations. The same is true the other way around: if the reference atlas for that gene had only 500 locations, then only those would be filled by the 1,000 cells.

The key behind this is that novoSpaRc can map one-to-many and many-to-one, so only the relative expression in the reference atlas is relevant, not the absolute numbers.
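
The one-to-many behaviour can be illustrated with a toy entropic optimal transport problem. The sketch below is not novoSpaRc's implementation: it uses a plain Sinkhorn iteration in NumPy on a single binary marker and leaves out the structural term that novoSpaRc combines with the atlas term. The sizes, the cost function and `epsilon` are assumptions chosen for the example.

```python
import numpy as np

n_cells, n_locations = 20, 10

# One binary marker: 4 of 20 cells express it, 5 of 10 locations are ON.
cell_expr = np.zeros(n_cells)
cell_expr[:4] = 1
atlas = np.zeros(n_locations)
atlas[:5] = 1

# Cost is 0 where cell and location agree on the marker, 1 where they disagree.
cost = np.abs(cell_expr[:, None] - atlas[None, :])

# Uniform marginals: every cell carries equal mass and every location
# receives equal mass, whatever the number of cells and locations.
p = np.full(n_cells, 1.0 / n_cells)
q = np.full(n_locations, 1.0 / n_locations)

# Plain Sinkhorn iterations for entropic optimal transport.
epsilon = 0.1
K = np.exp(-cost / epsilon)
u = np.ones(n_cells)
v = np.ones(n_locations)
for _ in range(200):
    u = p / (K @ v)
    v = q / (K.T @ u)
plan = u[:, None] * K * v[None, :]  # plan[i, j]: mass of cell i sent to location j

expressing = cell_expr == 1
on = atlas == 1
frac_on = plan[expressing][:, on].sum() / plan[expressing].sum()
print(f"share of expressing cells' mass on ON locations: {frac_on:.4f}")
print("mass per ON location from expressing cells:", plan[expressing][:, on].sum(axis=0))
```

Each expressing cell is spread over all ON locations rather than assigned to a single one, which is what one-to-many means here. In this toy setup the rest of the mass on the ON locations comes from non-expressing cells, because every location has to receive the same total mass.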

phantom323 commented 3 years ago

Hi, I have similar troubles. I would really like to know in detail how to construct a reference atlas similar to "dge.txt".

nukappa commented 3 years ago

Hi @phantom323 have a look at my comment above.