chanzuckerberg / cellxgene

An interactive explorer for single-cell transcriptomics data
https://chanzuckerberg.github.io/cellxgene/
MIT License

Visium data visualization: enabling histology image to be overlaid and co-zoomed [FEATURE REQUEST] #2056

Closed brianpenghe closed 1 year ago

brianpenghe commented 3 years ago

I have 10X Visium data for a spatial transcriptomics project. Although I'm able to use the spatial coordinates for the cellxgene embedding, I'm not able to put the image on it to co-zoom with the spots. Having the image beneath the spots can really help us to know where cells are on the tissue (edges? folded skin pieces? blood vessels? nerves?) The feature allowing an image to be co-zoomed with the embedded data points can also help us make highlights on the embedding (an arrow, a dotted circle of a group of cells). Hope it's not too big an ask.

Thanks

colinmegill commented 3 years ago

Hi @brianpenghe — thanks for filing this. We've discussed this over the course of development of the visualization platform and decided against rendering images underneath the scatterplot as it does add significant complexity to the system. There aren't, at present, any plans to revisit that functionality.

I will leave it to @ambrosejcarr re: whether we want to leave this issue open to collect views from others who might want to strongly advocate for this feature.

colinmegill commented 3 years ago

@brianpenghe I will pass on here that we have seen users who put spatial coords in the embedding use coloring by categorical metadata to highlight various structures.

ambrosejcarr commented 3 years ago

Thanks for the feature request @brianpenghe . @colinmegill - yes, let's leave this open to see if there is additional interest.

brianpenghe commented 3 years ago

> @brianpenghe I will pass on here that we have seen users who put spatial coords in the embedding use coloring by categorical metadata to highlight various structures.

Yes, that helps a little. But it would help a lot more if a high-resolution image could be there, because it has the most precise boundaries between histological structures.

sidlawrence commented 3 years ago

Dear cellxgene team, firstly, thanks for creating such a helpful analysis and collaboration tool. It is brilliant. I agree with @brianpenghe: this would be a really helpful feature for visualising data, especially between collaborating teams with overlapping tissues of interest. Thanks again for making cellxgene!

elo073 commented 3 years ago

Such a good idea! Would definitely speed things up and help sharing

vitkl commented 3 years ago

I also think this feature would be very useful, especially if it allowed plotting continuous adata.obs / adata.obsm columns such as the output of cell2location (https://github.com/BayraktarLab/cell2location)

Without the image it is quite hard to interpret spatial gene expression and cell locations for tissues that do not have stereotyped morphology.

Thanks for the great tool!

ambrosejcarr commented 3 years ago

I appreciate the kind words and interest from @brianpenghe @sidlawrence @vitkl and @elo073. This is a very interesting feature request!

My current mental model for where cellxgene fits into the spatial analysis workflow is after you've finished analyzing and featurizing spatial properties (e.g. you've encoded in obs which cells the blood vessels overlap, or the distance of each cell to the nearest blood vessel). To extract those features interactively from the image, we'd recommend a tool like napari. cc @sofroniewn

Extending the example, if you've marked which cells are on/adjacent to a blood vessel in obs, cellxgene can display that information the same way it displays cell type information. This workflow has the added benefit of encoding each spatial feature as a covariate which is available for any downstream statistical analyses you want to run on the data.
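As a rough illustration of the featurization described above, the sketch below computes two per-spot covariates from spatial coordinates: the distance to the nearest blood-vessel point and a near/far label. The vessel coordinates, the 50-pixel threshold, and the names `vessel_distance` and `near_vessel` are all hypothetical. In practice the resulting lists would be assigned to columns of `adata.obs`.

```python
import math

def nearest_distance(spot, landmarks):
    """Euclidean distance from one spot to the closest landmark point."""
    return min(math.dist(spot, lm) for lm in landmarks)

def featurize_spots(spots, landmarks, threshold):
    """Return per-spot distances and a categorical near/far label."""
    distances = [nearest_distance(s, landmarks) for s in spots]
    labels = ["near" if d <= threshold else "far" for d in distances]
    return distances, labels

# Hypothetical pixel coordinates of spots and of points traced along a vessel
spots = [(0.0, 0.0), (30.0, 40.0), (200.0, 0.0)]
vessel_points = [(0.0, 10.0), (60.0, 80.0)]

# The threshold is an arbitrary illustrative value, not a recommendation
vessel_distance, near_vessel = featurize_spots(spots, vessel_points, threshold=50.0)
```

Stored in obs, the continuous column can be used for coloring and the categorical one for selection, the same way cell type annotations are used.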

The really neat thing about featurizing the spatial properties is that cellxgene lets you rapidly review how the spatial covariate influences the gene expression by flipping back and forth between spatial x/y embeddings and UMAP and tSNE embeddings all stored in obsm.

Would anyone in this thread be willing to share some data that I could make public? I'd be interested in putting together a quick demo to clarify what's currently possible with cellxgene, and extend the conversation to identify what other features, in addition to the overlay image, would be useful.

vitkl commented 3 years ago

@brianpenghe suggested that I give you an h5ad with Visium data to play around with.

Here is a dataset (mouse brain) containing 2 sections, cell2location output as many continuous data columns in adata.obs, and clusters representing regions under 'region_cluster'. https://cell2location.cog.sanger.ac.uk/tutorial/mouse_brain_visium_results/LocationModelLinearDependentWMultiExperiment_59clusters_5396locations_12809genes/sp_with_clusters.h5ad

brianpenghe commented 3 years ago

@ambrosejcarr I've been able to use cellxgene's many useful functions. The only inconvenient part now is that I have to open an H&E + Visium spots image and put it next to the web browser to remind myself where specific structures (bones, nerves, etc.) should be on the spatial embedding...

vitkl commented 3 years ago

We are not trying to do advanced things like annotating the Visium data spots using the features in the histology image. We just want to see the gene expression, cell type locations, etc. in the context of the image.

colinmegill commented 3 years ago

@sidlawrence @vitkl @elo073 thanks for using cellxgene and for such helpful feedback!

ambrosejcarr commented 3 years ago

> @ambrosejcarr I've been able to use cellxgene's many useful functions. The only inconvenient part now is that I have to open an H&E + Visium spots image and put it next to the web browser to remind myself where specific structures (bones, nerves, etc.) should be on the spatial embedding...

@brianpenghe Got it. Thank you for the clarification. I will still put together a bit of a demo. Our documentation doesn't have good examples of use with spatial datasets. I appreciate you sharing the example data!

prete commented 3 years ago

[technical off topic]

@ambrosejcarr I've made a super naive attempt at this, is this something @brianpenghe would be interested in having? (don't mind the image not matching the embedding)

[Image: spatial]

I started with scanpy.read_visium, which gets the spatial image and spatial coords (Space Ranger output?) into the file, and used that as my sample input.

On the server side, I added a new endpoint to the API that basically does:

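    import io
    import matplotlib.pyplot as plt
    from flask import send_file

    # Body of the new endpoint handler; data_adaptor is the server's data adaptor.
    spatial = data_adaptor.data.uns["spatial"]
    library_id = list(spatial)[0]  # first library stored by read_visium
    response_image = io.BytesIO()
    plt.imsave(response_image, spatial[library_id]["images"]["hires"], format="png")
    response_image.seek(0)  # rewind so send_file reads from the start
    # Flask >= 2.0 renamed attachment_filename to download_name.
    return send_file(response_image, attachment_filename="spatial.png", mimetype="image/png")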

On the client side I mostly edited graph.js, adding a new fetch to the async properties

    spatialImage = await this.loadTextureFromUrl('/api/v0.2/spatial/image');

and a new draw function that creates the regl texture using the fetched image

    drawSpatialImage({
      projView: projView,
      spatialTexture: regl.texture({data: spatialImage})
    })

Finally, I draw two big triangles and use the texture to draw the image in the 3D space. I suck at WebGL so my vertex/fragment shaders are awful, but you get the idea.

If the embedding uses row/column pixel coordinates, then something like the scaling factor that converts pixel positions in the original full-resolution image to pixel positions of the embedded image could be used to fix the alignment? Or am I just talking crazy?
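The alignment idea in the previous paragraph can be sketched as follows. Space Ranger writes a `tissue_hires_scalef` value to `scalefactors_json.json`, and `scanpy.read_visium` stores it under `uns["spatial"][library_id]["scalefactors"]`; it maps full-resolution pixel positions to positions in the hires image. The helper name and all numbers below are hypothetical, and this is only a sketch of the idea, not what the prototype above does.

```python
def to_image_pixels(coords_fullres, scale):
    """Scale (x, y) full-resolution pixel positions into image pixel positions."""
    return [(x * scale, y * scale) for x, y in coords_fullres]

# Hypothetical stand-in for adata.uns["spatial"][library_id]["scalefactors"]
scalefactors = {"tissue_hires_scalef": 0.25}

# Hypothetical spot centers in full-resolution pixels
spots_fullres = [(4000.0, 8000.0), (1000.0, 2000.0)]

# Positions of the same spots in the coordinate system of the hires image
spots_hires = to_image_pixels(spots_fullres, scalefactors["tissue_hires_scalef"])
```

Whether an additional Y flip is needed depends on how the renderer orients its axes, since image rows grow downward.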

brianpenghe commented 3 years ago

Wow this is exactly what we need!!! Because 10X loupe browser can't display cell type signature scores (continuous score). Cellxgene would be the only way we can share our data with collaborators if this works!

Thank you very much!

colinmegill commented 3 years ago

You guys are amazing. Super cool work.

baohongz commented 3 years ago

Is this piece of code incorporated into cellxgene source code?

MaximilianLombardo commented 3 years ago

@baohongz

At the moment the (very cool) implementation from @prete is not incorporated into core cellxgene. We are chatting internally about how to balance this feature request with the current roadmap and expect to provide some updates on this thread/feature soon (towards the end of next week).

giovp commented 3 years ago

Hi all,

Jumping on this thread a bit late: it would indeed be super cool to have cellxgene visualize Visium data! You have probably solved these issues already, but re sc.read_visium: it does load Space Ranger output, but there are a couple of things to consider (re the results of https://github.com/chanzuckerberg/cellxgene/issues/2056#issuecomment-778169146)

Finally, apologies for shamelessly plugging a scanpy extension for spatial data that we just released: squidpy 🦑. We provide several tutorials and also an interactive visualization with the awesome napari: https://squidpy.readthedocs.io/en/latest/external_tutorials/tutorial_napari.html https://squidpy.readthedocs.io/en/latest/tutorials.html

baohongz commented 3 years ago

You can use cellxgene VIP to view spatial transcriptomics data.

Please follow this example to generate an h5ad file for your data:

https://github.com/interactivereport/cellxgene_VIP/blob/master/notebook/spatial_transcriptomics.ipynb

You could merge data from multiple slices together. After that, you can spin up an instance as you would for a regular h5ad file.

From the GUI, you can then overlay images on the spatial layout.

baohongz commented 2 years ago

In the latest cellxgene VIP,

[Image: Screen Shot 2021-12-12 at 3 13 18 PM]

Figure 3 | Advanced features of visualizing and processing of spatial transcriptomics data accessed by “Spatial transcriptomics” menu. (a) overlapped view of merged spatial embeddings colored by clusters and histological images of multiple samples. The image can be loaded or removed by clicking on “Get Spatial Image” or “Remove Image” button, respectively. Overlapping order, opacity of embedding and image can be adjusted to obtain the optimal view. (b) Cumulative selection option is chosen in “Global Setting” to allow selections of cells from non-adjacent regions or multiple samples as shown by the red and blue irregular shapes drawn by lasso selection tool. Please note that numbers of cells in the main window refer to the current selection while numbers of cells in VIP window denote accumulated selections by multiple lasso operations. Downstream analysis such as differential gene expression can be performed between the accumulated cell groups. (c) Embedding and image could be reviewed separately by selecting “No” to “Move embedding and image together”. This way, subtle histological features obstructed in overlaid setting can be seen easily. (d) “Layout” menu provides a flexible way to re-arrange a slide by moving, rotating, flipping on X or Y coordinates as the original setting of the slide might not be the desired one, e.g., the third slide on the top row is rotated by 90 degrees and flipped on Y. A slide is activated by clicking on it or removed from the layout by clicking “X” on the top. In the end, the layout can be saved and used to create the custom merged h5ad file based on the on-screen design.

Visualization of spatial transcriptomics data. For pre-processing, we used the Scanpy read_visium function to read the data of each sample from the 10x Genomics Space Ranger output directory. Note that, to get the right orientation in the user interface (the Visium capture image border marker on the bottom right should be an empty circle), the spatial Y coordinates need to be negative to fit the coordinate design of cellxgene. As a result, in a Python environment such as a Jupyter notebook, the H&E image and the coordinates do NOT overlap when plotted together, whereas they do once loaded into cellxgene VIP.

We also added the capability to merge multiple spatial transcriptomics samples into one spatial embedding in a grid format. Our demo example is a 2 x 2 image merged from 4 spatial transcriptomics samples. We apply the AnnData.concatenate function to merge the 4 AnnData objects into a single AnnData object, then add back the individual spatial coordinates, stored in AnnData.obsm under names starting with “Xspatial”, and the original spatial images. After merging, we perform basic QC, apply normalization and log1p transformation, select highly variable genes (HVGs), run PCA dimension reduction, generate a UMAP layout and cluster spots by Leiden [39]. At the same time, the H&E images were stitched with the PIL [40] Python module and then saved back to AnnData.uns['spatial'] as the merged spatial embedding. In summary, all these steps are performed by the command below.

python3 st_sample_merge.py -i <inputfile of data folders> -o <outputfile> [-d <grid dimension>] [-s <grid cell size>]

Here, each line in the input file holds one directory containing the Space Ranger outputs, including the aligned images, of one Visium slide. Both the grid dimension and the cell size parameters are optional. The grid dimension is given in row-by-column format (e.g. 2x3, without spaces) and is calculated by default to fit the slides in a rectangular grid. The default grid cell size is 700, which is big enough to contain the 600-pixel low-resolution image that is used to minimize the size of the merged image.

Spatial coordinates are visualized by the cellxgene embedding layout function, which is normally used for PCA, tSNE or UMAP. To accommodate each individual sample in the merged h5ad file, every per-sample spatial layout must contain the same number of spots as the merged data set, so any spots that do not belong to that sample are assigned (0,0) spatial coordinates; these are displayed as a single dot in the top left corner of the spatial layout with minimal visibility. Users can apply the cell subsetting function of cellxgene to exclude spots that do not belong to the sample of interest before plotting or performing analysis.

VIP provides a function to retrieve the H&E images and overlay them with the spatial coordinates. Users can choose whether the image or the embedding is on the top layer, adjust transparency, zoom in and out with the image and embedding together or separately, and realign the image and embedding. Further, to help users compare and analyze multiple samples together, they can design the desired orientation of samples by arranging, flipping and rotating images of interest in an N x M grid. Once the design is done, the user can download the layout file in JSON format. This JSON file can be supplied to an analyst, who generates a new h5ad file with the desired layout of samples by calling the Python script provided in the bin directory of the GitHub repository.

python3 st_h5ad_image_operation.py -j design.json -i merge.h5ad -o final.h5ad

Once the new h5ad file is created, the analyst can spin up another cellxgene VIP instance of the finalized data set for the user to explore. This workflow encourages close collaboration between experimentalists and analysts, and it prevents the proliferation of unintended large h5ad files that could occur if end users were able to generate such files on the fly through the user interface.
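To make the coordinate handling described in this comment more concrete, here is a minimal sketch of two of the steps: shifting each sample into its own cell of a grid, and negating Y so that image-style coordinates display upright. It is not the code of st_sample_merge.py. The function names, the row-major placement, and the sample coordinates are assumptions; only the cell size of 700 comes from the text above.

```python
def flip_y(coords):
    """Negate Y so image-style coordinates (Y grows downward) plot upright."""
    return [(x, -y) for x, y in coords]

def place_in_grid(coords, index, n_cols, cell_size=700):
    """Shift one sample's (x, y) coordinates into grid cell `index` (row-major)."""
    row, col = divmod(index, n_cols)
    return [(x + col * cell_size, y + row * cell_size) for x, y in coords]

# Hypothetical low-resolution pixel coordinates: four samples, two spots each
samples = [[(10, 20), (300, 400)] for _ in range(4)]

# Lay the four samples out as a 2 x 2 grid, then flip Y for display
merged = []
for i, coords in enumerate(samples):
    shifted = place_in_grid(coords, i, n_cols=2)
    merged.extend(flip_y(shifted))
```

Shifting before flipping keeps the first sample in the top left cell after the Y values become negative.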

MaximilianLombardo commented 2 years ago

Hi there - looks like a great feature for VIP - thanks for sharing!

Cellxgene is also working on a prototype to support spatial (Visium) data and we plan to incorporate the feature in a future release. We'd love to get feedback on the prototype from anyone who is interested in providing it. I'll update this thread once the feature has been released.

MaximilianLombardo commented 2 years ago

Hey all and happy new year!

Just following up on my last comment in this thread.

Here is an early prototype that is specific to handling Visium data, and we expect there to be some bugs. Feel free to provide feedback and any other comments!

giovp commented 2 years ago

Hi @MaximilianLombardo ,

really cool prototype! I'd have a couple of small comments:

Not sure it could be of help but in case you were looking for additional features, those could be some. I'll keep you posted once we have that ready!

MaximilianLombardo commented 2 years ago

This is awesome - thanks so much for this feedback - I've forwarded to some of my colleagues! cc @signechambers1 @colinmegill

cakirb commented 2 years ago

Hello everyone,

We created a guide with an example notebook for the people who want to make their Visium data compatible with visium-beta version of cellxgene: https://cellgeni.readthedocs.io/en/latest/visualisations.html#visium-data

Hope you find it useful!

BAevermann commented 2 years ago

Thanks Batu! I will take a look!

brianpenghe commented 1 year ago

I've made a tutorial for visualizing Visium with cellxgene. Thank you all!