HumanCellAtlas / metadata-schema

This repo is for the metadata schemas associated with the HCA
Apache License 2.0

Ensure metadata is in place for Spatial Transcriptomics #758

Open hewgreen opened 5 years ago

hewgreen commented 5 years ago

We are certain to receive a new data type: Spatial Transcriptomics (the specific technique), which will be carried out by Anna Wilbrey-Clark (Sanger). Some data is available now that we can use to scope out the metadata needed to capture this experiment type.

Note: data is coming from the CZI award (Joakim Lundeberg) and eventually from the MRC award.

Understanding metadata requirements:

Add metadata:

hewgreen commented 5 years ago

Requirements:

  1. Attach image to tissue sample (specimen from organism)

Current infrastructure: Supplementary files (json) are not linked (not in links.json) and are therefore included in every bundle in that project. If there are multiple samples per submission we would need to be able to link files directly to the samples.
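
To make the problem concrete, here is a minimal toy sketch of the behaviour described above. It is not the actual ingest or bundling code: the function `assemble_bundles`, the file names, and the flat file-to-sample mapping standing in for links.json are all hypothetical, chosen only to show why an unlinked image ends up in every bundle.

```python
def assemble_bundles(files, links):
    """Toy model of bundling (hypothetical, not the real HCA implementation).

    files: list of file names in the project.
    links: dict mapping a file name to the sample it is linked to
           (a simplified stand-in for links.json).
    Returns a dict mapping each sample to the files in its bundle.
    """
    samples = sorted(set(links.values()))
    # Files with no link cannot be assigned to a sample,
    # so they are copied into every bundle in the project.
    unlinked = [f for f in files if f not in links]
    bundles = {}
    for sample in samples:
        linked = [f for f in files if links.get(f) == sample]
        bundles[sample] = linked + unlinked
    return bundles


# Hypothetical example: two samples, one image that belongs to sample_A only.
files = ["sample_A_R1.fastq.gz", "sample_B_R1.fastq.gz", "sample_A_image.jpg"]
links = {
    "sample_A_R1.fastq.gz": "sample_A",
    "sample_B_R1.fastq.gz": "sample_B",
}

# The image is unlinked, so it appears in both bundles.
bundles = assemble_bundles(files, links)

# Linking the image directly to its sample keeps it out of sample_B's bundle.
links_with_image = {**links, "sample_A_image.jpg": "sample_A"}
bundles_linked = assemble_bundles(files, links_with_image)
```

In this toy model, the image shows up in the bundle for sample_B until it is linked to sample_A, which is the behaviour we would need to avoid when a submission contains multiple imaged samples.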

However, if the image file is considered to be a data file, one analysis process could give us the image file and the sequencing data together, which would allow us to assign these images to specific bundles. We need to understand what a bundle will look like, and therefore how files are demuxed.

Q: Are files typically sent to you demuxed per spot on chip or per chip?
Q: Is genome alignment done on the whole library prep prior to mapping to spot on the chip, or are reads split by barcode and then aligned?
Q: How many samples are typically imaged per project and per submission?

  2. Capture histological information from the image.

Q: What histological information is captured from the image?
Q: How is the image stained before imaging?

  3. Overlay the cell tissue image and the gene expression data in a later step

Q: How is the image grid referenced?
Q: What is the best term for a spot on the chip? (Potential synonyms are: capture probe, barcode, probe sequence, square, etc.)
Q: What format is expected to map this grid reference (location on image) to the spot on the chip sequence/barcode that is sequenced?
Q: Are the chips custom made, or can we just capture product information about the commercial chip used by a lab?

  4. Capture the permeabilisation, library prep and dissociation method (on the chip)

Q: Are there any aspects of the spatial transcriptomics method for permeabilisation, library prep and dissociation that vary each time you perform the process?

hewgreen commented 5 years ago

Handing this over now. Research done so far:

Response from Michaela Asp in Joakim's group Spatial transcriptomics metadata.docx: https://drive.google.com/drive/folders/0B-rEFaPQ8v3GOElwSjNDcEs2UVU

and some example metadata Example_ST_metadata.xlsx: https://drive.google.com/drive/folders/0B-rEFaPQ8v3GOElwSjNDcEs2UVU

Response to some questions from Anna Wilbrey-Clark: https://docs.google.com/document/d/1B8OarYdOem1heUWTE_o2BOoP7M2B6H586QFpmBRMPXY/edit

lauraclarke commented 5 years ago

@malloryfreeberg @zperova is this still in hand?

zperova commented 5 years ago

@lauraclarke I still plan to do the presentation but the rest is waiting for Enrique. We can make a new ticket for that, or leave this for Enrique as well.

zperova commented 5 years ago

An update: I am working on it now.