Sage-Bionetworks / sysbioDCCjsonschemas

SysBio DCC JSON schemas

New template: multiome #139

Closed by pitviper6 2 years ago

pitviper6 commented 2 years ago

Requested by: Eitan Kaplan, Allen Institute
For: SEA-AD study (CDCP)

Description: They are producing scRNAseq and scATACseq data. They are also running both assays on the same cell using the 10x Multiome method: https://www.10xgenomics.com/products/single-cell-multiome-atac-plus-gene-expression

So far, I have sent them the list of keys we have in the scRNAseq and scATACseq templates with a request to indicate which we should include, and if there are any we should add.

We'll also need to add multiome to the assay key. I'm not sure we should use "10x multiome" as the value: it seems to me that 10x is only one method for multiome-ing, so maybe we keep the more general label and record 10x as the libraryPrep.

List of values I sent them to evaluate: assay, specimenID, platform, RIN, rnaBatch, libraryBatch, sequencingBatch, libraryPrep, libraryPreparationMethod, libraryType, sampleBarcode, isStranded, readStrandOrigin, readLength, runType, totalReads, validBarcodeReads, numberCells, medianGenes, medianUMIs, assay, referenceSet, transposaseBatch, libraryBatch, pcrCycles, sequencingBatch, meanCoverage, meanGCContent, chromiumSampleIndex, Q30BasesinRead1, Q30BasesinRead2

avanlinden commented 2 years ago

@pitviper6 I'm interested to hear what Eitan says about the metadata keys. I think we should maybe game this out more before we make a template -- is "multiome" really an assay, or is "assay" a multi-value annotation in this case that includes "scRNAseq, scATACseq"? What happens if we make a "multiome" template but then next month someone comes with data that's a combination of two other sequencing assays?
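
To make the two options concrete, here is a minimal sketch in Python. It is not taken from the repository's schemas: the vocabulary and the function names are hypothetical, and the real assay enum is defined elsewhere.

```python
# Hypothetical controlled vocabulary; the real assay enum lives in the schemas.
ALLOWED_ASSAYS = {"scRNAseq", "scATACseq"}


def validate_single(assay, allowed):
    """Option A: assay is one string, so every new combination needs a new enum value."""
    return isinstance(assay, str) and assay in allowed


def validate_multi(assays, allowed):
    """Option B: assay is a non-empty list of unique values from the existing enum."""
    return (
        isinstance(assays, list)
        and len(assays) > 0
        and len(set(assays)) == len(assays)
        and all(a in allowed for a in assays)
    )


# Option A rejects "multiome" until the vocabulary is extended.
print(validate_single("multiome", ALLOWED_ASSAYS))                 # False
print(validate_single("multiome", ALLOWED_ASSAYS | {"multiome"}))  # True
# Option B describes the same data without adding a new term.
print(validate_multi(["scRNAseq", "scATACseq"], ALLOWED_ASSAYS))   # True
```

Under option B, a future combination of two other sequencing assays would need no vocabulary change, only two existing values in the list.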

I feel like "multiome" might be more of a marketing term than a useful scientific distinction. We still need some metadata additions but I'm not sure exactly what form they should take.

pitviper6 commented 2 years ago

@avanlinden @tmzintel Here's the description of the experiment from the data contributor:

10xMultiome processing: protocols.io forthcoming
10x guidelines: Chromium Next GEM Single Cell Multiome ATAC + Gene Expression User Guide Rev A 201120

10xMultiome chip loading was performed as per 10x guidelines. Nuclei concentration was calculated either manually or using the NC3000 NucleoCounter. For most loads, 16,000 total nuclei were loaded into each port, typically 2-4 ports per chip, with an expected capture count of ~10,000. As per manufacturer protocol, total volume of nuclei was brought up to 5.0ul with 1x Nuclei Buffer. Low retention tips were used for pipetting sample. Care was taken to triturate the nuclei suspension 10 times using a P200 with a low-retention tip. Transposition reaction was performed at 37C for 1hr then immediately loaded onto the chip according to 10x guidelines. Care was taken throughout the loading to not introduce air bubbles. Samples were stored after quenching at –80C until Pre-PCR.

10xMultiome Pre-PCR was done according to 10x guidelines. Low retention tips were used throughout the process whenever pipetting sample. A standard 7 cycles for PCR was used on all samples regardless of nuclei load count. Typically, samples were processed through cDNA amplification (RNASeq) and ATAC Library the same day as Pre-PCR.

10xMultiome RNASeq cDNA amplification was done according to 10x guidelines. Most samples were amplified with 8 cycles of PCR, resulting in an average of 16ng/ul (for an average input of 160ng into library construction). Low retention tips were used for sample pipetting before PCR. Samples were quantitated using Picogreen Assay. Samples were visualized using Agilent’s Fragment Analyzer and were passed if product above 400bp was detected (desired range 400-6000bp). Samples were failed if no product was detected, or if product was degraded and did not proceed to library construction or sequencing. Samples were stored at –20C until library construction.

10xMultiome RNASeq library construction was done according to 10x guidelines, using Dual Index plate TT set A. 10ul of unnormalized cDNA was used as input into the library reaction and PCR cycles were determined based on this input. Generally, 10-14 cycles were used for library construction and final sample volume was 35ul. Samples were quantitated using Picogreen Assay. Samples were visualized and sized using Agilent’s Fragment Analyzer and were passed if product size was between 400-550bp and concentration was above 10nM. Generally, samples yielded 500-1500ng, resulting in 60-300nM concentration (140nM average) and were on average 470bp. Samples were failed and not sequenced if no product was seen, if sizing was not in desired range, or if concentration was below 10nM. Samples were stored at –20C until normalization and pooling for sequencing.

10xMultiome ATAC library construction was done according to 10x guidelines, using Single Index Plate N Set A. Following 10x guidelines, 40ul of sample was used as input into the library reaction and PCR cycles were determined based on expected Nuclei recovery. Generally, 7 cycles were used for library construction and final sample volume was 20ul. Samples were quantitated using Picogreen Assay. Samples were visualized using Agilent’s Fragment Analyzer and were passed if morphology of nucleosome free, mononucleosome, dinucleosome, and multinucleated peaks were as expected according to 10x guidelines, and concentration was above 10nM. Generally, samples yielded 1000-2000ng, resulting in 80-150nM concentration (average 111nM). Samples were failed and not sequenced if no product was seen, if peak morphology was aberrant, or if concentration was below 10nM.

10xMultiome RNASeq Sequencing was performed using Illumina’s NovaSeqS4_v1.5 instrument and chemistry. Library samples were normalized individually to 10nM, then pooled to a target of 120,000 reads per nucleus (generally 10-12 libraries depending on cells/library). Sequencing was performed at either Northwest Genomics Center at University of Washington or at SeqMatic. Fastq files were received and aligned using ARC2.0. Libraries were assessed for number of nuclei detected, median gene detection for the library, and % doublets. Libraries that had significantly fewer nuclei detected than expected or had low median gene detection (<2000) were failed.

10xMultiome ATAC Sequencing was performed using NovaSeqS4_v1.5 instrument and chemistry. Library samples were normalized individually to 10nM, then pooled to a target of 85,000-120,000 reads per nucleus (generally 10-16 libraries depending on cells/library). Sequencing was performed at either Northwest Genomics Center at University of Washington or at SeqMatic. Fastq files were received and aligned using ARC2.0. Libraries were assessed for number of nuclei detected and TSA enrichment.
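
The numeric pass/fail criteria quoted above for the RNASeq libraries could be written as a small check, which may help when deciding which QC values deserve their own metadata keys. This is only a sketch: the function and parameter names are hypothetical, `None` is used here to stand for "no product seen", and the unquantified criteria (peak morphology, "significantly fewer nuclei than expected") are left out.

```python
def rna_library_passes(size_bp, conc_nm):
    """Library construction QC: product size 400-550 bp and concentration above 10 nM."""
    if size_bp is None or conc_nm is None:  # assumption: None means no product was seen
        return False
    return 400 <= size_bp <= 550 and conc_nm > 10


def sequenced_library_passes(median_genes):
    """Post-sequencing QC: median gene detection below 2000 fails the library."""
    return median_genes >= 2000


# Average values reported in the description: 470 bp at 140 nM.
print(rna_library_passes(470, 140))    # True
print(rna_library_passes(470, 8))      # False, concentration too low
print(sequenced_library_passes(1500))  # False
```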

pitviper6 commented 2 years ago

Instead of creating a new template, we are having the data contributor use the ATACseq template and add the columns he needs from the scRNAseq template.
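
As a rough sketch of what that amounts to: take the ATACseq columns as the base and append the scRNAseq columns that are needed and not already present. The key names come from the list earlier in this issue, but splitting that list into two templates at the second `assay` is an assumption, the helper name is hypothetical, and the columns picked in the example are placeholders, not the contributor's actual selection.

```python
# Key names copied from this issue; which template each belongs to is an assumption.
SCRNASEQ_KEYS = (
    "assay specimenID platform RIN rnaBatch libraryBatch sequencingBatch "
    "libraryPrep libraryPreparationMethod libraryType sampleBarcode isStranded "
    "readStrandOrigin readLength runType totalReads validBarcodeReads "
    "numberCells medianGenes medianUMIs"
).split()
ATACSEQ_KEYS = (
    "assay referenceSet transposaseBatch libraryBatch pcrCycles sequencingBatch "
    "meanCoverage meanGCContent chromiumSampleIndex Q30BasesinRead1 Q30BasesinRead2"
).split()


def extend_template(base_keys, extra_keys, needed):
    """Return base_keys plus the needed extra keys, order preserved, no duplicates."""
    unknown = [k for k in needed if k not in extra_keys]
    if unknown:
        raise KeyError(f"not in the source template: {unknown}")
    wanted = [k for k in extra_keys if k in needed]
    # dict.fromkeys keeps the first occurrence of each key and preserves order.
    return list(dict.fromkeys(list(base_keys) + wanted))


# Placeholder selection, for illustration only.
columns = extend_template(ATACSEQ_KEYS, SCRNASEQ_KEYS, {"rnaBatch", "medianGenes", "medianUMIs"})
print(columns)
```

Keys shared by both lists, such as `libraryBatch` and `sequencingBatch`, appear only once in the result.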