Sage-Bionetworks / sysbioDCCjsonschemas

SysBio DCC JSON schemas

update ATACSeq template to include scATACSeq values #137

Closed avanlinden closed 2 years ago

avanlinden commented 2 years ago

Single-cell ATAC-seq data are sparse and noisy

Article on scATACSeq methods: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-02075-3

Existing template (ATACSeq): https://www.synapse.org/#!Synapse:syn22024364

Requested template: scATACSeq

Current ATACseq assay metadata keys:

specimenID | libraryID | assay | platform | referenceSet | transposaseBatch | libraryBatch | libraryPreparationMethod | pcrCycles | sequencingBatch | isStranded | readStrandOrigin | readLength | runType | meanCoverage | meanGCContent
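For illustration only, here is a minimal Python sketch of how a metadata row could be checked against the key list above. The function name `missing_keys` and the example row are hypothetical and are not part of this repository or of the Synapse template tooling.

```python
# Keys of the current bulk ATACseq assay template, copied from the list above.
ATACSEQ_KEYS = [
    "specimenID", "libraryID", "assay", "platform", "referenceSet",
    "transposaseBatch", "libraryBatch", "libraryPreparationMethod",
    "pcrCycles", "sequencingBatch", "isStranded", "readStrandOrigin",
    "readLength", "runType", "meanCoverage", "meanGCContent",
]


def missing_keys(row: dict) -> list:
    """Return the template keys absent from a metadata row, in template order."""
    return [key for key in ATACSEQ_KEYS if key not in row]


# Hypothetical example row with only a few keys filled in.
example_row = {"specimenID": "S1", "libraryID": "L1", "assay": "ATACSeq"}
print(missing_keys(example_row))
```

Any single-cell keys agreed on later in the thread would simply be appended to the list in this sketch.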

Keys suggested by data contributors:

Q30 bases in read 1
Q30 bases in read 2
barcode
sample index

avanlinden commented 2 years ago

Moved this issue here since it concerns metadata template updates for SysBio projects.

avanlinden commented 2 years ago

@pitviper6 Did they provide a definition for "sample index"?

pitviper6 commented 2 years ago

@avanlinden This? https://support.10xgenomics.com/single-cell-atac/sequencing/doc/specifications-sample-index-sets-for-single-cell-atac

avanlinden commented 2 years ago

@pitviper6 Oh interesting -- so basically you need the oligos in this 10x mapping file to link reads to the sample they came from. I wonder if there is something analogous used in non-10x brand assays (not sure if other scATACseq will be done with 10x kits). Also, how does the sample index differ from the barcode?

avanlinden commented 2 years ago

@tmzintel We have a contributor who is providing single-cell ATACseq data and we are discussing how to update our current bulk ATACseq assay metadata template to accommodate the single cell data. Specifically, Juliane has been asking the data contributor about what additional terms they think are necessary to capture. If you have any thoughts on useful single-cell metadata we would love to hear them!

danlu1 commented 2 years ago

I don't have much to contribute, but do we need to add a version key? It seems that different versions use different sample indexes.

pitviper6 commented 2 years ago

@danlu1 Please add the following keys to the current ATACseq template (https://www.synapse.org/#!Synapse:syn20768526):

avanlinden commented 2 years ago

@pitviper6 I have a couple of thoughts on these terms --

pitviper6 commented 2 years ago

@avanlinden @danlu1

  1. Agreed, let's use the existing sampleBarcode for this template
  2. Agreed, let's use camel case
  3. I'd rather err on the side of being general. It helps discoverability, and it should be evident from the methods that it's chromium
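
As a rough sketch of the camel case convention in point 2, a contributor-suggested label could be converted like this. The helper `to_camel_case` is hypothetical, and the outputs are illustrations of the convention only, not key names adopted in this thread.

```python
import re


def to_camel_case(label: str) -> str:
    """Convert a free-text label to camel case (illustrative helper only)."""
    # Split on anything that is not a letter or digit.
    words = re.findall(r"[A-Za-z0-9]+", label)
    if not words:
        return ""
    first, rest = words[0], words[1:]
    # Lowercase the first character of the first word, capitalize the rest.
    head = first[0].lower() + first[1:]
    tail = "".join(word[0].upper() + word[1:] for word in rest)
    return head + tail


# Labels taken from the contributor suggestions earlier in the thread.
for label in ["Q30 bases in read 1", "sample index"]:
    print(label, "->", to_camel_case(label))
```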

avanlinden commented 2 years ago

@pitviper6 all sounds good to me!

pitviper6 commented 2 years ago

@danlu1 Here are the revised keys for the ATACseq template. I did some more research and decided to go with chromiumSampleIndex because I couldn't find a good definition for just sampleIndex, and all the publications seem to tie it to chromium.

Please add the following keys to the current ATACseq template (https://www.synapse.org/#!Synapse:syn20768526):

danlu1 commented 2 years ago

Done.