EBISPOT / efo

GitHub repo for the Experimental Factor Ontology (EFO)
https://www.ebi.ac.uk/efo/

New term request for pseudo-bulk aggregation of sc(ATAC)-seq data #2034

Closed dagarfield closed 1 year ago

dagarfield commented 1 year ago

Here be the info:

Preferred term label

Pseudo-bulk aggregation of single-cell ATAC-seq data

Synonyms

pseudo-bulk ATAC-seq

Textual definition

This involves combining (binary or quantitative) single-cell ATAC-seq data from multiple cells within the same biological sample and/or cell types, depending on the intended downstream values. Aggregation should be done within pre-defined regions of the genome (e.g. called peaks or tiles across the genome). The aggregated profiles can then be used in the same manner as bulk ATAC-seq or DNase data.

Suggested parent term

EFO:0030023

Attribution

0000-0002-8790-797X
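
To make the textual definition above concrete, here is a minimal sketch of the aggregation step in plain Python. It assumes the per-cell counts have already been computed over a shared, pre-defined set of regions (peaks or tiles). The function name `pseudo_bulk`, the dictionary-based input layout, and the toy cell and group labels are all hypothetical choices for illustration, not part of any EFO resource or scATAC-seq tool.

```python
from collections import defaultdict


def pseudo_bulk(counts, groups, binarize=False):
    """Sum per-cell region counts within each group (hypothetical helper).

    counts:   dict mapping cell ID -> list of counts, one per pre-defined
              region, with the same regions in the same order for every cell
    groups:   dict mapping cell ID -> group label, e.g. (sample, cell_type)
    binarize: if True, treat any nonzero count as 1 before summing
    """
    if not counts:
        return {}
    n_regions = len(next(iter(counts.values())))
    totals = defaultdict(lambda: [0] * n_regions)
    for cell, row in counts.items():
        if len(row) != n_regions:
            raise ValueError(f"cell {cell!r} has {len(row)} regions, expected {n_regions}")
        profile = totals[groups[cell]]
        for i, value in enumerate(row):
            if binarize:
                value = 1 if value > 0 else 0
            profile[i] += value
    return dict(totals)


# Toy data (made up): three cells, three regions, two cell types in one sample.
cells = {
    "cell_a": [2, 0, 1],
    "cell_b": [1, 0, 0],
    "cell_c": [0, 3, 0],
}
groups = {
    "cell_a": ("sample1", "T cell"),
    "cell_b": ("sample1", "T cell"),
    "cell_c": ("sample1", "B cell"),
}

quantitative = pseudo_bulk(cells, groups)
binary = pseudo_bulk(cells, groups, binarize=True)
print(quantitative[("sample1", "T cell")])  # [3, 0, 1]
print(binary[("sample1", "T cell")])        # [2, 0, 1]
```

Grouping by a (sample, cell type) key covers the "same biological sample and/or cell types" wording; a key with only one of the two would aggregate at that coarser level instead.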

ghost commented 1 year ago

@dagarfield, thank you for the new term.

Do you have a reference/resource (doi, PMID, etc) that supports the text definition?

Could you also confirm that the appropriate parent term is EFO:0030023 'processed matrix generation' and not EFO:0030053 'pseudo-bulk aggregation of single-cell expression data'?

dagarfield commented 1 year ago

Hi! I had copied the EFO ID from another request entry, but I agree that EFO:0030053 seems to fit better. I do not have a reference... but I welcome any edits :)

ghost commented 1 year ago

@dagarfield, thank you for clarifying.

1 - Can you confirm that the following rewording of the text definition is accurate?

"A processed matrix generation method that involves combining (binary or quantitative) single-cell ATAC-seq data from multiple cells within the same biological sample and/or cell types, depending on the intended downstream values. Ideally, aggregation is done within predefined regions of the genome, e.g., using called peaks or tiles across the genome. The aggregated profiles can then be used in the same manner as bulk ATAC-seq or DNase-seq data."

2 - I think this concept was mentioned here: https://doi.org/10.1038/s41467-021-26530-2 Can you confirm? If so, I will add it as a reference.

3 - @zoependlington, would it be appropriate to add the axiom 'has_participant some scATAC-seq'?

ghost commented 1 year ago

@dagarfield, following up on this. Can you kindly provide a reference or use case for this term so I can validate the definition?

dagarfield commented 1 year ago

Hi! Use case is easy -- we generate these sorts of datasets routinely as part of computational modeling of single-cell ATAC-seq data. The reference you provided works, but we could also cite our 2018 paper, which, to my knowledge, is the first applied case of pseudo-bulk scATAC-seq data to facilitate comparisons between traditional DNase/ATAC data and single-cell studies: https://pubmed.ncbi.nlm.nih.gov/29539636/

ghost commented 1 year ago

Thank you for the feedback, @dagarfield. The following will be added to EFO. Kindly advise if additional edits are required.

Preferred term label

Pseudo-bulk aggregation of single-cell ATAC-seq data

Synonyms

pseudo-bulk ATAC-seq (related, https://orcid.org/0000-0002-8790-797X)

pseudo-bulk scATAC-seq data (exact, https://orcid.org/0000-0002-8790-797X)

Textual definition

A processed matrix generation method that involves combining (binary or quantitative) single-cell ATAC-seq data from multiple cells within the same biological sample and/or cell types, depending on the intended downstream values. Ideally, aggregation is done within predefined regions of the genome, e.g., using called peaks or tiles across the genome. The aggregated profiles can then be used in the same manner as bulk ATAC-seq or DNase-seq data. (https://orcid.org/0000-0002-8790-797X, doi:10.1038/s41467-021-26530-2)

Suggested parent term

EFO:0030053 'pseudo-bulk aggregation of single-cell expression data'

Comment

An example of an applied case of pseudo-bulk scATAC-seq data to facilitate comparisons between traditional DNase/ATAC data and single-cell studies: https://pubmed.ncbi.nlm.nih.gov/29539636/

Attribution

https://orcid.org/0000-0002-8790-797X

ghost commented 1 year ago

@dagarfield, the following has been added to EFO and should be available in the next release:

EFO:0700017 'pseudo-bulk aggregation of single-cell ATAC-seq data'

Please reach out if any edits are required. Thank you again for the contribution.