Sage-Bionetworks / synapseAnnotations

Sage Bionetworks derived standards for annotating content in Synapse.
MIT License

What dataType to go with assay=ChIP-seq? #244

Closed kdaily closed 6 years ago

kdaily commented 7 years ago

We looked at ENCODE - they have what we would call assayTarget = histone and assayTarget = transcription factor. We currently have assayTarget = H3K9blah. The question remaining is which is more useful?

kdaily commented 7 years ago

Of note in PEC we currently have histone and transcription factor targets (CTCF).

kdaily commented 7 years ago

@amapeters and @sgosline please weigh in!

sgosline commented 7 years ago

Wait, is this a question about assayTarget or dataType?

sgosline commented 7 years ago

I think the specific histone mark is helpful; K27me3 and K4me3 are diametrically opposed to one another. But I don't have any of this data in my communities, so I'm arguing purely from a scientific standpoint.

sgosline commented 7 years ago

If we're talking about dataType, then I think chromatinActivity is appropriate.

amapeters commented 7 years ago

We use assayTarget to specify which histone mark, TF, or input

amapeters commented 7 years ago

If we use assayTarget = histone, we need an additional level to specify which mark. I prefer it the way it is now.

sgosline commented 7 years ago

I agree with @amapeters

kdaily commented 7 years ago

@sgosline the question @allaway raised is that chromatinActivity is not an appropriate value for ChIP-Seq applied to transcription factors, for example.

sgosline commented 7 years ago

I disagree - regardless of how you believe TFs to be altering the conformation of the chromatin alongside histones and molecules like CTCF (which we once called a TF), they are binding to chromatin and altering the activity around them.

amapeters commented 7 years ago

Is 'chromatinState' more accurate? Or do we just call it 'epigenomics'?

sgosline commented 7 years ago

chromatinState seems so similar it's not worth the change, though I don't mind it if it makes things easier.

Epigenomics was still a somewhat contested term last time I checked (I was not allowed to use it in the title of my 2015 paper, for example), and it can refer to any non-transcriptional regulatory mechanism such as NMD, miRNA regulation, etc.

amapeters commented 7 years ago

chromatinState would also be appropriate for ATACseq (i.e., probing open chromatin)

sgosline commented 7 years ago

Sure!

kdaily commented 7 years ago

But not all TFs are chromatin-associated the way CTCF is, are they?

sgosline commented 7 years ago

I feel like I'm missing the issue here - If you are performing ChIP-Seq you are by definition assaying the chromatin state/activity, regardless of how the TF is directly interacting with it and/or affecting it.

kdaily commented 7 years ago

Very true! I thought the chromatin in chromatinActivity was introduced to describe the target - all of our ChIP-Seq data is histone/chromatin modification related.

If I were coming to a dataset looking for TF-binding data, knowing that ChIP-Seq is a common technology for that, my first inclination would not be to search on chromatinActivity to find it.

Basically comes back to our 'technical description' versus 'intent' issues with other terms.

sgosline commented 7 years ago

I'm not entirely sure. If you read through the ENCODE papers, it becomes clear that the model of a transcription factor merely activating or repressing transcription is oversimplified. So if your intent is to uncover the complete role of TFs in gene regulation, you need a lot more than ChIP-Seq data...

kdaily commented 7 years ago

ENCODE still thinks it's useful (?) to differentiate in their data between histone and non-histone ChIP-Seq data, per the way they annotate.

sgosline commented 7 years ago

If I read correctly, so do we, by specifying the assayTarget.

amapeters commented 7 years ago

Yes, I thought we decided that already: assay = ChIPseq, assayTarget = specific histone mark or TF.
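
As a concrete illustration of that convention, here is a minimal sketch of how two ChIP-seq files might be annotated. The file names and the exact spelling of the values are assumptions made for illustration; they are not copied from the schema, and dataType is shown with the value under discussion in this thread.

```python
# Hypothetical annotation dictionaries. File names and exact value spellings
# are illustrative assumptions, not values taken from the schema.
histone_file = {
    "name": "sample1_H3K27me3.bam",
    "assay": "ChIPSeq",
    "assayTarget": "H3K27me3",          # specific histone mark
    "dataType": "chromatinActivity",    # value under discussion here
}

tf_file = {
    "name": "sample1_CTCF.bam",
    "assay": "ChIPSeq",
    "assayTarget": "CTCF",              # specific transcription factor
    "dataType": "chromatinActivity",
}


def same_assay_different_target(a, b):
    """True when two files share an assay but differ in assayTarget."""
    return a["assay"] == b["assay"] and a["assayTarget"] != b["assayTarget"]
```

Under this scheme the two files are distinguished only by assayTarget, which is the point being made: the assay and dataType stay the same, and the mark or TF carries the specifics.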

kdaily commented 7 years ago

Their filter is Assay Category = 'DNA binding', then Target of Assay = ['histone', 'transcription factor', ...].

https://www.encodeproject.org/matrix/?type=Experiment&status=released&assay_slims=DNA+binding&target.investigated_as=histone
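
For anyone who wants to compare the two target categories side by side, the link above can be rebuilt with Python's standard library. This is only a sketch that reconstructs the query string of that URL; the helper name `encode_matrix_url` is hypothetical, and it assumes 'transcription factor' is accepted as a value in the same way 'histone' is.

```python
from urllib.parse import urlencode

ENCODE_MATRIX = "https://www.encodeproject.org/matrix/"


def encode_matrix_url(investigated_as):
    """Build an ENCODE matrix URL filtered to DNA-binding assays.

    `investigated_as` is the target category, e.g. "histone".
    Query keys are copied from the link above; nothing else is assumed
    about the ENCODE API.
    """
    params = {
        "type": "Experiment",
        "status": "released",
        "assay_slims": "DNA binding",
        "target.investigated_as": investigated_as,
    }
    # urlencode uses quote_plus by default, so spaces become "+".
    return ENCODE_MATRIX + "?" + urlencode(params)
```

Calling `encode_matrix_url("histone")` reproduces the link above, and `encode_matrix_url("transcription factor")` gives the corresponding TF view under the stated assumption.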

kdaily commented 7 years ago

The question is (as @sgosline is getting at) whether there is a meaningful distinction between datasets interrogating histone modifications and those that aren't. With the current solution, if I want to find all of either one exclusively, I'm going to have to enumerate histone marks or TF names.
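
To make the enumeration problem concrete, here is a rough sketch comparing the two approaches. Everything in it is hypothetical: the records, the hand-maintained mark list, and in particular the `assayTargetType` key, which does not exist in our schema and only stands in for an ENCODE-style category.

```python
# Hypothetical records; `assayTargetType` is an invented key standing in
# for an ENCODE-style target category. It is not part of the current schema.
FILES = [
    {"name": "a.bam", "assay": "ChIPSeq", "assayTarget": "H3K27me3",
     "assayTargetType": "histone"},
    {"name": "b.bam", "assay": "ChIPSeq", "assayTarget": "CTCF",
     "assayTargetType": "transcription factor"},
    {"name": "c.bam", "assay": "ChIPSeq", "assayTarget": "H3K4me3",
     "assayTargetType": "histone"},
    {"name": "d.bam", "assay": "ChIPSeq", "assayTarget": "input",
     "assayTargetType": "control"},
]

# Current solution: the searcher has to maintain this list by hand,
# and any mark missing from it is silently skipped.
HISTONE_MARKS = {"H3K4me3", "H3K27me3", "H3K27ac"}


def histone_files_by_enumeration(records, marks=HISTONE_MARKS):
    """Find histone ChIP-seq files by listing every mark explicitly."""
    return [r["name"] for r in records
            if r["assay"] == "ChIPSeq" and r["assayTarget"] in marks]


def files_by_category(records, category):
    """Find ChIP-seq files with a single category filter (ENCODE-style)."""
    return [r["name"] for r in records
            if r["assay"] == "ChIPSeq"
            and r.get("assayTargetType") == category]
```

Both approaches return the same files here, but the first depends on the searcher knowing every mark in advance, while the second pushes that knowledge into the annotation itself at the cost of an extra key.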

ENCODE seemed to think that it's something users want to be able to do. Not that ENCODE is always right, but we have looked to them as a model in the past.

sgosline commented 6 years ago

Not in complete agreement, but we can re-open when we have a more specific use case for transcription factor searches. Decision is to use chromatinActivity.