geneontology / go-annotation

This repository hosts the tracker for issues pertaining to GO annotations.
BSD 3-Clause "New" or "Revised" License

Mapping KW-0804 and KW-0805 to transcription factors #2036

Open vanaukenk opened 5 years ago

vanaukenk commented 5 years ago

In reviewing a number of our C. elegans transcription factor annotations, I see that they are getting annotations to both:

'transcription, DNA-templated' GO:0006351 via KW-0804

and

'regulation of transcription, DNA-templated' GO:0006355 via KW-0805

@pgaudet and @sylvainpoux

It seems we would only want the latter mapping from KW-0805, right?

sylvainpoux commented 5 years ago

I will have to be very modest because I'm not a specialist in ontology. However, I do not see the problem in having the 'Transcription' KW associated with GO:0006351; transcription, DNA-templated and the 'Transcription regulation' KW associated with GO:0006355; regulation of transcription, DNA-templated.

If this causes any issues, I can ask George to remove one mapping.

Thanks

Sylvain

ID   Transcription.
AC   KW-0804
GO   GO:0006351; transcription, DNA-templated

ID   Transcription regulation.
AC   KW-0805
GO   GO:0006355; regulation of transcription, DNA-templated

ValWood commented 5 years ago

The problem here goes away (globally) if we implement my proposal to filter/hide all non-EXP (or at least IEA, TAS, NAS) annotations that have an existing EXP annotation, AND also less specific IEA annotations.
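A minimal sketch of the kind of filter proposed here, under stated assumptions: annotations are plain `(gene, term, evidence)` tuples, `ancestors` is a hypothetical lookup from a term to its less specific terms, and the term IDs are placeholders. This is an illustration only, not how GOA or any database actually filters annotations.

```python
# Experimental evidence codes used by GO; IEA/TAS/NAS are the codes the
# proposal suggests hiding when experimental support already exists.
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}
HIDEABLE = {"IEA", "TAS", "NAS"}


def filter_annotations(annotations, ancestors):
    """Hide redundant annotations.

    annotations: list of (gene, term, evidence) tuples.
    ancestors:   dict mapping a term to the set of its less specific terms
                 (hypothetical; how ancestry is computed is left open).
    """
    exp_pairs = {(g, t) for g, t, e in annotations if e in EXPERIMENTAL}
    terms_by_gene = {}
    for g, t, _ in annotations:
        terms_by_gene.setdefault(g, set()).add(t)

    kept = []
    for g, t, e in annotations:
        # Rule 1: drop IEA/TAS/NAS when the same gene/term has experimental support.
        if e in HIDEABLE and (g, t) in exp_pairs:
            continue
        # Rule 2: drop an IEA if the gene also has a more specific term.
        if e == "IEA" and any(
            t in ancestors.get(other, set())
            for other in terms_by_gene[g]
            if other != t
        ):
            continue
        kept.append((g, t, e))
    return kept


# Placeholder data: "T:child" is more specific than "T:parent".
toy_annotations = [
    ("g1", "T:child", "IDA"),
    ("g1", "T:child", "IEA"),
    ("g1", "T:parent", "IEA"),
    ("g2", "T:parent", "IEA"),
]
toy_ancestors = {"T:child": {"T:parent"}}
filtered = filter_annotations(toy_annotations, toy_ancestors)
```

Whether a 'regulates' link should count as "less specific" for the second rule is a separate design decision that this sketch does not settle.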

pgaudet commented 5 years ago

Hi @sylvainpoux We really want to make the distinction between part_of a process and part_of_regulation of the process.

Transcription factors should only get 'GO:0006355; regulation of transcription, DNA-templated'.

Can this be changed ?

Thanks, Pascale

vanaukenk commented 5 years ago

@sylvainpoux @pgaudet Thanks for taking a look at this. As Pascale says, the issue is not with the KW2GO mappings themselves, but rather the association of TFs with both the transcription process and regulation of the transcription process. TFs should only be associated with regulation of transcription process.
@ValWood - redundancy and display is a distinct issue and yes, should also be addressed.

sylvainpoux commented 5 years ago

@pgaudet @vanaukenk Ok, thanks for the explanation. GOA is managing the mapping between Swiss-Prot and GO.

George (@ggeorghiou ), may I ask you to remove the mapping between GO GO:0006351; transcription, DNA-templated and the KW KW-0804 (Transcription)?

Thanks

Sylvain

ValWood commented 5 years ago

From memory, there are quite a lot of GO terms that have both KW mappings. Would it be possible to identify them all and fix appropriately? I don't have a list because I filtered them historically.

vanaukenk commented 5 years ago

@sylvainpoux - I think the actual mapping between the KW and the GO term is fine, and may be useful in some cases, but it's the application of both KWs 'transcription' and 'regulation of transcription' to a transcription factor that is the issue. See the keywords here for an example: https://www.uniprot.org/uniprot/O16425

sylvainpoux commented 5 years ago

We have a hierarchy of keywords and KW-0805 (Transcription regulation) is the child of KW-0804 (Transcription). For this reason, KW-0804 is always present in entries with KW-0805
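To make the consequence concrete, here is a small sketch assuming that keyword inheritance is a simple child-to-parent lookup and that each keyword maps to at most one GO term. The two mappings and the parent link come from this thread; the function names are hypothetical.

```python
# Toy data from this thread: one parent link and the two KW2GO mappings.
KW_PARENT = {"KW-0805": "KW-0804"}  # child keyword -> parent keyword
KW2GO = {
    "KW-0804": "GO:0006351",  # transcription, DNA-templated
    "KW-0805": "GO:0006355",  # regulation of transcription, DNA-templated
}


def expand_keywords(keywords):
    """Return the keywords plus every ancestor reachable via KW_PARENT."""
    expanded = set()
    for kw in keywords:
        while kw is not None and kw not in expanded:
            expanded.add(kw)
            kw = KW_PARENT.get(kw)
    return expanded


def go_terms_for(keywords):
    """Map an entry's (expanded) keywords to GO terms via KW2GO."""
    return {KW2GO[kw] for kw in expand_keywords(keywords) if kw in KW2GO}


# An entry curated only with 'Transcription regulation' ends up with both terms.
print(sorted(go_terms_for({"KW-0805"})))  # ['GO:0006351', 'GO:0006355']
```

Under these assumptions, any entry carrying KW-0805 receives both GO terms for as long as both mappings exist.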

ValWood commented 5 years ago

In general, to reduce the number of unnecessary IEAs (dramatically so in the transcription case), it would be better to change the mechanism for applying keywords so that only the most specific keyword is applied.
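One way to picture "only the most specific keyword" is the sketch below. It assumes a hypothetical child-to-parent keyword table and drops every keyword that is an ancestor of another keyword on the same entry; it is not a description of how UniProt applies keywords.

```python
KW_PARENT = {"KW-0805": "KW-0804"}  # child keyword -> parent keyword (from this thread)


def ancestors(kw, parent=KW_PARENT):
    """All keywords above kw in the hierarchy."""
    found = set()
    kw = parent.get(kw)
    while kw is not None and kw not in found:
        found.add(kw)
        kw = parent.get(kw)
    return found


def most_specific(keywords, parent=KW_PARENT):
    """Keep only keywords that are not implied by another keyword in the set."""
    keywords = set(keywords)
    implied = set()
    for kw in keywords:
        implied |= ancestors(kw, parent)
    return keywords - implied


print(most_specific({"KW-0804", "KW-0805"}))  # {'KW-0805'}
print(most_specific({"KW-0804"}))             # {'KW-0804'}
```

An entry that carries only the general keyword keeps it, so the general mapping is not lost for entries without the more specific keyword.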

tonysawfordebi commented 5 years ago

@sylvainpoux George is on vacation this week, but I've gone ahead and removed the mapping for KW-0804.

ValWood commented 5 years ago

On this entry, or generally? Based on Sylvain's comment, this will be a more general issue.

ValWood commented 5 years ago

...and you would not want to lose the more general mapping if KW-0804 wasn't present. (completing thought!)

tonysawfordebi commented 5 years ago

The mapping between KW-0804 and GO:0006351, as Sylvain requested.

vanaukenk commented 5 years ago

"We have a hierarchy of keywords and KW-0805 (Transcription regulation) is the child of KW-0804 (Transcription). For this reason, KW-0804 is always present in entries with KW-0805"

Thanks for the explanation, @sylvainpoux. I didn't realize that this is how KW-0804 also gets connected to entries with KW-0805. In cases like this, then, the KW hierarchy seems to behave differently from GO with respect to transitivity of the regulates relation, and that's really the issue. The actual KW mappings to GO BP terms are okay.

pgaudet commented 5 years ago

@sylvainpoux this is indeed a problem - In GO it took us years to finally decide that you cannot be in a process and regulate it. Ideally the SPKW hierarchy would be the same; is this possible ?

Thanks, Pascale

selewis commented 5 years ago

@pgaudet If this is true, "cannot be in a process and regulate it" (which I agree ought to be the case), then we have quite a lot of these. Lists can easily be put together for different processes; should I send these to you?


pgaudet commented 5 years ago

@selewis I was referring to Keywords. For GO annotations in general, there are enough exceptions that we cannot make this a rule. For keywords, though, it seems general enough that 'regulation of x' is not an x, so the hierarchy should be modified.

Thanks, Pascale

ValWood commented 5 years ago

There are also exceptions where gene products are clearly regulating a process and a part of it. I think autophagy and yeast conjugation are examples because researchers define the processes to include the signalling part.

pgaudet commented 5 years ago

@sylvainpoux : "The easiest solution is to remove the KW mappings to GO for KW-0804 and KW-0805"

pgaudet commented 5 years ago

@ggeorghiou ~Can you remove the mapping ?~

pgaudet commented 5 years ago

We need to 1) check that this has been corrected; 2) identify other problematic keywords that are linked by a 'regulates' relation; 3) check whether SPKW mappings still provide information that we don't have from other sources (InterPro, PAINT).
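For step 2, a rough sketch of how such keyword pairs could be listed, assuming three lookup tables are available: the keyword hierarchy, the KW2GO mappings, and the GO 'regulates' links. The table contents below are just the example from this thread, and the function name is hypothetical.

```python
KW_PARENT = {"KW-0805": "KW-0804"}  # child keyword -> parent keyword
KW2GO = {
    "KW-0804": "GO:0006351",  # transcription, DNA-templated
    "KW-0805": "GO:0006355",  # regulation of transcription, DNA-templated
}
GO_REGULATES = {"GO:0006355": "GO:0006351"}  # regulating term -> regulated term


def regulates_conflicts(kw_parent, kw2go, go_regulates):
    """List (child, parent) keyword pairs whose GO terms are linked by 'regulates'.

    Only direct parent links are checked; a fuller check would also walk
    grandparents and regulates subtypes (positively/negatively regulates).
    """
    conflicts = []
    for child, parent in sorted(kw_parent.items()):
        child_go = kw2go.get(child)
        parent_go = kw2go.get(parent)
        if child_go and parent_go and go_regulates.get(child_go) == parent_go:
            conflicts.append((child, parent))
    return conflicts


print(regulates_conflicts(KW_PARENT, KW2GO, GO_REGULATES))  # [('KW-0805', 'KW-0804')]
```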

ggeorghiou commented 5 years ago

KW-0804 has already had its mapping deleted. I can do KW-0805 as well, but do we have a replacement mapping for either keyword, or are we OK with them having no mapping at this time, @sylvainpoux?

pgaudet commented 5 years ago

@ggeorghiou I think in fact KW-0805 is the one that should have been deleted, since it inherits its 'transcription parent' - did I get this the wrong way around ?

pgaudet commented 5 years ago

Action: @vanaukenk to find out whether SPKW mappings still provide information that we don't have from other sources (InterPro, PAINT).

ggeorghiou commented 5 years ago

Tony deleted the KW-0804 mapping while I was on holiday. I can restore it if need be.