geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

part_of relation missing from GO basic #19954

Open ValWood opened 4 years ago

ValWood commented 4 years ago

part_of relation missing between SNAP receptor and membrane fusion.

https://user-images.githubusercontent.com/7359272/92286477-15eac500-eeff-11ea-9c85-c509f13a23a2.png

https://github.com/pombase/pombase-chado/issues/791

pgaudet commented 4 years ago

@kltm @dougli1sqrd @balhoff Can you have a look? This is weird: I see the relation in Protégé, but not in today's snapshot of go-basic.obo.

Thanks, Pascale

ValWood commented 4 years ago

@kimrutherford for info

balhoff commented 4 years ago

I think relations that cross GO aspects/namespaces are specifically filtered out of go-basic. There is a Perl script for this: https://github.com/geneontology/go-ontology/blob/af1c8e3e48dec31842814b975209759f10485898/src/util/filter-obo-for-standard-release.pl#L69-L73

This one is a part_of between a MF and a BP.
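To illustrate the kind of filtering described here, a minimal Python sketch follows. It is not the Perl script linked above and makes no claim about how that script is written; the function names and the tiny sample are made up for illustration, and it assumes plain OBO `[Term]` stanzas with `id:`, `namespace:` and `relationship:` tags.

```python
def parse_stanzas(obo_text):
    """Split OBO text into [Term] stanzas, each returned as a list of lines."""
    stanzas = []
    current = None
    for line in obo_text.splitlines():
        line = line.strip()
        if line.startswith("["):
            # Only [Term] stanzas are kept in this sketch.
            current = [line] if line == "[Term]" else None
            if current is not None:
                stanzas.append(current)
        elif current is not None and line:
            current.append(line)
    return stanzas


def tag_value(stanza, tag):
    """Return the value of the first 'tag: value' line in a stanza, or None."""
    prefix = tag + ": "
    for line in stanza:
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


def drop_cross_aspect_relations(obo_text):
    """Remove relationship lines whose target term is in another namespace."""
    stanzas = parse_stanzas(obo_text)
    namespace = {tag_value(s, "id"): tag_value(s, "namespace") for s in stanzas}
    filtered = []
    for stanza in stanzas:
        own = tag_value(stanza, "namespace")
        kept = []
        for line in stanza:
            if line.startswith("relationship: "):
                # Format: "relationship: <relation> <target id> ! <label>"
                target_ns = namespace.get(line.split()[2])
                if target_ns is not None and target_ns != own:
                    continue  # crosses aspects, so it is dropped
            kept.append(line)
        filtered.append(kept)
    return filtered


# Illustrative two-term sample based on the relation discussed in this ticket.
SAMPLE = """[Term]
id: GO:0005484
name: SNAP receptor activity
namespace: molecular_function
relationship: part_of GO:0061025 ! membrane fusion

[Term]
id: GO:0061025
name: membrane fusion
namespace: biological_process
"""

filtered = drop_cross_aspect_relations(SAMPLE)
```

In this sample the `part_of` line disappears from the SNAP receptor activity stanza, which matches what is seen in go-basic.obo.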

pgaudet commented 4 years ago

@ValWood

go-basic does not contain relations that go across aspects, see http://geneontology.org/docs/download-ontology/

I don't think we want to change this.

Thanks, Pascale

ValWood commented 4 years ago

Just so I can explain to Kim - why is that?

We use GO basic, but this means that we will miss inferences. Should GO basic not be used? I imagined GO would want to keep inferences consistent whichever flavour of the ontology is used. Otherwise tools and databases would report annotation numbers differently depending on whether they use GO full or GO basic.

What should GO basic be used for? Does it have a special purpose (if it does not contain inter-ontology relationships it should come with a big disclaimer).

@cmungall

ValWood commented 4 years ago

Thinking about it some more, we don't need these inter-ontology relationships in GO basic.

BUT, the annotations should be generated by GO, and we (PomBase) pick them up and incorporate them into our annotation corpus. So, we seem to be missing the annotations to "membrane fusion" that should be inferred via the inter-ontology link?

So I need to figure out if we are not including these annotations, or if some annotations are not being generated, or if there is a lag. v

ValWood commented 4 years ago
  1. The relation is not new: GO:0005484 SNAP receptor activity | 2015-01-12 | Added | RELATION | part of GO:0061025 (membrane fusion)

  2. Are the MF-BP annotations only generated if there is not an existing annotation? If so, we may not be seeing these because there are PAINT annotations and we are not yet importing PAINT. (we are waiting until these fixes are through https://github.com/geneontology/go-annotation/issues?q=is%3Aissue+is%3Aopen+label%3A%22PAINT+annotation%22)

I suspect that might be the case, if so we can close this ticket now because we will get the inferences soon.

pgaudet commented 4 years ago

Are the MF-BP annotations only generated if there is not an existing annotation?

I think so. @cmungall @kltm can you confirm ?

ValWood commented 4 years ago

Hi @cmungall @kltm

can you confirm that an inferred annotation between MF and BP will only be made if there is no existing annotation? If so I can close this..

cmungall commented 4 years ago

Just so I can explain to Kim - why is that?

Historic. Historically, there were lots of tools that baked in assumptions about no inter-ontology links.

I think we should revisit this. We can give people the script.

ValWood commented 4 years ago

Hi @cmungall
It's OK for us that the inter-ontology links are absent from GO basic, because the annotations are provided, so we will get them anyway. I think the only reason we don't have the annotation I was looking at is that we are not importing PAINT right now (we are waiting on some annotation fixes). So, if something has a PAINT annotation (or any annotation from another source), you don't create an annotation via an MF-BP link, is that correct?

ValWood commented 3 years ago

We are now importing PAINT and are still missing these annotations.

I can't figure out why.

13 | membrane fusion (GO:0061025) AND SNAP receptor activity (GO:0005484) | ...
34 | membrane fusion (GO:0061025) | ...
19 | SNAP receptor activity (GO:0005484)

The intersection should be 19, based on the ontology: every gene annotated to SNAP receptor activity should also be annotated to membrane fusion.

ValWood commented 3 years ago

Based on the ontology, anything annotated to "SNAP receptor activity" should get an inferred annotation to "membrane fusion"

https://www.ebi.ac.uk/QuickGO/GTerm?id=GO:0005484

Is the inferred annotation from F-P links up and running?
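The expectation above can be sketched in a few lines of Python. This is only an illustration of the rule as described in this ticket, not the GO pipeline code. The link table holds just the one relation discussed here, `GENE_B` is a made-up identifier, and the sketch ignores is_a descendants and any rule about skipping genes that already have a process annotation, since that rule is not confirmed.

```python
# MF term -> BP term it is part_of (only the link discussed in this ticket).
MF_PART_OF_BP = {"GO:0005484": "GO:0061025"}


def infer_process_annotations(annotations, links=MF_PART_OF_BP):
    """Return inferred (gene, go_id, evidence) tuples for MF-BP links.

    `annotations` is a list of (gene, go_id, evidence) tuples. The evidence
    code of the originating annotation is carried over unchanged.
    """
    inferred = set()
    for gene, term, evidence in annotations:
        target = links.get(term)
        if target is not None:
            inferred.add((gene, target, evidence))
    return inferred


# SPCC594.06c comes from this ticket; GENE_B is hypothetical.
annotations = [
    ("SPCC594.06c", "GO:0005484", "ISO"),
    ("GENE_B", "GO:0005484", "IDA"),
    ("GENE_B", "GO:0000001", "IDA"),  # unrelated term, no link, nothing inferred
]
inferred = infer_process_annotations(annotations)
```

Under this rule each gene annotated to SNAP receptor activity gets one membrane fusion annotation, whatever its evidence code, which is why the intersection is expected to equal the SNAP receptor activity count.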

pgaudet commented 3 years ago

As I thought, the inter-ontology links were used to make annotations that go in the 'prediction' files, for example for PomBase: http://release.geneontology.org/2020-12-08/products/annotations/pombase-prediction.gaf

But I don't see SPCC594.06c in that file (it is one of the PomBase genes annotated with SNAP receptor activity, see http://amigo.geneontology.org/amigo/gene_product/PomBase:SPCC594.06c)

Not sure what's going on
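For anyone who wants to repeat this check, here is a rough Python sketch that looks for a gene in a GAF file. It assumes the standard tab-separated GAF column order (DB Object ID in column 2, GO ID in column 5, evidence code in column 7) and assumes the PomBase object ID is the systematic ID, e.g. `SPCC594.06c`. The helper names and the sample rows are invented for illustration.

```python
def gaf_rows(lines):
    """Yield a small dict per GAF annotation row, skipping '!' header lines."""
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:
            continue  # too short to be a usable annotation row
        yield {
            "object_id": cols[1],
            "symbol": cols[2],
            "go_id": cols[4],
            "evidence": cols[6],
        }


def find_annotations(lines, object_id, go_id=None):
    """Return the rows for one gene, optionally restricted to one GO term."""
    return [
        row for row in gaf_rows(lines)
        if row["object_id"] == object_id
        and (go_id is None or row["go_id"] == go_id)
    ]


# Made-up rows standing in for a downloaded prediction file.
SAMPLE_GAF = [
    "!gaf-version: 2.2\n",
    "PomBase\tGENE_A\tgeneA\t\tGO:0061025\tREF:placeholder\tIDA\n",
]
hits = find_annotations(SAMPLE_GAF, "SPCC594.06c", "GO:0061025")
```

With a locally downloaded copy of the prediction file, the same call would be `find_annotations(open("pombase-prediction.gaf"), "SPCC594.06c", "GO:0061025")`; an empty list means the inferred annotation is absent.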

@kltm Do you know how the pipeline handles those?

Thanks, Pascale

ValWood commented 3 years ago

The SNAP receptor activity annotation for SPCC594.06c is ISO. Perhaps the F-P inference ignores this evidence code? But it should be evidence-code agnostic: if the annotation is true, the inference is true whatever the evidence code...

ValWood commented 3 years ago

This old ticket documents some other instances of the same problem. Quite often, when I expect that we do not need to make a process annotation because it should be inferred via an F-P link, the expected annotation is missing from the inferred GAF. This makes it difficult to know whether we need to make the gap-filling annotations or not...

see also geneontology/go-site#2226 geneontology/go-annotation#1427

pgaudet commented 3 years ago

Maybe we can set up a call with @kimrutherford and @balhoff to look at that? I don't understand where the issue is coming from.

ValWood commented 3 years ago

I don't think it is an ontology issue. It does not matter that the links are not in GO basic. I really only want to find out why the F-P inferences I expected are missing.

The question is really why do the MF annotations to "SNAP receptor activity" not result in a GOC annotation to "membrane fusion"

@cmungall is probably the best person to answer this. Once I know the reason they are missing, I can figure out whether it is a problem with the inference pipeline that will be fixed, or whether I need to fill in the annotation gaps manually. I'm happy to explain the issue on a call sometime if this does not make sense. (@cmungall you only need to consider the comments since 25th Sept; this is where I realised the problem was not what I originally thought it was.)

pgaudet commented 3 years ago

I think @dougli1sqrd was working on the script that generated those?

ValWood commented 3 years ago

@cmungall @dougli1sqrd

@pgaudet wants to revisit this ticket. It is still a problem for us.

Summary: There is an MF-BP part_of link between SNAP receptor activity and membrane fusion:

Screenshot 2021-04-16 at 15 45 50

So, I expected the GO Predictions would include inferred mapping to "membrane fusion" for all SNAP receptor activity annotations.

However you can see that this is not the case:

Screenshot 2021-04-16 at 15 59 43

https://www.pombase.org/results/from/id/cd19d97f-3b4b-43b1-8145-aa10a5803f4b Note that we import these inferences from GOA (I think?).

ValWood commented 3 years ago

@cmungall
@pgaudet understands the issue. We just need to know if the file is correctly generated and if we need to import it separately.

@kimrutherford do we expect the inferred annotations to be in GOA, or do we pick up any of these from GO? It might be that we are not getting them because they are not in GOA (although we suspect that not all of the expected annotations are generated).

ValWood commented 3 years ago

I spoke with @kimrutherford and we do not get the inferred annotations from GOA; we pick them up from http://snapshot.geneontology.org/products/annotations/pombase-prediction.gaf which I believe is the correct file.

This means that there are missing inferences. I suspect that, because the only inferences for "membrane fusion" are IDA, maybe the pipeline only generates the inferences for experimental annotations. If the rule is correct, it should presumably be applied to all annotations regardless of evidence code?
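One way to test this suspicion would be to group the missing inferences by the evidence code of the originating annotation. The Python sketch below assumes two GAF inputs (the source annotations and the prediction file) with the standard column order; the function names and sample rows are hypothetical and are not taken from any GO tooling.

```python
from collections import defaultdict


def genes_by_term(lines, go_id):
    """Return {object_id: set of evidence codes} for rows annotated to go_id."""
    genes = defaultdict(set)
    for line in lines:
        if line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        # Column 2 = DB Object ID, column 5 = GO ID, column 7 = evidence code.
        if len(cols) > 6 and cols[4] == go_id:
            genes[cols[1]].add(cols[6])
    return genes


def missing_by_evidence(source_lines, prediction_lines, mf_id, bp_id):
    """Group genes lacking the expected BP inference by source evidence code."""
    source = genes_by_term(source_lines, mf_id)
    predicted = genes_by_term(prediction_lines, bp_id)
    missing = defaultdict(set)
    for gene, codes in source.items():
        if gene not in predicted:
            for code in codes:
                missing[code].add(gene)
    return dict(missing)


# Made-up rows: GENE_A has an inference, GENE_B (ISO) does not.
SOURCE = [
    "PomBase\tGENE_A\tgeneA\t\tGO:0005484\tREF:placeholder\tIDA\n",
    "PomBase\tGENE_B\tgeneB\t\tGO:0005484\tREF:placeholder\tISO\n",
]
PREDICTION = [
    "PomBase\tGENE_A\tgeneA\t\tGO:0061025\tREF:placeholder\tIDA\n",
]
missing = missing_by_evidence(SOURCE, PREDICTION, "GO:0005484", "GO:0061025")
```

If the real files showed only non-experimental codes among the missing genes, that would support the idea that the pipeline restricts inferences to experimental evidence.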

One of the reasons that these annotations cause problems for curators is that it is really difficult to identify the source, especially as they use the same evidence code as the originating annotation. Curators have complained about this a lot over many years. I would love to see a different evidence code used for MF->BP inferences (IMF - inferred from molecular function?) so that we know immediately how these were derived.

ValWood commented 3 years ago

@pgaudet Did this ever get looked into? We are still not picking up the expected inferences which is a shame...could you bring it up on a future editors call?

pgaudet commented 3 years ago

Hi @ValWood

Chatting with @cmungall @kltm - this code is older, unsupported code that actually needs to be replaced, so we need to prioritize this work with the new code (ontobio) that will be taking care of this. This is underway but needs to be completed.

Relevant tickets: https://github.com/geneontology/pipeline/issues/221 https://github.com/geneontology/pipeline/issues/130

Sorry about that !

Pascale

ValWood commented 3 years ago

No problem. Happy to test the new output when it is ready.

ValWood commented 2 years ago

What is the ETA for this? (just checking, but if it will be a really long time I might make the annotations manually). I opened the ticket in Sept 2020 ;)

ValWood commented 1 year ago

Can we discuss on an editors call in the New Year?