geneontology / minerva

missing ISO annotations in noctua annotation preview #532

Open LiNiMGI opened 10 months ago

LiNiMGI commented 10 months ago

For MGI gene model Zfp750 (MGI:MGI:2442210)(model ID: gomodel:653b0ce600001157 ), I noticed that:

The annotation preview is missing the ISO annotations:

Zfp750 promoter-specific chromatin binding GO:1990841 ISO PMID:37115925 UniProtKB:Q32MQ0
Part of: regulation of transcription by RNA polymerase II GO:0006357 ISO PMID:37115925 UniProtKB:Q32MQ0

However, these ISO annotations were included in the exported GPAD (GPA).

balhoff commented 10 months ago

@kltm doesn't the annotation preview load the GPAD export and then prettify it? Would this point to a problem in the annotation preview workbench?

kltm commented 10 months ago

@balhoff could be. Looking at the wire, two things are requested of m3Batch: the GPAD output (6 lines) and an id/label map.

Around this section of the table building (v/trivial), there is an explicit filter set: https://github.com/geneontology/noctua/blob/master/workbenches/annpreview/AnnPreview.js#L241, limiting lines to 12 cols. I'd have to play a little to see if this is what is actually reducing lines (I'm having a little trouble simulating locally), but it seems like a likely candidate. That said, it is explicitly in there for the preview; @vanaukenk I don't suppose you have any recollection of this? Git blame and my notes are vague https://github.com/geneontology/noctua/issues/437 .
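To illustrate why a 12-column filter would silently drop records, here is a minimal Python sketch. It is not the AnnPreview.js code; `keep_full_rows` and the shortened sample records are hypothetical and only assume that GPAD 1.1 records are tab-separated with 12 columns.

```python
GPAD_COLUMNS = 12  # GPAD 1.1 records have 12 tab-separated columns


def keep_full_rows(gpad_text):
    """Return the non-header lines that split into exactly 12 columns."""
    rows = []
    for line in gpad_text.split("\n"):
        if not line or line.startswith("!"):
            continue  # skip blank lines and "!" header lines
        fields = line.split("\t")
        if len(fields) == GPAD_COLUMNS:
            rows.append(fields)
    return rows


# Hypothetical, shortened records modelled on the output in this thread.
good = "\t".join([
    "MGI", "MGI:2442210", "acts_upstream_of_or_within", "GO:0061436",
    "PMID:37115925", "ECO:0000315", "MGI:MGI:7470773", "",
    "20231108", "MGI", "", "model-state=production",
])
# Same shape, but with a stray newline at the end of the with/from value.
broken = "\t".join([
    "MGI", "MGI:2442210", "enables", "GO:1990841",
    "PMID:37115925", "ECO:0000266", "UniProtKB:Q32MQ0\n", "",
    "20231108", "MGI", "part_of(GO:0006357)", "model-state=production",
])

sample = "!gpa-version: 1.1\n" + good + "\n" + broken + "\n"
kept = keep_full_rows(sample)
print(len(kept))  # 1: only the intact record survives the filter
```

A record containing a stray newline arrives as two short lines (7 and 6 columns here), so neither half passes the filter and the annotation disappears from the table without any error.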

kltm commented 10 months ago

Reading through that little bit of history again, I'm wondering if we really need this workbench still? The only reason, really, we don't just go with the GPAD is the labels, and the table view should have that now, right?

kltm commented 10 months ago

@balhoff Okay, I think there might be an issue in the GPAD output we're parsing from:

!gpa-version: 1.1
MGI MGI:2442210 involved_in GO:0006357  PMID:37115925   ECO:0000266 UniProtKB:Q32MQ0
        20231108    MGI     noctua-model-id=gomodel:653b0ce600001157|model-state=production|contributor=https://orcid.org/0000-0002-9796-7693
MGI MGI:2442210 acts_upstream_of_or_within  GO:0044091  PMID:37115925   ECO:0000315 MGI:MGI:7470773     20231108    MGI occurs_in(GO:0001533)   noctua-model-id=gomodel:653b0ce600001157|model-state=production|contributor=https://orcid.org/0000-0002-9796-7693
MGI MGI:2442210 acts_upstream_of_or_within  GO:2000304  PMID:37115925   ECO:0000315 MGI:MGI:7470773     20231108    MGI occurs_in(GO:0001533)   noctua-model-id=gomodel:653b0ce600001157|model-state=production|contributor=https://orcid.org/0000-0002-9796-7693
MGI MGI:2442210 enables GO:1990841  PMID:37115925   ECO:0000266 UniProtKB:Q32MQ0
        20231108    MGI part_of(GO:0006357),has_input(MGI:MGI:1915050),has_input(MGI:MGI:1917309),has_input(MGI:MGI:1921809),has_input(MGI:MGI:1927578),has_input(MGI:MGI:2156528)  noctua-model-id=gomodel:653b0ce600001157|model-state=production|contributor=https://orcid.org/0000-0002-9796-7693
MGI MGI:2442210 acts_upstream_of_or_within  GO:0061436  PMID:37115925   ECO:0000315 MGI:MGI:7470773     20231108    MGI     noctua-model-id=gomodel:653b0ce600001157|model-state=production|contributor=https://orcid.org/0000-0002-9796-7693
MGI MGI:2442210 acts_upstream_of_or_within  GO:0010628  PMID:37115925   ECO:0000315 MGI:MGI:7470773     20231108    MGI     noctua-model-id=gomodel:653b0ce600001157|model-state=production|contributor=https://orcid.org/0000-0002-9796-7693

Dropping into whitespace mode, there seems to be an erroneous newline after UniProtKB:Q32MQ0 and UniProtKB:Q32MQ0; dropping those two bad lines could account for the filtering.
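For anyone who wants to spot this kind of break without a whitespace mode, a rough Python sketch follows. `find_short_lines` and the sample record are hypothetical; the only assumption is that each GPAD 1.1 record should be a single line of 12 tab-separated columns.

```python
def find_short_lines(gpad_text, expected=12):
    """Return (line_number, column_count) for records with the wrong column count."""
    problems = []
    for number, line in enumerate(gpad_text.split("\n"), start=1):
        if not line or line.startswith("!"):
            continue  # skip blank lines and "!" header lines
        count = len(line.split("\t"))
        if count != expected:
            problems.append((number, count))
    return problems


# Hypothetical record with a stray newline at the end of the with/from value.
record = "\t".join([
    "MGI", "MGI:2442210", "involved_in", "GO:0006357",
    "PMID:37115925", "ECO:0000266", "UniProtKB:Q32MQ0\n", "",
    "20231108", "MGI", "", "model-state=production",
])
report = find_short_lines("!gpa-version: 1.1\n" + record + "\n")
print(report)  # [(2, 7), (3, 6)]
```

Two consecutive short lines whose column counts add up to 13 are a hint that one 12-column record was split by a newline inside a field, since the split field is counted once on each line.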

ukemi commented 10 months ago

Hi @kltm. I think that curators use the workbench as a human-friendly way of viewing the annotations that are coming from their models, at least I do. I have also told MGI curators to do a sanity check if they are making 'true' GO-CAMs. Although I still think there are issues with the annotations that are being generated by causal models, the curators should still check. It might be nice to see if other groups use the annotation preview. Maybe check on an annotation call?

balhoff commented 10 months ago

> Dropping into whitespace mode, there seems to be an erroneous newline after UniProtKB:Q32MQ0 and UniProtKB:Q32MQ0; dropping those two bad lines could account for the filtering.

@kltm I see what you mean—weird! I will try to figure out where those newlines are coming from.

balhoff commented 10 months ago

@kltm if you look in the OWL export you can see that those newlines were somehow put into the string literals: http://noctua.geneontology.org/download/gomodel:653b0ce600001157/owl

balhoff commented 10 months ago

@ukemi or @LiNiMGI could you edit those two "with" values to remove the newlines? I think you have to add a whole new evidence; it didn't seem directly editable to me. Or else would it be better for me to make an edit directly in the OWL during this week's Noctua downtime?

ukemi commented 10 months ago

@balhoff I'll do it.

ukemi commented 10 months ago

Done.

vanaukenk commented 10 months ago

Looks like that fixed it?

kltm commented 10 months ago

@vanaukenk Likely fixed, but maybe we should open a companion issue to make sure that no form action sends leading or trailing whitespace?
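As a starting point for that companion issue, the idea could be sketched roughly as below in Python. This is not Noctua's form code; `clean_form_values` and the payload keys are hypothetical names used only for illustration.

```python
def clean_form_values(values):
    """Return a copy of a form payload with leading/trailing whitespace stripped.

    Only string values are touched; whitespace inside a value is left alone.
    """
    return {
        key: value.strip() if isinstance(value, str) else value
        for key, value in values.items()
    }


# Hypothetical evidence payload with a trailing newline in the "with" value.
payload = {
    "evidence": "ECO:0000266",
    "reference": "PMID:37115925",
    "with": "UniProtKB:Q32MQ0\n",
}
cleaned = clean_form_values(payload)
print(cleaned["with"] == "UniProtKB:Q32MQ0")  # True
```

Stripping at the point where the form is submitted would keep the stray newline out of the stored string literal in the first place, rather than relying on each exporter or parser to tolerate it.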