pombase / website

PomBase website v2
MIT License
6 stars 1 forks source link

Redundancy in extension default display (modifications) #1627

Closed ValWood closed 1 week ago

ValWood commented 4 years ago

I think we could improve the display here quite a lot very easily The red strikeouts seem to be completely redundant with each other The green strikeouts are curation improvements I will look into which will collapse further. sty1

https://www.pombase.org/gene/SPAC24B11.06c

https://docs.google.com/presentation/d/1wEa9FSDwpEgFDpncE-7cxDoTQqKb3EO-n7y6owyOwy0/edit?usp=sharing

ValWood commented 4 years ago

It seems that we don't implement the same rules for modifications that we implement elsewhere?

ValWood commented 4 years ago

Although I thought we had???

mah11 commented 4 years ago

It is the same as for other ontologies -- in that ontology paths are only followed for the annotated term, not (yet) for terms used in extensions. The annotations in the screenshot are all to the same term, MOD:00048.

ValWood commented 4 years ago

Ah OK I thought exact match extensions (or presence/absence) were collapsed, but that different specificities of extensions were not understood?

ValWood commented 4 years ago

for example (ase1)

GO:0005515 protein binding 1037
binds peg1 at mitotic spindle midzone IPI Bratman SV et al. (2007)
binds klp9 during mitotic anaphase IPI Fu C et al. (2009)
binds klp9 part of microtubule sliding involved in mitotic spindle elongation active form anaphase spindle elongation protein 1 unmodified form during mitotic anaphase IPI Fu C et al. (2009)

collapses to

protein binding   | binds peg1 at mitotic spindle midzone   | binds klp9 part of microtubule sliding involved in mitotic spindle elongation during mitotic anaphase

in the default view binds klp9 during mitotic anaphase | IPI | Fu C et al. (2009) is hidden?

mah11 commented 4 years ago

a lot of extensions are getting collapsed for the sty1 MOD:00048 annotation ... the only one that looks like it should be hidden but isn't is the plain "added by wis1"

ValWood commented 4 years ago

why not also removed by pyp1 redundant with modified residue Y173 removed by pyp1

I was thinking modified residue Y173 added during cellular response to salt stress and modified residue Y173 present during cellular response to salt stress

too but that would need to subsume present_during into added_during (I hope added during is a descendant of present _during)?

I thought the extension filtering could be low priority, but it would be nice as we start doing more modelling of modifications and substrates to present a more concise view.

mah11 commented 4 years ago

I hope added during is a descendant of present_during

At present there's no structure to the modification extension relations; it's basically a flat list even though we have it in obo format.

mah11 commented 4 years ago

oh yeah, I missed the 'removed by pyp1' thing

ValWood commented 4 years ago

I will get more collapsing by seeing if I can standardize the annotation a bit. GO has a zillion ways to describe the osmolarity response...

kimrutherford commented 4 years ago

I think we could improve the display here quite a lot very easily

Could you add a link to that page?

kimrutherford commented 4 years ago

It seems that we don't implement the same rules for modifications that we implement elsewhere?

It should be the same for all ontologies.

ValWood commented 4 years ago

sorry I meant to do that: https://www.pombase.org/gene/SPAC24B11.06c

kimrutherford commented 4 years ago

sorry I meant to do that: https://www.pombase.org/gene/SPAC24B11.06c

Thanks. I'm surprised the redundancy removal code has this bug as it's been in use for a year or two now and it seemed to work well. It's too complicated to look at on a Friday night. I'll have a look on Monday.

ValWood commented 4 years ago

There is no hurry ;) Nothing is presented incorrectly. I wouldn't really call it a bug.

kimrutherford commented 4 years ago

I wouldn't really call it a bug.

I'm just a bit concerned that this is a symptom and there are deeper problems. I'll have a look next week.

kimrutherford commented 4 years ago

I'm just a bit concerned that this is a symptom and there are deeper problems.

I've dug into this now. It was happening because the "modified residue" isn't really part of the extension. It gets inserted as a fake extension part for display purposes but was added too late in the process. That was causing the summarisation code to mess up. It's fixed I think and we'll see the results on Monday morning. I'll check the main site then.

kimrutherford commented 4 years ago

I think it's fixed. It's quite complex code though so I can't guarantee that there aren't other issues.

Let me know if you see anything else.

ValWood commented 4 years ago

on

O4'-phospho-L-tyrosine

modified residue Y173 removed by pyp1 should be hidden by modified residue Y173 added by wis1 added during cellular response to salt stress removed by pyp1

kimrutherford commented 4 years ago

should be hidden by

This has turned into a can of worms. :-)

I can't fix this problem without causing different bugs. I'm going to leave this for a while and possibly do a rewrite to fix this and to remove redundancy using the hierarchy of terms in extensions: https://github.com/pombase/website/issues/1620#issuecomment-708154147 https://github.com/pombase/website/issues/1619#issuecomment-708005613

ValWood commented 4 years ago

No worries, no hurry becasue it isn't incorrect as it is.

. Better to do it all together, no point in tinkering.

ValWood commented 3 years ago

related https://github.com/pombase/pombase-chado/issues/747

ValWood commented 3 years ago

here is an example on process:

Mal3

mal3, collapse extensions

I'll add a few different types for checking.

ValWood commented 3 years ago

I noticed these today on the Cdc2 page, I thought these would be a single set?

Screenshot 2021-06-30 at 11 54 15
kimrutherford commented 3 years ago

I noticed these today on the Cdc2 page, I thought these would be a single set?

It looks like in this summary row the "phosphorylates" is a "has_input" relation:

phosphorylates bit61, cdc15, cdc23, crf1, csi1, dbl7, dfp1, dnt1, drc1, fkh2, ies4, iss10, lem2, mdb1, mis4, msh6, mvp1, orc1, rad2 during mitotic G2 phase

And in this row, the "phosphorylates" is a "has_direct_input":

phosphorylates rum1 during mitotic G2 phase

I thought I had fixed this problem: https://github.com/pombase/website/issues/1701#issuecomment-842108424 but must have got it wrong. I'll fix it.

kimrutherford commented 3 years ago

I noticed these today on the Cdc2 page, I thought these would be a single set?

I've fixed that so now they are all in one row of the summary:

phosphorylates bit61 , cdc15 , cdc23 , crf1 , csi1 , dbl7 , dfp1 , dnt1 , drc1 , fkh2 , ies4 , iss10 , lem2 , mdb1 , mis4 , msh6 , mvp1 , orc1 , rad2 , rum1 , sec16 , sgo2 , sld3 , snt1 , ulp2 during mitotic G2 phase

ValWood commented 3 years ago

much better!

ValWood commented 3 years ago

pdb1

I think this one is because the modified forms are not handled by the gene name:

GO_query_parrentage
kimrutherford commented 3 years ago

Should it be this instead?:

   binds hhf1/Me3:(K20), hhf2/Me2:(K20), hhf3/Me3:(K20)
ValWood commented 3 years ago

comment on pdb1 methylated histone binding is a parent of H4K20me3 modified histone binding. I think we need to explicitly add the target gene name in addition to the modified form. Although we would like this to be 'inferred' . So actioning this ticket https://github.com/pombase/website/issues/1710 would resolve this particular extension problem.

kimrutherford commented 2 months ago

Hi Val.

How much redundancy is there in the current display? It's changes a bit since you created this issue?

https://www.pombase.org/gene/SPAC24B11.06c

image

ValWood commented 2 months ago

That one looks fine. I closed off everything that is no longer an issue. The "methylated histone binding" eg is still current

ValWood commented 1 week ago

https://www.pombase.org/gene/SPAC24B11.06c looks fine. I think this ticket can close?

kimrutherford commented 1 week ago

I think this ticket can close?

If you're happy with it, I'm happy. :-)

ValWood commented 1 week ago

Yep, if there is anything else it is very minor and we can open a new ticket