pombase / pombase-chado

PomBase code for accessing Chado
MIT License

PTM processing from Uniprot #1171

Closed ValWood closed 1 month ago

ValWood commented 6 months ago

I think we got this data in the Intermine grab, but did not use it. I remember these features being there.

Some of these will go on the modification track (like glycosylation), so it would be useful to get a full list of the feature types first to decide what to display, on which track, and how to provide provenance.

e.g. https://www.uniprot.org/uniprotkb/Q09175/entry#ptm_processing

kimrutherford commented 4 months ago

I think we got this data in the Intermine grab, but did not use it.

Can you remind me what that is? I don't remember getting any UniProt data.

ValWood commented 4 months ago

Added to Tuesday list

kimrutherford commented 3 months ago

I had a look at what's available from the UniProt API. We can get the coordinates for these:

It wouldn't take long to add any/all of these.
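
For illustration only, here is a rough sketch of how feature coordinates could be pulled out of a UniProt entry once it has been downloaded as JSON. It assumes the entry has a top-level "features" list where each item carries a "type" and a "location" with "start"/"end" values; that shape, the feature type names and the example data are assumptions to check against a real entry, and this is not the code PomBase uses.

```python
from collections import defaultdict

# Feature types discussed in this issue. The exact strings are an
# assumption about the UniProt JSON "type" field.
WANTED_TYPES = frozenset({"Chain", "Propeptide", "Glycosylation", "Disulfide bond"})


def feature_coordinates(entry, wanted=WANTED_TYPES):
    """Group (start, end, description) tuples by feature type."""
    by_type = defaultdict(list)
    for feature in entry.get("features", []):
        ftype = feature.get("type")
        if ftype not in wanted:
            continue
        location = feature.get("location", {})
        start = location.get("start", {}).get("value")
        end = location.get("end", {}).get("value")
        if start is None or end is None:
            # Skip features whose boundaries are not given.
            continue
        by_type[ftype].append((start, end, feature.get("description", "")))
    return dict(by_type)


# Hand-written entry in the assumed shape; hypothetical, not real UniProt data.
EXAMPLE_ENTRY = {
    "features": [
        {"type": "Propeptide", "description": "hypothetical propeptide",
         "location": {"start": {"value": 1}, "end": {"value": 19}}},
        {"type": "Chain", "description": "hypothetical chain",
         "location": {"start": {"value": 20}, "end": {"value": 300}}},
        {"type": "Glycosylation", "description": "hypothetical site",
         "location": {"start": {"value": 45}, "end": {"value": 45}}},
        {"type": "Disulfide bond",
         "location": {"start": {"value": 60}, "end": {"value": 120}}},
        {"type": "Domain", "description": "ignored by the filter",
         "location": {"start": {"value": 30}, "end": {"value": 90}}},
    ]
}

for ftype, coords in sorted(feature_coordinates(EXAMPLE_ENTRY).items()):
    print(ftype, coords)
```

For a disulfide bond the start and end are taken here to be the two bonded residues rather than a continuous range, which matters for how it is drawn.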

ValWood commented 3 months ago

glycosylation, add as modification, "glycosylated residue" (MOD:00693) but only if we don't have it from the same source.

Disulphide bond, add as "disulfide crosslinked residues" (MOD:00689) if we don't have it from the same source (we don't have many of these curated)

"Propeptide" and "chain" can be added as features

Thanks!
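
A minimal sketch of the mapping described above, assuming "only if we don't have it" means skipping any site where an annotation with the same gene, term and position already exists. The function name, the tuple layout and the gene name are hypothetical. A real check would probably also need to treat more specific child terms as matches, which this does not do.

```python
# UniProt feature type -> (PSI-MOD term ID, term name), as listed in the comment.
UNIPROT_TYPE_TO_MOD = {
    "Glycosylation": ("MOD:00693", "glycosylated residue"),
    "Disulfide bond": ("MOD:00689", "disulfide crosslinked residues"),
}


def new_modifications(uniprot_features, existing):
    """Return annotations for UniProt features that are not already present.

    uniprot_features: iterable of (gene, feature_type, position) tuples
    existing: set of (gene, termid, position) keys for current annotations
    """
    result = []
    for gene, ftype, position in uniprot_features:
        if ftype not in UNIPROT_TYPE_TO_MOD:
            # Propeptide, chain etc. are features, not modifications.
            continue
        termid, term_name = UNIPROT_TYPE_TO_MOD[ftype]
        if (gene, termid, position) in existing:
            continue
        result.append({
            "gene": gene,
            "termid": termid,
            "term_name": term_name,
            "position": position,
            "source": "UniProt",  # keep provenance with each annotation
        })
    return result


# Hypothetical data: one site already annotated, two new ones, one non-modification.
existing = {("geneA", "MOD:00693", 45)}
features = [
    ("geneA", "Glycosylation", 45),
    ("geneA", "Glycosylation", 80),
    ("geneA", "Disulfide bond", (60, 120)),
    ("geneA", "Chain", (20, 300)),
]
for annotation in new_modifications(features, existing):
    print(annotation)
```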

kimrutherford commented 3 months ago

glycosylation, add as modification, "glycosylated residue" (MOD:00693) but only if we don't have it from the same source.

Disulphide bond, add as "disulfide crosslinked residues" (MOD:00689) if we don't have it from the same source (we don't have many of these curated)

Ah, OK. I hadn't thought about those sorts of cases. That will take a bit longer because they'll need to be included in Chado and annotation. The data for the other new tracks we get from UniProt isn't going into Chado, it's just displayed in the protein feature viewer.

"Propeptide" and "chain" can be added as features

I can do those quickly.

kimrutherford commented 3 months ago

"Propeptide" and "chain" can be added as features

That's done for tomorrow.

https://desktop.kmr.nz/gene_protein_features/SPAC19G12.10c

image

kimrutherford commented 3 months ago

"Propeptide" and "chain" can be added as features

That's done for tomorrow.

Yet again I forgot to push my changes. I'll check again on Friday.

kimrutherford commented 3 months ago

"Propeptide" and "chain" can be added as features

That's on the main site now. Example: https://www.pombase.org/gene_protein_features/SPAC22E12.09c

kimrutherford commented 3 months ago

I've added Glycosylation and Disulfide bond from the UniProt data to the protein feature viewer. It was easier than adding SO features to Chado (which I'll do soon) and the file parsing code is needed both for displaying the features and for adding them to Chado.

This will be on pombase.org on Saturday morning. Unless I forgot to commit a change again.

https://desktop.kmr.nz/gene_protein_features/SPBC342.03

Are the track labels OK?

image

kimrutherford commented 3 months ago

This will be on pombase.org on Saturday morning.

Looks OK. Let me know if anything needs rewording:

https://www.pombase.org/gene_protein_features/SPBC342.03

ValWood commented 3 months ago

Existing glycosylation sites are shown in the modification section as "pink dots" so we should do the same with these imported ones

Screenshot 2024-08-10 at 09 20 07

ValWood commented 3 months ago

We probably also need a way to let the user know how to find the details of the modifications on the gene page etc

ValWood commented 3 months ago

Crosslinks are a special case, and we have not curated many of them: https://www.pombase.org/term/MOD:00033

I agree these should go on a separate track which shows the connections between them. So far we have only one with residues, and we only display this as individual residues

(to discuss on next week's call)
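
To make the difference between "individual residues" and "connections" concrete, here is a small text-only sketch. It assumes each crosslink is a (start, end) pair of 1-based residue positions; the function name and the example numbers are hypothetical and this has nothing to do with how the protein feature viewer actually draws tracks.

```python
def render_crosslink_track(length, bonds):
    """Draw one text row per bond: 'C' at both residues, '-' joining them."""
    rows = []
    for start, end in sorted(bonds):
        if not (1 <= start < end <= length):
            raise ValueError(f"bond {start}-{end} is outside 1..{length}")
        row = ["."] * length
        row[start - 1] = "C"
        row[end - 1] = "C"
        # Fill the positions strictly between the two bonded residues.
        for i in range(start, end - 1):
            row[i] = "-"
        rows.append("".join(row))
    return rows


# Hypothetical 10-residue protein with two crosslinks.
for row in render_crosslink_track(10, [(4, 9), (2, 6)]):
    print(row)
```

Showing only the individual residues would lose the pairing, which is the information a separate crosslink track is meant to keep.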

ValWood commented 3 months ago

Glycosylation sites, for example, here: https://www.pombase.org/gene_protein_features/SPAC22E12.09c. UniProt doesn't add anything: there are fewer sites, the term is less specific, and they aren't experimentally sourced.

ValWood commented 3 months ago

Maybe we keep them both (I see we describe modifications as 'curated') but if so we should put the tracks next to each other and use the same symbols

kimrutherford commented 3 months ago

Existing glycosylation sites are shown in the modification section as "pink dots" so we should do the same with these imported ones

I've changed the configuration so that should be fixed in the morning.

we should put the tracks next to each other and use the same symbols

I've done that too.

ValWood commented 3 months ago

Fab. Eventually we want to merge these onto the same track, I think...

kimrutherford commented 3 months ago

eventually we want to merge these onto the same track, I think...

That will help. I've made a change to put the glycosylation sites track directly under the modifications track to group them together. That will be visible tomorrow:

image

ValWood commented 3 months ago

Change mouse over to be more explicit

MODIFICATIONS

GLYCOSYLATION SITE - mouse over: "Inferred glycosylation sites imported from UniProt" (these should also appear in the modification section of the gene page)

Check if disulphide cross-links are displayed, and display them like this: https://www.uniprot.org/uniprotkb/Q01663/entry#ptm_processing

kimrutherford commented 3 months ago

(these should also appear in the modification section of the gene page)

So we should make PSI-MOD annotations for the glycosylation sites from UniProt? In that case we'll be able to remove the glycosylation sites track.

ValWood commented 3 months ago

So we should make PSI-MOD annotations for the glycosylation sites from UniProt? In that case we'll be able to remove the glycosylation sites track.

That's true, but we need to make sure we are clear about the source and the evidence code. Maybe we should display the evidence and reference? Maybe they can still be classed as manually curated. I'll ask Antonia about that.

kimrutherford commented 3 months ago

GLYCOSYLATION SITE - mouse over Inferred glycosylation sites imported from uniprot

I've added "Inferred" to the mouse overs.

Check if disulphide cross-links are displayed

They now look like: image

The changes will be on pombase.org soon.

ValWood commented 3 months ago

They now look like:

They look great!

kimrutherford commented 3 months ago

The changes will be on pombase.org soon.

I forgot that it will need an overnight load for those changes. So it will be on pombase.org in the morning.

kimrutherford commented 3 months ago

I forgot that it will need an overnight load for those changes. So it will be on pombase.org in the morning.

The improved disulphide cross-links display is now on pombase.org

I think this issue is done now (and it's long) so I'll close it and move the remaining task to a new issue:

kimrutherford commented 1 month ago

Check if disulphide cross-links are displayed

They now look like: image

While working on pombase/website#2203 I noticed a display option that allows us to display the disulphide cross-links like this:

image

https://www.pombase.org/gene/SPBC342.03

I think it looks better because it matches the modifications.

Do you think it's an improvement?

ValWood commented 1 month ago

Yes, that looks great!

kimrutherford commented 1 month ago

OK, I won't revert the change then. :-)

kimrutherford commented 1 week ago

I noticed that I had misconfigured the modification loading. That's now fixed, so we get one extra modification annotation from UniProt: https://www.pombase.org/term/MOD:00793