pombase / pombase-chado

PomBase code for accessing Chado
MIT License

PTM processing from Uniprot #1171

Closed ValWood closed 1 month ago

ValWood commented 4 months ago

I think we got this data in the Intermine grab, but did not use it. I remember these features being there.

Some of these will go on the modification track (like glycosylation), so it would be useful to get a full list of the feature types first to decide what to display, on which track, and how to provide provenance.

e.g. https://www.uniprot.org/uniprotkb/Q09175/entry#ptm_processing

kimrutherford commented 2 months ago

I think we got this data in the Intermine grab, but did not use it.

Can you remind me what that is? I don't remember getting any UniProt data.

ValWood commented 2 months ago

Added to Tuesday list

kimrutherford commented 1 month ago

I had a look at what's available from the UniProt API. We can get the coordinates for these:

It wouldn't take long to add any/all of these.
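As a rough illustration of getting those coordinates, the sketch below pulls the PTM/processing features out of a UniProt entry in the JSON form returned by the REST API. The `features`, `type`, `location`, `start`/`end` and `value` keys follow that JSON layout as far as I know; the function name, the chosen set of feature types and the sample entry are made up for this example and are not PomBase code.

```python
# Feature types to pick up; an assumption based on the types discussed in this issue.
PTM_FEATURE_TYPES = {"Chain", "Propeptide", "Glycosylation", "Disulfide bond"}


def extract_ptm_features(entry):
    """Return (type, start, end, description) tuples for PTM/processing features.

    `entry` is a dict parsed from a UniProt JSON entry. Features with a
    missing start or end position are skipped.
    """
    results = []
    for feature in entry.get("features", []):
        if feature.get("type") not in PTM_FEATURE_TYPES:
            continue
        location = feature.get("location", {})
        start = location.get("start", {}).get("value")
        end = location.get("end", {}).get("value")
        if start is None or end is None:
            continue
        results.append((feature["type"], start, end, feature.get("description", "")))
    return results


# Hypothetical, hand-written sample in the same shape as a UniProt JSON entry.
sample_entry = {
    "primaryAccession": "X00000",
    "features": [
        {"type": "Chain",
         "location": {"start": {"value": 21}, "end": {"value": 300}},
         "description": "Example protein"},
        {"type": "Glycosylation",
         "location": {"start": {"value": 45}, "end": {"value": 45}},
         "description": "N-linked (GlcNAc...) asparagine"},
        {"type": "Disulfide bond",
         "location": {"start": {"value": 60}, "end": {"value": 112}}},
        {"type": "Domain",
         "location": {"start": {"value": 30}, "end": {"value": 90}}},
    ],
}

for row in extract_ptm_features(sample_entry):
    print(row)
```

The "Domain" feature in the sample is ignored because it is not in the set of types being collected; adding another type to the set is the only change needed to pick it up.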

ValWood commented 1 month ago

glycosylation, add as modification, "glycosylated residue" (MOD:00693) but only if we don't have it from the same source.

Disulphide bond, add as "disulfide crosslinked residues" (MOD:00689) if we don't have it from the same source (we don't have many of these curated)

"Propeptide" and "chain" can be added as features
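One way to read the rules above is as a small lookup from UniProt feature type to PSI-MOD term, plus a check against what is already annotated. The sketch below assumes that "don't have it" means an existing annotation for the same gene, position and term; the data structures, the function name and the gene ID `GENE1` are hypothetical.

```python
# PSI-MOD terms named in the comment above.
FEATURE_TYPE_TO_MOD = {
    "Glycosylation": "MOD:00693",   # glycosylated residue
    "Disulfide bond": "MOD:00689",  # disulfide crosslinked residues
}


def new_modifications(gene, uniprot_features, existing):
    """Return (gene, position, term) tuples that are not already annotated.

    `uniprot_features` holds (feature_type, start, end) tuples and `existing`
    is a set of (gene, position, term) tuples for current annotations.
    """
    added = []
    for feature_type, start, end in uniprot_features:
        term = FEATURE_TYPE_TO_MOD.get(feature_type)
        if term is None:
            continue  # e.g. Chain and Propeptide are handled as features instead
        # A disulfide bond links two residues; a glycosylation site has start == end.
        for position in sorted({start, end}):
            key = (gene, position, term)
            if key not in existing and key not in added:
                added.append(key)
    return added


# Hypothetical input: one site already annotated, one disulfide bond that is new.
features = [("Glycosylation", 45, 45), ("Disulfide bond", 60, 112), ("Chain", 21, 300)]
existing = {("GENE1", 45, "MOD:00693")}
print(new_modifications("GENE1", features, existing))
```

A match on a more specific term at the same position is not handled here; a real check would probably need to take the ontology into account.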

Thanks!

kimrutherford commented 1 month ago

glycosylation, add as modification, "glycosylated residue" (MOD:00693) but only if we don't have it from the same source.

Disulphide bond, add as "disulfide crosslinked residues" (MOD:00689) if we don't have it from the same source (we don't have many of these curated)

Ah, OK. I hadn't thought about those sorts of cases. That will take a bit longer because they'll need to be included in Chado and annotation. The data for the other new tracks we get from UniProt isn't going into Chado; it's just displayed in the protein feature viewer.

"Propeptide" and "chain" can be added as features

I can do those quickly.

kimrutherford commented 1 month ago

"Propeptide" and "chain" can be added as features

That's done for tomorrow.

https://desktop.kmr.nz/gene_protein_features/SPAC19G12.10c

image

kimrutherford commented 1 month ago

"Propeptide" and "chain" can be added as features

That's done for tomorrow.

Yet again I forgot to push my changes. I'll check again on Friday.

kimrutherford commented 1 month ago

"Propeptide" and "chain" can be added as features

That's on the main site now. Example: https://www.pombase.org/gene_protein_features/SPAC22E12.09c

kimrutherford commented 1 month ago

I've added Glycosylation and Disulfide bond from the UniProt data to the protein feature viewer. It was easier than adding SO features to Chado (which I'll do soon), and the file parsing code is needed both for displaying the features and for adding them to Chado.

This will be on pombase.org on Saturday morning. Unless I forgot to commit a change again.

https://desktop.kmr.nz/gene_protein_features/SPBC342.03

Are the track labels OK?

image

kimrutherford commented 1 month ago

This will be on pombase.org on Saturday morning.

Looks OK. Let me know if anything needs rewording:

https://www.pombase.org/gene_protein_features/SPBC342.03

ValWood commented 1 month ago

Existing glycosylation sites are shown in the modification section as "pink dots" so we should do the same with these imported ones

Screenshot 2024-08-10 at 09 20 07

ValWood commented 1 month ago

We probably also need a way to let the user know how to find the details of the modifications on the gene page, etc.

ValWood commented 1 month ago

Crosslinks are a special case, and we have not curated many of them. https://www.pombase.org/term/MOD:00033

I agree these should go on a separate track which shows the connections between them. So far we have only one with residues, and we only display this as individual residues.

(to discuss on next week's call)
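To draw connections rather than individual residues, each disulfide bond has to be kept as a pair of positions. A minimal sketch, assuming the bonds arrive as (feature_type, start, end) tuples where start and end are the two linked cysteines; the function name and the sample positions are hypothetical and this is not the viewer's actual code.

```python
def disulfide_connections(features):
    """Return sorted (first, second) residue pairs, one per disulfide bond."""
    pairs = set()
    for feature_type, start, end in features:
        if feature_type != "Disulfide bond":
            continue
        if start == end:
            # Single position: assumed to mean the partner is in another chain,
            # so there is no connection to draw within this protein.
            continue
        pairs.add((min(start, end), max(start, end)))
    return sorted(pairs)


# Hypothetical positions, for illustration only.
sample = [
    ("Disulfide bond", 60, 112),
    ("Disulfide bond", 150, 95),
    ("Disulfide bond", 200, 200),
    ("Glycosylation", 45, 45),
]
print(disulfide_connections(sample))
```

Keeping the pair together means a track can draw a line between the two residues, while the individual-residue display can still be derived by flattening the pairs.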

ValWood commented 1 month ago

For glycosylation sites, for example, here: https://www.pombase.org/gene_protein_features/SPAC22E12.09c UniProt doesn't add anything: there are fewer sites, the term is less specific, and they aren't experimentally sourced.

ValWood commented 1 month ago

Maybe we keep them both (I see we describe modifications as 'curated'), but if so we should put the tracks next to each other and use the same symbols.

kimrutherford commented 1 month ago

Existing glycosylation sites are shown in the modification section as "pink dots" so we should do the same with these imported ones

I've changed the configuration so that should be fixed in the morning.

we should put the tracks next to each other and use the same symbols

I've done that too.

ValWood commented 1 month ago

Fab. Eventually we want to merge these onto the same track, I think...

kimrutherford commented 1 month ago

eventually we want to merge these onto the same track, I think...

That will help. I've made a change to put the glycosylation sites track directly under the modifications track to group them together. That will be visible tomorrow:

image

ValWood commented 1 month ago

Change mouse over to be more explicit

MODIFICATIONS

GLYCOSYLATION SITE - mouse over: "Inferred glycosylation sites imported from UniProt" (these should also appear in the modification section of the gene page)

Check if disulphide cross-links are displayed, and display them like this: https://www.uniprot.org/uniprotkb/Q01663/entry#ptm_processing

kimrutherford commented 1 month ago

(these should also appear in the modification section of the gene page)

So we should make PSI-MOD annotations for the glycosylation sites from UniProt? In that case we'll be able to remove the glycosylation sites track.

ValWood commented 1 month ago

So we should make PSI-MOD annotations for the glycosylation sites from UniProt? In that case we'll be able to remove the glycosylation sites track.

That's true, but we need to make sure we are clear about the source and the evidence code. Maybe we should display the evidence and reference? Maybe they can still be classed as manually curated. I'll ask Antonia about that.

kimrutherford commented 1 month ago

GLYCOSYLATION SITE - mouse over Inferred glycosylation sites imported from uniprot

I've added "Inferred" to the mouse overs.

Check if disulphide cross-links are displayed

They now look like: image

The changes will be on pombase.org soon.

ValWood commented 1 month ago

They now look like:

They look great!

kimrutherford commented 1 month ago

The changes will be on pombase.org soon.

I forgot that it will need an overnight load for those changes. So it will be on pombase.org in the morning.

kimrutherford commented 1 month ago

I forgot that it will need an overnight load for those changes. So it will be on pombase.org in the morning.

The improved disulphide cross-links display is now on pombase.org

I think this issue is done now (and it's long) so I'll close it and move the remaining task to a new issue: