glygener / glygen-issues

Repository for public GlyGen tickets
GNU General Public License v3.0

ABO enzyme in the Sandbox #1381

Open katewarner opened 6 months ago

katewarner commented 6 months ago

I'm looking at the enzyme ABO (https://api.glygen.org/protein/detail/P16442-1). According to UniProt and GlycoEnzDB, ABO catalyses very specific reactions: it adds a GalNAc or Gal to a Gal-Fuc.

There doesn't seem to be any evidence that it catalyses any other glycosyltransferase reaction. In GlyGen, however, it has a long list of synthesized glycans (907 glycans), and in many of them it is listed as an enzyme that adds a GalNAc or Gal to a monosaccharide that is not a Gal-Fuc. For some of these glycans (e.g. G00995HW, G99853JR, G99738HT), Karina and I noticed that ABO doesn't appear as an enzyme in the Sandbox or in the Glycan Feature viewer section of the Glycan detail page (e.g. https://www.tst.glygen.org/glycan/G00995HW#Feature-View), but it is listed in the Biosynthetic enzyme section (e.g. https://tst.glygen.org/glycan/G00995HW#Biosynthetic-Enzymes).

According to Karina, the Biosynthetic enzyme section comes from the Sandbox API, so we were wondering whether the Sandbox API has been, or needs to be, updated, since the mapping of this enzyme to these glycans doesn't seem to be correct.

edwardsnj commented 6 months ago

Your analysis of the ABO gene's activity matches my understanding, and the sandbox user interface, JSON API, and feature view all reflect it. This is an application of the "rules" infrastructure of the sandbox. I'm not sure exactly what form the data file being processed at GW takes (it is not available to me at data.glygen.org), so I can't check whether the issue is in this input file or in the processing script. The JSON document for the page (see https://sandbox.glyomics.org/api/glycan-v5.php/G00995HW) shows that ABO would be a rule violation, and the residue does not list ABO as an enzyme associated with it. I'm going to see if I can figure out which web-service API call is being used so I can check it there too.
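
For anyone who wants to repeat the check on other accessions, here is a minimal sketch. It makes no assumption about the layout of the sandbox JSON; it simply walks the whole document and reports where a gene name occurs as a string value. The helper names `fetch_glycan` and `find_mentions` are hypothetical, made up for this illustration, and are not part of the sandbox.

```python
import json
from urllib.request import urlopen

# URL pattern taken from the link quoted in this ticket.
SANDBOX_GLYCAN_URL = "https://sandbox.glyomics.org/api/glycan-v5.php/{accession}"


def fetch_glycan(accession):
    """Download the sandbox JSON document for one accession (hypothetical helper)."""
    with urlopen(SANDBOX_GLYCAN_URL.format(accession=accession)) as response:
        return json.load(response)


def find_mentions(node, needle, path="$"):
    """Yield the path of every string value in `node` that equals `needle`.

    The walk is schema-agnostic: no field names of the real document are assumed.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_mentions(value, needle, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_mentions(value, needle, f"{path}[{index}]")
    elif isinstance(node, str) and node == needle:
        yield path
```

Calling `list(find_mentions(fetch_glycan("G00995HW"), "ABO"))` lists every place ABO is named, which should make it easy to see whether it shows up under a residue's enzymes or only in the rule-violation part of the document.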

katewarner commented 6 months ago

Thank you, that's great! If it helps, Karina said the file we pull from the API is located here: /data/projects/glygen/downloads/sandbox/current/glycotree_annotated_glycans.tsv. Urnisha would probably have a better idea of how this file is generated, so I can add her to the ticket, as she deals with the glycans.

edwardsnj commented 6 months ago

OK, yes, I can confirm that the https://sandbox.glyomics.org/api/glygenData.php web-service implemented in the Sandbox for GlyGen does not pay any attention to the "rules" infrastructure of the sandbox. It was probably established by Will before the rules concept was widely applied. I will implement a script to pre-compute a dump of such a table that considers the rules (which describe these types of exceptions). Please put it on the agenda for tomorrow morning so we can discuss.

ReneRanzinger commented 6 months ago

We will need to discuss this at the F2F.

edwardsnj commented 6 months ago

I have checked the file glycotree_annotated_glycans.tsv.gz into the sandbox repository. You should use this data file rather than the web service to pull down the relationships between GlyTouCan accessions and enzymes for GlyGen (assuming you want a rules-sensitive version of the information).
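
A minimal sketch of reading such a gzipped TSV into an accession-to-enzymes map follows. The column names are not stated in this ticket, so they are passed in as parameters and must be taken from the file's actual header; `load_enzymes_by_glycan` is a hypothetical helper, not an existing GlyGen or sandbox function.

```python
import csv
import gzip
from collections import defaultdict


def load_enzymes_by_glycan(path, glycan_col, enzyme_col):
    """Map each glycan accession to the set of enzymes listed for it.

    `glycan_col` and `enzyme_col` are whatever the file's header row calls
    those columns; they are assumptions supplied by the caller.
    """
    enzymes = defaultdict(set)
    with gzip.open(path, "rt", newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            glycan = row.get(glycan_col)
            enzyme = row.get(enzyme_col)
            # Skip rows with no accession or no enzyme annotation.
            if glycan and enzyme:
                enzymes[glycan].add(enzyme)
    return dict(enzymes)
```

With the map loaded, a check such as whether ABO is in the set for G00995HW is a single membership test, so the cases raised at the top of this ticket can be spot-checked directly.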