Open ubhuiyan opened 2 weeks ago
@kmartinez834, propagating glycosylation that was reported at the composition level down to the GlyTouCan level seems problematic to me. Many GlyTouCan accessions share the same composition, so a composition alone does not identify a single accession.
We have similar data in unreviewed/human_proteoform_glycosylation_sites_unicarbkb.csv where saccharide="" and composition has a value, and I don't think we pass those glycosylation records to any GlyTouCan accession. Please look into this carefully.
$ cat unreviewed/human_proteoform_glycosylation_sites_unicarbkb.csv | awk -F"\",\"" '{print $4","$18}' |grep ^, |sort -u |head
,
,comp_HexNAc0Hex0dHex0NeuAc0NeuGc0Pent0S0P0KDN0HexA0
,comp_HexNAc1HexdHex0NeuAc1NeuGc0Pent0S0P0KDN0HexA0
,comp_HexNAc2Hex3dHex0NeuAc0Gc0Pent0S0P0KDN0HexA0
,comp_HexNAc2Hex3dHex1NeuAc0Gc0Pent0S0P0KDN0HexA0
,comp_HexNAc2Hex4dHex1NeuAc0Gc0Pent0S0P0KDN0HexA0
,comp_HexNAc2Hex5dHex0NeuAc0Gc0Pent0S0P0KDN0HexA0
,comp_HexNAc2Hex5dHex1NeuAc0Gc0Pent0S0P0KDN0HexA0
,comp_HexNAc2Hex6dHex0NeuAc0Gc0Pent0S0P0KDN0HexA0
,comp_HexNAc2Hex7dHex0NeuAc0Gc0Pent0S0P0KDN0HexA0
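For anyone who wants to repeat this check without relying on positional awk fields, here is a minimal Python sketch. It assumes the CSV has header columns literally named `saccharide` and `composition` (inferred from the comment above, not verified against the file), and unlike the awk command it skips rows where both fields are empty. The sample rows are made up for illustration.

```python
import csv
import io


def compositions_without_saccharide(handle, sac_col="saccharide", comp_col="composition"):
    """Return sorted unique composition values from rows whose saccharide field is empty.

    The column names are assumptions; adjust them to the real header.
    """
    reader = csv.DictReader(handle)
    found = set()
    for row in reader:
        sac = (row.get(sac_col) or "").strip()
        comp = (row.get(comp_col) or "").strip()
        if not sac and comp:
            found.add(comp)
    return sorted(found)


# Made-up sample rows; the accession and composition values are placeholders.
sample = io.StringIO(
    '"uniprotkb_ac","saccharide","composition"\n'
    '"PROT_1","","comp_A"\n'
    '"PROT_2","HYPOTHETICAL_ACC","comp_B"\n'
    '"PROT_3","","comp_A"\n'
    '"PROT_4","",""\n'
)
print(compositions_without_saccharide(sample))
```

To run it on the real file, open the CSV with `open(path, newline="")` and pass the handle in place of `sample`.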
@rykahsay I can confirm that Glytoucan to Byonic name is 1:1 in names.tsv
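A 1:1 claim like this can be re-checked with a short sketch such as the one below. It works on generic (accession, name) pairs; how those pairs are extracted from names.tsv is left out because the column layout of that file is not shown here. The function name and the sample pairs are hypothetical.

```python
from collections import defaultdict


def one_to_one_violations(pairs):
    """Return (left keys with several right values, right keys with several left values).

    Both dictionaries are empty when the mapping is strictly 1:1.
    """
    left_to_right = defaultdict(set)
    right_to_left = defaultdict(set)
    for left, right in pairs:
        left_to_right[left].add(right)
        right_to_left[right].add(left)
    multi_left = {k: sorted(v) for k, v in left_to_right.items() if len(v) > 1}
    multi_right = {k: sorted(v) for k, v in right_to_left.items() if len(v) > 1}
    return multi_left, multi_right


# Placeholder pairs: two accessions share one name, so the mapping is not 1:1.
pairs = [("ACC_1", "name_a"), ("ACC_2", "name_b"), ("ACC_3", "name_b")]
print(one_to_one_violations(pairs))
```

The same check applied to (composition, accession) pairs would surface the many-to-one case raised at the top of this issue.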
Update the "src_xref_key" and "src_xref_id" values for all rows:
human_proteoform_glycosylation_sites_embl.csv:
"src_xref_key","src_xref_id"
"protein_xref_glygen_ds","GLY_000888"
mouse_proteoform_glycosylation_sites_embl.csv:
"src_xref_key","src_xref_id"
"protein_xref_glygen_ds","GLY_000889"
Done. Check the datasets for now; the effect will propagate after I generate the JSON objects and push them to tst.
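For reference, the requested update (the same "src_xref_key" and "src_xref_id" on every row) can be sketched as below. This is not necessarily how the change was actually made; `set_src_xref` is a hypothetical helper, and the sample row is made up. It assumes every row has the same number of fields as the header.

```python
import csv
import io


def set_src_xref(in_handle, out_handle, xref_key, xref_id):
    """Copy a CSV, overwriting src_xref_key and src_xref_id on every row.

    Returns the number of data rows written. Missing columns are appended.
    """
    reader = csv.DictReader(in_handle)
    fieldnames = list(reader.fieldnames or [])
    for col in ("src_xref_key", "src_xref_id"):
        if col not in fieldnames:
            fieldnames.append(col)
    writer = csv.DictWriter(
        out_handle, fieldnames=fieldnames, quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    writer.writeheader()
    count = 0
    for row in reader:
        row["src_xref_key"] = xref_key
        row["src_xref_id"] = xref_id
        writer.writerow(row)
        count += 1
    return count


# Made-up input row; only the two xref values come from the request above.
src = io.StringIO(
    '"uniprotkb_ac","src_xref_key","src_xref_id"\n'
    '"PROT_1","old_key","old_id"\n'
)
dst = io.StringIO()
set_src_xref(src, dst, "protein_xref_glygen_ds", "GLY_000888")
print(dst.getvalue())
```

For the mouse file the call would use "GLY_000889" instead, writing to a temporary file and replacing the original only after the row count matches.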
Source = glygen_upload.csv; Output = *_proteoform_glycosylation_sites_embl.csv
Mapping Files: unreviewed/human_protein_masterlist.csv, unreviewed/mouse_protein_masterlist.csv, misc/n_sequon_info.csv, unreviewed/*_protein_glycosylation_motifs.csv, glytoucan/current/export/names.tsv
Output Files: human_proteoform_glycosylation_sites_embl.csv, mouse_proteoform_glycosylation_sites_embl.csv
(or gene_name if uniprotkb_ac can't be mapped)
Ex. HexNAc(2)Hex(8) % 1702.5814
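A string shaped like the example above can be split into residue counts and the trailing number with a small sketch like this. It assumes the format is always `Residue(count)...` followed by `%` and a number; what that number represents is not stated here, so the code simply returns it as a float. The function name is hypothetical.

```python
import re

# One residue token: a name starting with a letter, then a count in parentheses.
_RESIDUE = re.compile(r"([A-Za-z][A-Za-z0-9]*)\((\d+)\)")


def parse_composition_string(name):
    """Split 'HexNAc(2)Hex(8) % 1702.5814' into ({'HexNAc': 2, 'Hex': 8}, 1702.5814).

    The trailing value is None when there is no '%' part.
    """
    comp_part, sep, value_part = name.partition("%")
    counts = {m.group(1): int(m.group(2)) for m in _RESIDUE.finditer(comp_part)}
    value = float(value_part) if sep and value_part.strip() else None
    return counts, value


print(parse_composition_string("HexNAc(2)Hex(8) % 1702.5814"))
```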
Example:
Input File:
Output File: