Open ReneRanzinger opened 1 year ago
I think we were expecting a table from Sriram's PhD student
@ubhuiyan Download https://github.com/neel-lab/GlycoEnzOnto/blob/main/finishedGlycogenes.xlsx from the GitHub repository
Copy the UniProt ACs from the UniProt field and paste them into the list search in UniProt. Customize the table with the columns shown below.
Filter for glycosyltransferases (GT, EC 2.4.1.X) and compare the result with https://data.glygen.org/ln2data/releases/data/v-2.3.1/reviewed/human_protein_glycosyltransferase.csv
Provide a list of UniProt ACs that are new and missing from the Glycogenes list.
Additional Notes: Add xref to GlycoEnzOnto
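A minimal sketch of the EC filter mentioned above, assuming the EC numbers arrive as a single text field with multiple values separated by semicolons (an assumption about the export format, not something confirmed in this thread). The helper name `is_ec_2_4_1` is hypothetical. Note that the pattern only matches the 2.4.1 subclass, so enzymes in other EC 2.4 subclasses would not be picked up.

```python
import re

# Matches EC 2.4.1.<number>, preliminary numbers such as 2.4.1.n3,
# and partially assigned entries written as 2.4.1.-
EC_2_4_1 = re.compile(r"^2\.4\.1\.(\d+|n\d+|-)$")

def is_ec_2_4_1(ec_field):
    """Return True if any EC number in a ';'-separated field is in subclass 2.4.1."""
    return any(EC_2_4_1.match(ec.strip()) for ec in ec_field.split(";"))

# Illustrative values only.
for field in ["2.4.1.38; 2.4.1.90", "2.4.1.-", "3.2.1.23", ""]:
    print(repr(field), is_ec_2_4_1(field))
```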
Curation Steps:
1. Copy all UniProt accessions within the finishedGlycogenes dataset and paste them into the list search in UniProtKB.
2. Select "Swiss-Prot" for accessions that have been reviewed.
3. Select "Customize Columns" and click the following:
   - From
   - Entry
   - Organism
   - Gene Name
   - Protein Name
   - EC Number
   - BRENDA cross reference
   - CAZy cross reference
4. Download the dataset as a CSV.
5. Send to Jeet for checkpoint/QC.
6. Compare UniProt accessions between the downloaded dataset and human_protein_glycosyltransferase to identify accessions unique to the downloaded dataset.
7. Create a column to indicate the unique UniProt accessions in the downloaded dataset.
8. Send to Jeet for review.
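Steps 6 and 7 could be scripted roughly as follows. This is only a sketch: the column names `Entry` and `uniprotkb_canonical_ac` are assumptions about the two CSV headers, the isoform suffix stripping (`A00001-1` to `A00001`) is an assumption about how the GlyGen file writes accessions, and the in-memory rows are made-up placeholders standing in for the real files.

```python
import csv
import io

def read_accessions(handle, column):
    """Collect non-empty accessions from `column`, dropping any isoform suffix."""
    accessions = set()
    for row in csv.DictReader(handle):
        value = (row.get(column) or "").strip()
        if value:
            accessions.add(value.split("-")[0])
    return accessions

def flag_unique(rows, known, column="Entry", flag="unique_to_download"):
    """Return copies of `rows` with a yes/no column marking accessions absent from `known`."""
    return [{**row, flag: "no" if row[column].strip() in known else "yes"} for row in rows]

# Made-up placeholder data, not real UniProt or GlyGen records.
downloaded = io.StringIO("Entry,Gene Names\nA00001,GENE1\nA00002,GENE2\n")
glygen = io.StringIO("uniprotkb_canonical_ac,gene_symbol\nA00001-1,GENE1\n")

known = read_accessions(glygen, "uniprotkb_canonical_ac")
flagged = flag_unique(list(csv.DictReader(downloaded)), known)
new_accessions = [row["Entry"] for row in flagged if row["unique_to_download"] == "yes"]
print(new_accessions)
```

For the real files, the `io.StringIO` objects would be replaced by `open(path, newline="")` handles, and the flagged rows could be written back out with `csv.DictWriter`.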
This task has been completed. I emailed Jeet this comparison soon after joining the team.
@jeet-vora is there something we need to talk about? Anything we can do or is it a dead end in terms of data integration or linking?
@ubhuiyan
Can you add this item to our morning meeting for discussion? I went on vacation around the time you might have sent it, so I need to review it again.
@katewarner
The dataset https://data.glygen.org/GLY_000922 has not been processed correctly. There are repeated accessions with different gene names. The source file has 403 rows, but the processed one has more than 403 rows. Also, some of the headers from the source file are missing. Can you investigate and create a ticket for Robel?
Source dataset: https://github.com/neel-lab/GlycoEnzOnto/blob/main/finishedGlycogenes.xlsx from the GitHub repository
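A quick check for the problems described above (extra rows, accessions paired with more than one gene name) might look like the sketch below. It assumes both files have been exported to CSV; the column names `accession` and `gene_name` and the sample rows are hypothetical placeholders, not the actual headers or content of GLY_000922.

```python
import csv
import io
from collections import defaultdict

def genes_per_accession(handle, ac_col, gene_col):
    """Return (row_count, {accession: set of gene names}) for a CSV handle."""
    genes = defaultdict(set)
    row_count = 0
    for row in csv.DictReader(handle):
        row_count += 1
        genes[row[ac_col].strip()].add(row[gene_col].strip())
    return row_count, genes

def conflicting(genes):
    """Keep only accessions that map to more than one gene name."""
    return {ac: sorted(names) for ac, names in genes.items() if len(names) > 1}

# Made-up placeholder rows standing in for the processed dataset.
processed = io.StringIO(
    "accession,gene_name\nA00001,GENE1\nA00001,GENE1B\nA00002,GENE2\n"
)
row_count, genes = genes_per_accession(processed, "accession", "gene_name")
conflicts = conflicting(genes)
print(row_count, len(genes), conflicts)
```

Running the same function on the source file and comparing the two row counts, plus `set(reader.fieldnames)` for each file, would show how many rows were added and which headers were dropped.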
Check the attached dataset for new GTs. Once verified, add them to our human_protein_glycosyltransferase dataset. Each new GT should show evidence, and the entry needs to be reviewed before it is added to our list. I can review the terms to be added once you have evaluated them. Final Glycosyltransferase.csv
From Sriram Neelamegham:
There is a GitHub repository for it.
I am not sure if @rajamazumder and @mtiemeyer0919 already made a decision.
@jeet-vora can you please have a look so we can discuss this in one of the Wednesday meetings?