kmartinez834 closed 6 months ago
BCO ID | Dataset Name | Creator Name | Notes |
---|---|---|---|
GLY_000965 | chicken_proteoform_glycosylation_sites_uniprotkb.csv | Urnisha | |
GLY_000966 | chicken_protein_ncbi_linkouts.csv | Urnisha | |
GLY_000967 | chicken_protein_genelocus.csv | Urnisha | |
GLY_000993 | chicken_protein_info_uniprotkb.csv | Urnisha | |
GLY_000994 | chicken_protein_genenames_uniprotkb.csv | Urnisha | |
GLY_000995 | chicken_protein_function_refseq.csv | Urnisha | |
GLY_000989 | chicken_protein_citations_uniprotkb.csv | Kate | |
GLY_000990 | chicken_protein_altnames.csv | Kate | |
GLY_001019 | human_proteoform_citations_glycosylation_sites_embl.csv | Karina | |
GLY_000996 | chicken_protein_xref_oglcnac_atlas.csv | Urnisha | |
GLY_000991 | chicken_protein_xref_rhea.csv | Kate | |
GLY_001024 | chicken_protein_glycohydrolase.csv | Jingyue | |
GLY_000979 | chicken_proteoform_glycosylation_sites_literature.csv | Jingyue | |
GLY_000997 | chicken_protein_info_refseq.csv | Urnisha | |
GLY_001021 | mouse_proteoform_citations_glycosylation_sites_embl.csv | Karina | |
GLY_000998 | chicken_protein_masterlist.csv | Urnisha | |
GLY_000960 | human_proteoform_glycosylation_sites_diabetes_glycomic.csv | Karina | |
GLY_000980 | chicken_proteoform_glycosylation_sites_glyconnect.csv | Jingyue | |
GLY_000982 | chicken_proteoform_citations_phosphorylation_sites_uniprotkb.csv | Jingyue | |
GLY_000992 | chicken_protein_xref_brenda.csv | Kate | |
GLY_001001 | chicken_protein_xref_bgee.csv | Kate | |
GLY_000889 | mouse_proteoform_glycosylation_sites_embl.csv | Karina | |
GLY_001002 | chicken_protein_proteinnames_refseq.csv | Kate | |
GLY_000983 | chicken_protein_xref_refseq.csv | Jingyue | |
GLY_000984 | chicken_protein_transcriptlocus.csv | Jingyue | |
GLY_000985 | chicken_protein_xref_kegg.csv | Jingyue | |
GLY_000986 | chicken_protein_go_annotation.csv | Jingyue | |
GLY_001003 | chicken_proteoform_citations_glycation_sites_uniprotkb.csv | Kate | |
GLY_001027 | human_proteoform_citations_glycosylation_sites_pdc_ccrc.csv | Karina | |
GLY_000987 | chicken_protein_participants_reactome.csv | Jingyue | |
GLY_001008 | chicken_protein_submittednames.csv | Jingyue | |
GLY_001009 | chicken_proteoform_glycosylation_sites_pdb.csv | Jingyue | |
GLY_001006 | chicken_protein_xref_intact.csv | Kate | |
GLY_001012 | chicken_protein_xref_cdd.csv | Jingyue | I didn't find the template on data.glygen.org |
GLY_001010 | chicken_protein_citations_refseq.csv | Jingyue | |
GLY_001013 | chicken_protein_recnames.csv | Jingyue | |
GLY_001015 | chicken_proteoform_citations_glycosylation_sites_uniprotkb.csv | Jingyue | |
GLY_001022 | chicken_protein_reactions_reactome.csv | Jingyue | |
GLY_001020 | chicken_protein_xref_geneid.csv | Jingyue | |
GLY_001023 | chicken_proteoform_phosphorylation_sites_uniprotkb.csv | Jingyue | |
GLY_000888 | human_proteoform_glycosylation_sites_embl.csv | Karina | |
GLY_001025 | chicken_protein_reactions_rhea.csv | Jingyue | |
GLY_000961 | human_proteoform_glycosylation_sites_pdc_ccrc.csv | Karina | |
GLY_000978 | chicken_protein_xref_orthodb.csv | Luke | |
GLY_000977 | chicken_protein_pro_annotation.csv | Luke | |
 | chicken_protein_signalp_peptidesequences.fasta | | Not required. Part of chicken_protein_signalp_annotation.csv |
GLY_000976 | chicken_proteoform_glycosylation_sites_literature_mining.csv | Luke | |
GLY_001007 | chicken_protein_site_annotation_uniprotkb.csv | Kate | |
GLY_000975 | chicken_protein_binary_interactions.csv | Luke | |
GLY_000974 | chicken_protein_xref_pro.csv | Luke | |
GLY_000973 | chicken_protein_xref_glyconnect.csv | Luke | |
GLY_000972 | chicken_protein_glycosylation_motifs.csv | Luke | |
GLY_000971 | chicken_protein_canonicalsequences.fasta | Luke | |
GLY_001016 | human_proteoform_citations_glycosylation_sites_diabetes_glycomic.csv | Karina | |
 | chicken_protein_signalp_fullsequences.fasta | | Not required. Part of chicken_protein_signalp_annotation.csv |
GLY_000970 | chicken_protein_xref_oma.csv | Luke | |
 | chicken_protein_signalp_cleavedsequences.fasta | | Not required. Part of chicken_protein_signalp_annotation.csv |
GLY_000969 | chicken_protein_xref_panther.csv | Luke | |
GLY_001011 | chicken_proteoform_citations_glycosylation_sites_oglcnac_mcw.csv | Kate | |
GLY_000968 | chicken_protein_pathways_reactome.csv | Luke | |
GLY_001014 | chicken_protein_signalp_annotation.csv | Kate | |
GLY_001026 | chicken_proteoform_citations_phosphorylation_sites_iptmnet.csv | Kate | |
GLY_001028 | chicken_protein_citations_reactome.csv | Cyrus | |
GLY_001029 | chicken_protein_enzyme_annotation_uniprotkb.csv | Cyrus | |
GLY_001030 | chicken_protein_ptm_annotation_uniprotkb.csv | Cyrus | |
GLY_001031 | chicken_protein_xref_pfam.csv | Cyrus | |
GLY_001032 | chicken_proteoform_citations_glycosylation_sites_oglcnac_atlas.csv | Cyrus | |
GLY_001033 | chicken_proteoform_phosphorylation_sites_iptmnet.csv | Cyrus | |
GLY_001034 | chicken_protein_ntdata.nt | Cyrus | |
GLY_001035 | chicken_protein_function_uniprotkb.csv | Cyrus | |
GLY_001036 | chicken_proteoform_glycosylation_sites_literature_mining_manually_verified.csv | Cyrus | |
GLY_001018 | chicken_protein_genenames_refseq.csv | Kate | |
GLY_001037 | chicken_protein_xref_uniprotkb.csv | Cyrus | |
GLY_001038 | chicken_protein_participants_rhea.csv | Cyrus | |
GLY_001039 | chicken_protein_sequenceinfo.csv | Cyrus | |
GLY_000999 | chicken_protein_glycosyltransferase.csv | Urnisha | |
GLY_001000 | chicken_protein_xref_pdb.csv | Urnisha | |
GLY_001004 | chicken_protein_xref_interpro.csv | Urnisha | |
GLY_001005 | chicken_protein_allsequences.fasta | Urnisha | |
GLY_001040 | chicken_protein_xref_cazy.csv | Cyrus | |
GLY_001041 | chicken_protein_xref_reactome.csv | Cyrus | |
GLY_001042 | chicken_protein_xref_chembl.csv | Cyrus | |
GLY_001045 | human_proteoform_ml_ready_diabetes_glycomic.csv | Karina | |
GLY_001046 | human_proteoform_ml_ready_pdc_ccrc.csv | Karina | |
GLY_001044 | yeast_protein_pathways_reactome.csv | Karina | |
GLY_001043 | yeast_protein_xref_reactome.csv | Karina | |
GLY_001047 | rat_protein_matrixdb.csv | Karina | |
GLY_001048 | rat_protein_citations_matrixdb.csv | Karina |
Please claim a chunk of these BCOs to create by the end of next week (5/17).
Perhaps it would be good to add another column to the table, where we can add our names to the BCOs we're working on?
@katewarner you can use the "Creator name" column for those you'll work on, thanks!
@kmartinez834 Sorry, I could only see the first two columns in the ticket - I'm an idiot :-D
Please claim at least 12 BCOs by Monday 05/13 or they will be automatically assigned to you.
@jeet-vora I accidentally created the Chicken Glycosylation Sites (GlyConnect) BCO twice. Can you please delete https://biocomputeobject.org/GLY_000981/DRAFT?
@JingyueWu I do not have permission to delete it. @tiwa1154 has done it.
Make sure BCOs are in line with #1091
@JingyueWu @CyrusAY can you take a look at your BCOs below? The following are resulting in errors:
$ python3 /software/glygen/check-bco2filename-mapping.py | grep ERROR
NO-BCO,chicken_protein_glycohydrolase.csv,ERROR,in_fs
NO-BCO,chicken_proteoform_glycosylation_sites_literature.csv,ERROR,in_fs
NO-BCO,chicken_protein_submittednames.csv,ERROR,in_fs
NO-BCO,chicken_proteoform_glycosylation_sites_pdb.csv,ERROR,in_fs
NO-BCO,chicken_protein_xref_cdd.csv,ERROR,in_fs
NO-BCO,chicken_protein_citations_refseq.csv,ERROR,in_fs
NO-BCO,chicken_protein_recnames.csv,ERROR,in_fs
NO-BCO,chicken_proteoform_citations_glycosylation_sites_uniprotkb.csv,ERROR,in_fs
NO-BCO,chicken_protein_reactions_reactome.csv,ERROR,in_fs
NO-BCO,chicken_protein_xref_geneid.csv,ERROR,in_fs
NO-BCO,chicken_proteoform_phosphorylation_sites_uniprotkb.csv,ERROR,in_fs
NO-BCO,chicken_protein_reactions_rhea.csv,ERROR,in_fs
NO-BCO,chicken_proteoform_citations_glycosylation_sites_oglcnac_atlas.csv,ERROR,in_fs
NO-BCO,chicken_proteoform_glycosylation_sites_literature_mining_manually_verified.csv,ERROR,in_fs
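For anyone triaging a listing like the one above, here is a minimal sketch that pulls the failing filenames out of the checker output and groups them by creator. It assumes each row is comma-separated with the filename in the second field and the status in the third, which is inferred from the pasted output and not from the script itself. The `creators` mapping is a hypothetical excerpt copied by hand from the table in this ticket.

```python
import csv
import io
from collections import defaultdict


def failing_files(checker_output):
    """Return the filenames on rows whose third field is ERROR.

    Assumed row layout (inferred from the pasted output): tag,filename,status,location
    """
    rows = csv.reader(io.StringIO(checker_output))
    return [row[1] for row in rows if len(row) >= 3 and row[2] == "ERROR"]


def group_by_creator(files, creators):
    """Group filenames by creator; files missing from the mapping go under 'unassigned'."""
    grouped = defaultdict(list)
    for name in files:
        grouped[creators.get(name, "unassigned")].append(name)
    return dict(grouped)


# Two rows copied from the output above.
sample = (
    "NO-BCO,chicken_protein_glycohydrolase.csv,ERROR,in_fs\n"
    "NO-BCO,chicken_proteoform_citations_glycosylation_sites_oglcnac_atlas.csv,ERROR,in_fs\n"
)

# Hypothetical excerpt of the filename-to-creator table in this ticket.
creators = {
    "chicken_protein_glycohydrolase.csv": "Jingyue",
    "chicken_proteoform_citations_glycosylation_sites_oglcnac_atlas.csv": "Cyrus",
}

files = failing_files(sample)
print(group_by_creator(files, creators))
```

Rows with fewer than three fields, such as the shell prompt line, are skipped, so the whole terminal paste can be passed in unchanged.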
@rykahsay we need to fix a few BCOs (above), and I'm adding more details to my ML ready BCOs. Do you mind waiting until EOB Monday to create the BCO objects?
The last two BCOs that showed errors were mine.
Fixed chicken_proteoform_citations_glycosylation_sites_oglcnac_atlas.csv
Updating the IO domain of chicken_proteoform_glycosylation_sites_literature_mining_manually_verified.csv
(GLY_001036). Which of the following datasets should I include as the input?
Update: fixed
@CyrusAY you can actually leave the input domain blank (or as-is). Robel will override it programmatically. Can you verify that the output domain is correct? Looks like it's currently human instead of chicken
GLY_000979 | chicken_proteoform_glycosylation_sites_literature.csv should not be included in the chicken dataset. It is based on literature specific to human only.
Usability domain reads:
The dataset provides information on N-glycosylation sites on Chicken proteins. The data has been processed from the supplementary material from 2 publications (1. "Deeb, S. J., Cox, J., Schmidt-Supprian, M., & Mann, M. (2013). N-linked Glycosylation Enrichment for In-depth Cell Surface Proteomics of Diffuse Large B-cell Lymphoma Subtypes. Molecular & Cellular Proteomics, 13(1), 240-251. doi:10.1074/mcp.m113.033977" 2. "Boersema, P. J., Geiger, T., Winiewski, J. R., & Mann, M. (2012). Quantification of the N-glycosylated Secretome by Super-SILAC During Breast Cancer Progression and in Chicken Blood Samples. Molecular & Cellular Proteomics, 12(1), 158-171. doi:10.1074/mcp.m112.023614"). The listed proteins (UniProtKB) accessions are part of the GlyGen UniProtKB canonical list (https://data.glygen.org/GLYDS000001).
Both publications focused on cancer patients (the keyword 'human' was automatically changed to 'chicken' in the latter).
On data.glygen.org there is another dataset, for sarscov1, with an empty description domain. It might be worth checking:
GLY_000612 SARS-CoV1 Glycosylation Sites (Literature) sarscov1_proteoform_glycosylation_sites_literature.csv
$ python3 check-bco2filename-mapping.py | grep ERROR
NO-BCO,chicken_proteoform_glycosylation_sites_literature.csv,ERROR,in_fs
NO-BCO,rat_protein_citations_matrixdb.csv,ERROR,in_fs
NO-BCO,chicken_proteoform_glycosylation_sites_unicarbkb.csv,ERROR,in_fs
NO-BCO,rat_protein_matrixdb.csv,ERROR,in_fs
I have reviewed all of the new BCOs and created BCOs for rat_protein_matrixdb.csv and rat_protein_citations_matrixdb.csv. The remaining errors (chicken_proteoform_glycosylation_sites_literature.csv and chicken_proteoform_glycosylation_sites_unicarbkb.csv) occur because these files are still present in unreviewed/. However, the files are empty and have been removed from the dataset-masterlist.json file.
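A rough sketch of how leftovers like these could be spotted: list the files in a directory that are empty and no longer named in the masterlist. It assumes "empty" means zero bytes. The masterlist is passed in as a plain set of filenames because the layout of dataset-masterlist.json is not shown in this ticket. The function name and the demo directory are hypothetical.

```python
import tempfile
from pathlib import Path


def stale_empty_files(directory, masterlist_names):
    """Return sorted names of zero-byte files in `directory` that are absent from the masterlist."""
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and p.stat().st_size == 0 and p.name not in masterlist_names
    )


# Demo against a throwaway directory standing in for unreviewed/.
with tempfile.TemporaryDirectory() as tmp:
    # Empty file that is no longer in the masterlist: should be reported.
    (Path(tmp) / "chicken_proteoform_glycosylation_sites_unicarbkb.csv").touch()
    # Non-empty file that is in the masterlist: should be ignored.
    (Path(tmp) / "chicken_protein_masterlist.csv").write_text("placeholder\n")
    demo = stale_empty_files(tmp, {"chicken_protein_masterlist.csv"})

print(demo)
```

Files reported this way would be candidates for removal from unreviewed/, which should also clear the remaining checker errors.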