glygener / glygen-issues

Repository for public GlyGen tickets
GNU General Public License v3.0

Generate and use glycan and protein schema #267

Closed ReneRanzinger closed 7 months ago

ReneRanzinger commented 1 year ago

Based on the work in #265 and #266, generate the glycan and protein schemas. These schema files should be used in automated testing to validate the glycan and protein details.
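As a rough illustration of what such a validation step could look like, here is a minimal sketch that checks a detail record against required dotted paths. It assumes the three-column layout (path, required flag, description) that appears later in this thread, and it assumes `*` stands for "every item of a list". The function names are hypothetical and are not taken from the GlyGen code base.

```python
import csv
import io


def load_required_paths(csv_text):
    """Return the dotted paths whose second column is "yes"."""
    rows = csv.reader(io.StringIO(csv_text))
    return [row[0] for row in rows
            if len(row) >= 2 and row[1].strip().lower() == "yes"]


def missing_paths(record, path):
    """List concrete locations where the final key of `path` is absent.

    A required child is only enforced where its parent exists; whether the
    parent itself is required is decided by the parent's own CSV row.
    """
    missing = []

    def walk(node, parts, trail):
        if not parts:
            return
        head, rest = parts[0], parts[1:]
        if head == "*":
            # Assumption: "*" iterates over list items only.
            if isinstance(node, list):
                for index, item in enumerate(node):
                    walk(item, rest, trail + [str(index)])
            return
        if not isinstance(node, dict) or head not in node:
            if not rest:
                missing.append(".".join(trail + [head]))
            return
        walk(node[head], rest, trail + [head])

    walk(record, path.split("."), [])
    return missing
```

An automated test could then load a detail record, call `missing_paths` for every path returned by `load_required_paths`, and fail if any location is reported.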

Dependencies:

rykahsay commented 1 year ago

@jeet-vora @kmartinez834 have you finished tickets #265 and #266?

kmartinez834 commented 1 year ago

@rykahsay still working on it; these tickets were moved to the end of May.

jeet-vora commented 11 months ago

Robel, the schema files for the glycan and protein details are in /data/projects/glygen/generated/misc/ with the required properties and descriptions. You can process them.

schema_protein_detail.csv
schema_glycan_detail.csv

rykahsay commented 11 months ago

Please check the 2.1 branch on the API repo.

kmartinez834 commented 10 months ago

Several *url paths marked as required in the csv are not required in the glycan schema. Was this intentional?

Examples:

"species.*.evidence.*.url","yes","evidence database url"
"interactions.*.evidence.*.url","yes","evidence database url"
"dictionary.evidence.*.url","yes","evidence database url"
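A discrepancy like this could be caught automatically. The sketch below assumes the generated schema uses the standard JSON Schema keywords `properties`, `items`, and `required`; the thread does not show the actual layout of the files on the API repo, so treat that as an assumption. The helper names are hypothetical.

```python
def is_required_in_schema(schema, path):
    """True if the last key of a dotted path is in "required" at its level."""
    parts = path.split(".")
    node = schema
    for part in parts[:-1]:
        if part == "*":
            # "*" in the csv path corresponds to the array's item schema.
            node = node.get("items", {})
        else:
            node = node.get("properties", {}).get(part, {})
    return parts[-1] in node.get("required", [])


def required_mismatches(csv_rows, schema):
    """Return paths flagged "yes" in the csv but not required in the schema."""
    return [path for path, flag in csv_rows
            if flag.strip().lower() == "yes"
            and not is_required_in_schema(schema, path)]
```

Here `csv_rows` is a list of `(path, flag)` pairs read from the first two columns of the csv. An empty result means every path marked required in the csv is also required in the schema.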

kmartinez834 commented 10 months ago

"gene","yes","gene information for protein"
"isoforms","yes","list of protein isoforms"

"required": [
        "mass",
        "sequence",
        "species",
        "uniprot"
    ]

rykahsay commented 8 months ago

@jeet-vora @kmartinez834 --- please update the csv files and assign the ticket back to me when you are done

kmartinez834 commented 8 months ago

misc/schema_glycan_detail.csv is updated

@jeet-vora I can add biomarkers to the proteins schema if you update the permissions for misc/schema_protein_detail.csv

jeet-vora commented 8 months ago

@kmartinez834 Can you check now? Thanks.

kmartinez834 commented 8 months ago

Added the following lines to schema_protein_detail.csv. Please check.

"biomarkers","no","biomarkers associated with the glycan"
"biomarkers.*.assessed_biomarker_entity","yes","name of the biomarker entity"
"biomarkers.*.biomarker_id","yes","biomarker id"
"biomarkers.*.evidence","yes","source information"
"biomarkers.*.evidence.*.database","yes","evidence database name"
"biomarkers.*.evidence.*.id","yes","evidence database id"
"biomarkers.*.evidence.*.url","yes","evidence database url"
"biomarkers.*.instances","yes","biomarker instances"
"biomarkers.*.instances.*.best_biomarker_type","yes","biomarker category according to fda best glossary"
"biomarkers.*.instances.*.disease","yes","list of diseases associated with the biomarker"
"biomarkers.*.instances.*.disease.disease_id","yes","identifier for disease from disease specific database"
"biomarkers.*.instances.*.disease.recommended_name","yes","recommended name for the disease"
"biomarkers.*.instances.*.disease.recommended_name.description","no","definition or description of the disease"
"biomarkers.*.instances.*.disease.recommended_name.id","yes","identifier for the disease"
"biomarkers.*.instances.*.disease.recommended_name.name","yes","recommended name for the disease"
"biomarkers.*.instances.*.disease.recommended_name.resource","yes","resource for the recommended disease name"
"biomarkers.*.instances.*.disease.recommended_name.url","yes","url for the disease identifier in the disease database"
"biomarkers.*.instances.*.disease.synonyms","no","synonyms for the disease"
"biomarkers.*.instances.*.disease.synonyms.*.description","no","definition or description of the disease"
"biomarkers.*.instances.*.disease.synonyms.*.id","yes","identifier for the disease"
"biomarkers.*.instances.*.disease.synonyms.*.name","yes","synonyms name for the disease"
"biomarkers.*.instances.*.disease.synonyms.*.resource","yes","resource for the synonym disease name"
"biomarkers.*.instances.*.disease.synonyms.*.url","yes","url for the disease identifier in the disease database"
"biomarkers.*.instances.*.evidence","yes","source information"
"biomarkers.*.instances.*.evidence.*.database","yes","evidence database name"
"biomarkers.*.instances.*.evidence.*.id","yes","evidence database id"
"biomarkers.*.instances.*.evidence.*.url","yes","evidence database url"
"biomarkers.*.instances.*.status","yes","describes the change in the biomarker entity"
"biomarkers.*.instances.*.tissue","yes","tissue where the biomarker entity is measured"
"biomarkers.*.instances.*.tissue.id","yes","tissue ontology id"
"biomarkers.*.instances.*.tissue.name","yes","tissue name"
"biomarkers.*.instances.*.tissue.namespace","yes","tissue ontology namespace"
"biomarkers.*.instances.*.tissue.url","yes","tissue url"

"glycosylation.*.site_lbl","no","glycosylation site position and amino acid"

"phosphorylation.*.site_lbl","yes","phosphorylation site position and amino acid"

"snv.*.site_lbl","yes","snv site position and amino acid"

"section_stats","yes","section record counts"
"section_stats.*.sort_fields","yes","field order for front end display"
"section_stats.*.table_id","yes","table name"
"section_stats.*.table_stats","yes","table fields and counts"
"section_stats.*.table_stats.*.count","yes","counts for table summary"
"section_stats.*.table_stats.*.field","yes","field name"

jeet-vora commented 8 months ago

Looks good.

kmartinez834 commented 8 months ago

@rykahsay --> The schema files for glycan and protein details are updated. You can reprocess.

misc/schema_protein_detail.csv
misc/schema_glycan_detail.csv
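For context, reprocessing here means turning the csv rows into schema files. The thread does not show how that is actually done; the following is only a sketch of one possible conversion, assuming each row is (path, required flag, description), that `*` marks an array level, and that the output uses standard JSON Schema keywords. Value types for leaf fields are not in the csv, so the sketch leaves them out.

```python
def child(node, part):
    """Return the sub-schema for `part`, creating it if needed."""
    if part == "*":
        # A "*" segment means the current node is an array of items.
        node["type"] = "array"
        return node.setdefault("items", {})
    # A named segment means the current node is an object.
    node["type"] = "object"
    return node.setdefault("properties", {}).setdefault(part, {})


def schema_from_rows(rows):
    """Build a nested schema dict from (path, flag, description) rows."""
    root = {}
    for path, flag, description in rows:
        parts = path.split(".")
        parent = root
        for part in parts[:-1]:
            parent = child(parent, part)
        leaf = child(parent, parts[-1])
        leaf["description"] = description
        if flag.strip().lower() == "yes" and parts[-1] != "*":
            required = parent.setdefault("required", [])
            if parts[-1] not in required:
                required.append(parts[-1])
    return root
```

The resulting dict can be written out with `json.dump`. A path that is used both with `*` and with named children would overwrite the node type, so a real converter would need to decide how to handle that case.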

rykahsay commented 8 months ago

Done, please check

kmartinez834 commented 7 months ago

👍