baimingze closed this issue 9 years ago
These words have no annotation info:
['study', 'sequencing', 'performed', 'mass', 'ms', 'whole', 'identified', 'identify', 'project', 'part', 'genomic', 'including', 'individuals', 'http', 'based', 'cases', 'cohort', 'lc']
{'mergedWords': ['samples', 'sample'], 'label': 'sample', 'frequent': '1320', 'No': '0'}
{'mergedWords': ['analysis'], 'label': 'analysis', 'frequent': '950', 'No': '1'}
{'mergedWords': ['genome'], 'label': 'genome', 'frequent': '828', 'No': '2'}
{'mergedWords': ['cell'], 'label': 'cell', 'frequent': '1004', 'No': '3'}
{'mergedWords': ['human'], 'label': 'human', 'frequent': '642', 'No': '4'}
{'mergedWords': ['cancer'], 'label': 'cancer', 'frequent': '545', 'No': '5'}
{'mergedWords': ['high'], 'label': 'high', 'frequent': '497', 'No': '6'}
{'mergedWords': ['exome'], 'label': 'exome', 'frequent': '455', 'No': '7'}
{'mergedWords': ['dna'], 'label': 'dna', 'frequent': '446', 'No': '8'}
{'mergedWords': ['patients'], 'label': 'patients', 'frequent': '427', 'No': '9'}
{'mergedWords': ['disease'], 'label': 'disease', 'frequent': '414', 'No': '10'}
{'mergedWords': ['genes', 'gene', 'genetic'], 'label': 'gene', 'frequent': '1119', 'No': '11'}
{'mergedWords': ['used'], 'label': 'used', 'frequent': '373', 'No': '13'}
{'mergedWords': ['mutations'], 'label': 'mutations', 'frequent': '364', 'No': '14'}
{'mergedWords': ['proteome'], 'label': 'proteome', 'frequent': '336', 'No': '15'}
{'mergedWords': ['wide'], 'label': 'wide', 'frequent': '320', 'No': '16'}
{'mergedWords': ['spectrometry'], 'label': 'spectrometry', 'frequent': '287', 'No': '17'}
{'mergedWords': ['associated'], 'label': 'associated', 'frequent': '269', 'No': '18'}
@baimingze this work is fantastic, my friend, well done. How are you planning to store this information? As far as I can see, we have two options:
My opinion is that this information should be in the XML for two main reasons:
We will not need to change the frequent-words part of the web services, because the words will be in the XML files.
Apart from the MS and MESH ontologies, we should explore other important ontologies such as:
- Experimental Factor Ontology - EFO
- BRENDA Tissue Ontology - BTO
- Gene Ontology - GO
@baimingze we can discuss these issues on Thursday.
"self": "http://data.bioontology.org/ontologies/BTO/classes/http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBTO_0000759",
algorithm
1. Get the annotation info of a word.
2. Get the first matched_word (e.g. "MODIFICATION") from the annotation info, excluding matches that do not start at the first character of the word (such as "FICATION").
3. Process the URLs that come from bioontology.org and whose match is the same as the matched_word from step 2, to get the details (accession, ontology).
4. Collect the synonyms from these URLs.
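
The steps above could be sketched roughly as follows. This is only an illustration, not the actual script: the annotation records (`text`, `start`, `self`), the `fetch_class` callback and its `synonym` field are simplified assumptions, and the MESH accession and synonym in the sample are made-up placeholders. Step 1 (calling the annotator) is left to the caller so the sketch runs offline.

```python
from urllib.parse import unquote, urlparse


def parse_class_url(self_url):
    """Split a bioontology.org class URL into (ontology, accession)."""
    # Path layout: /ontologies/<ONTOLOGY>/classes/<percent-encoded class IRI>
    parts = urlparse(self_url).path.split("/")
    ontology = parts[2]
    class_iri = unquote(parts[4])
    return ontology, class_iri.rsplit("/", 1)[-1]


def collect_synonyms(annotations, fetch_class, ontologies=("MESH", "MS")):
    """Steps 2-4 for the annotations of one word (hypothetical record layout)."""
    # Step 2: first match that starts at the first character of the word.
    first = next((a for a in annotations if a["start"] == 0), None)
    if first is None:
        return None  # no usable annotation info for this word
    matched_word = first["text"]

    classes = []
    for a in annotations:
        url = a["self"]
        # Step 3: keep bioontology.org URLs whose match equals matched_word.
        if "bioontology.org" not in urlparse(url).netloc:
            continue
        if a["text"] != matched_word:
            continue
        ontology, accession = parse_class_url(url)
        if ontology not in ontologies:
            continue
        # Step 4: collect the synonyms from the class behind this URL.
        synonyms = fetch_class(url).get("synonym", [])
        classes.append(
            {"ontology": ontology, "accession": accession, "synonyms": synonyms}
        )
    return {"matched_word": matched_word, "classes": classes}


# Made-up sample data; D000000 is a placeholder accession, not a real MESH term.
MESH_URL = (
    "http://data.bioontology.org/ontologies/MESH/classes/"
    "http%3A%2F%2Fpurl.bioontology.org%2Fontology%2FMESH%2FD000000"
)
BTO_URL = (
    "http://data.bioontology.org/ontologies/BTO/classes/"
    "http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBTO_0000759"
)
sample = [
    {"text": "MODIFICATION", "start": 0, "self": MESH_URL},
    {"text": "FICATION", "start": 4, "self": MESH_URL},  # not from first char
    {"text": "MODIFICATION", "start": 0, "self": BTO_URL},  # ontology not selected
]
fake_classes = {MESH_URL: {"synonym": ["alteration"]}}

result = collect_synonyms(sample, lambda url: fake_classes.get(url, {}))
print(result)
```

Passing `fetch_class` in as a function keeps the HTTP call (and the API key it would need) out of the matching logic, so other ontologies such as BTO can be tried just by changing `ontologies`.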
Results (only with two ontologies: MESH and MS)
Synonyms are in the square brackets.