GOntoSim

Source code and datasets of the paper 'GOntoSim: A Semantic Similarity Measure based on LCA and Common Descendants'.

Web server is available at http://www.cbrlab.org/GOntoSim.html.

Dependencies

pip install goatools

Data

The annotations used in the experiments have already been downloaded and filtered. They are saved as named tuples, which is the format the code requires.
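As a rough illustration of this storage format, an annotation record kept as a named tuple might look like the sketch below. The type name `ProteinAnnotation`, its fields, and the protein identifier are hypothetical and chosen only for illustration; the tuple definitions actually used in the repository may differ.

```python
from collections import namedtuple

# Hypothetical record layout: the type and field names are assumptions,
# not the definitions used in the repository.
ProteinAnnotation = namedtuple("ProteinAnnotation", ["protein_id", "go_terms"])

# Placeholder protein identifier, for illustration only.
record = ProteinAnnotation(
    protein_id="P00001",
    go_terms=("GO:0003824", "GO:0016787"),
)

# Fields can be read by name or by position, and the record is immutable.
print(record.protein_id)
print(len(record.go_terms))
```

Named tuples keep the records lightweight while still letting the code refer to fields by name instead of by index.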

The Gene Ontology File (go-basic.obo) used in the experiments is also provided in the same folder.

The association files are for the implementation of Resnik's and Lin's measures.
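Resnik's and Lin's measures are both based on information content, which is estimated from how often each term appears in an annotation corpus; this is presumably why these two measures need association files. The sketch below applies the standard textbook definitions to a toy ontology. The term names and counts are invented for illustration, and this is not the repository's implementation.

```python
import math

# Toy ontology: each term maps to the set of its ancestors, itself included.
ANCESTORS = {
    "root": {"root"},
    "A": {"A", "root"},
    "B": {"B", "A", "root"},
    "C": {"C", "A", "root"},
}

# Invented counts: annotations to the term or to any of its descendants.
COUNTS = {"root": 100, "A": 50, "B": 10, "C": 5}


def information_content(term):
    """IC(t) = -log p(t), where p(t) is the term's relative annotation frequency."""
    return -math.log(COUNTS[term] / COUNTS["root"])


def resnik(t1, t2):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ANCESTORS[t1] & ANCESTORS[t2]
    return max(information_content(t) for t in common)


def lin(t1, t2):
    """Lin similarity: Resnik similarity normalised by the ICs of both terms."""
    denom = information_content(t1) + information_content(t2)
    return 2 * resnik(t1, t2) / denom if denom > 0 else 0.0


print(resnik("B", "C"))
print(lin("B", "C"))
```

In this toy example the most informative common ancestor of `B` and `C` is `A`, so the Resnik score is the information content of `A`.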

iPython Notebooks

To replicate the complete set of Experiments

Any one of the following measures can be used to calculate the similarity: 'gontosim', 'baseline', 'lca', 'baselineDesc', 'wang', 'gogo'. The 'resnik' and 'lin' measures can be calculated for MF only.

The required arguments, in order, are the similarity measure, the GO aspect (MF or BP), the evidence code (IEA or NONIEA), and the number of samples:

python GOntoSim.py measure GO_Aspect Evidence_Code Number_of_Samples
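The four positional arguments and their allowed values can be summarised in a small `argparse` sketch. This is only an illustration of the interface described above and makes no claim about how GOntoSim.py itself parses its arguments; the function name `parse_args` and the argument names are assumptions.

```python
import argparse

# Measure names as listed above.
MEASURES = ["gontosim", "baseline", "lca", "baselineDesc", "wang", "gogo", "resnik", "lin"]


def parse_args(argv):
    """Validate the four positional arguments (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="GOntoSim.py")
    parser.add_argument("measure", choices=MEASURES)
    parser.add_argument("go_aspect", choices=["MF", "BP"])
    parser.add_argument("evidence_code", choices=["IEA", "NONIEA"])
    parser.add_argument("number_of_samples", type=int)
    args = parser.parse_args(argv)
    # 'resnik' and 'lin' are available for Molecular Function only.
    if args.measure in ("resnik", "lin") and args.go_aspect != "MF":
        parser.error("'resnik' and 'lin' can be calculated for MF only")
    return args


args = parse_args(["gontosim", "MF", "NONIEA", "150"])
print(args.measure, args.go_aspect, args.evidence_code, args.number_of_samples)
```

With this sketch, an invalid combination such as `resnik BP NONIEA 150` exits with an error message instead of starting a run.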

Run GOntoSim.py with the following commands:

Experiment 1:

This experiment uses the Molecular Function GO term annotations (NONIEA) for the Enzymes.

python GOntoSim.py gontosim MF NONIEA 150

Experiment 2:

This experiment uses the Biological Process GO term annotations (NONIEA) for the Enzymes.

python GOntoSim.py gontosim BP NONIEA 150

Experiment 3:

This experiment uses the Molecular Function GO term annotations (IEA) for the Enzymes.

python GOntoSim.py gontosim MF IEA 500
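To run the three experiments in sequence, a small driver script along the following lines could be used. It is a sketch that assumes it is started from the folder containing GOntoSim.py; the helper names `build_command` and `run_all` are hypothetical and not part of the repository.

```python
import subprocess
import sys

# (measure, GO aspect, evidence code, number of samples) for Experiments 1 to 3.
EXPERIMENTS = [
    ("gontosim", "MF", "NONIEA", 150),
    ("gontosim", "BP", "NONIEA", 150),
    ("gontosim", "MF", "IEA", 500),
]


def build_command(measure, aspect, evidence, samples, script="GOntoSim.py"):
    """Build the argument list for one experiment (hypothetical helper)."""
    return [sys.executable, script, measure, aspect, evidence, str(samples)]


def run_all():
    """Run every experiment in order, stopping at the first failure."""
    for experiment in EXPERIMENTS:
        subprocess.run(build_command(*experiment), check=True)


# Print the commands without executing them; call run_all() to execute.
for experiment in EXPERIMENTS:
    print(" ".join(build_command(*experiment)[1:]))
```

Using `sys.executable` keeps the runs on the same Python interpreter, and therefore the same installed dependencies, as the driver script.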

Reference

If you use GOntoSim, we would appreciate a citation:

Kamran, A.B., Naveed, H. GOntoSim: a semantic similarity measure based on LCA and common descendants. Sci Rep 12, 3818 (2022). https://doi.org/10.1038/s41598-022-07624-3

Contact

Contact hammad.naveed@nu.edu.pk or amna.kamran@nu.edu.pk

Feel free to contact us if anything is unclear.