Right now, we calculate our similarity scores on the full Gene Ontology, but some researchers might have a focus on a specific subset. For these users, it might be useful to filter the Gene Ontology and only run on the subset.
There must be a user-friendly way to select a subset. An explicit list of terms to include is the minimum, but we probably also want to support selections such as "these terms and all their child terms".
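As a rough illustration of the "these terms and all child terms" selection, here is a minimal sketch. It assumes the ontology is available as a plain parent-to-children mapping; the function name `expand_with_descendants`, the mapping layout, and the `T:` identifiers are all hypothetical and not taken from our code base or from the real Gene Ontology.

```python
from collections import deque


def expand_with_descendants(selected, children):
    """Return the selected terms plus every term reachable through child links.

    `children` is an assumed mapping: term id -> list of direct child term ids.
    """
    subset = set(selected)
    queue = deque(selected)
    while queue:
        term = queue.popleft()
        for child in children.get(term, ()):
            # A term can have several parents, so skip terms we already collected.
            if child not in subset:
                subset.add(child)
                queue.append(child)
    return subset


# Toy ontology with made-up identifiers (not real GO terms).
toy_children = {
    "T:root": ["T:a", "T:b"],
    "T:a": ["T:a1", "T:a2"],
    "T:b": ["T:b1"],
    "T:a1": ["T:shared"],
    "T:b1": ["T:shared"],
}

subset = expand_with_descendants(["T:a"], toy_children)
```

With this toy data, selecting `T:a` pulls in `T:a1`, `T:a2` and `T:shared`, but not `T:b1`, even though `T:b1` is also a parent of `T:shared`. Whether that is the behaviour users expect is one of the things to check with them.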
At first sight, we could apply the GO term filter only in the similarity calculation phase. However, this might give inaccurate results, because the excluded terms would still be taken into account when calculating the information content. For optimal results, the filtering should also be applied when generating our initial values from Swiss-Prot. We would need to test both approaches to see whether the second one is actually needed.
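To make the concern concrete, the sketch below compares the two options on toy data. It assumes a deliberately simple, frequency-based information content (the negative log of a term's share of all annotations, without propagating annotations to ancestor terms), which may differ from what our pipeline actually computes. The function `information_content`, the protein and term identifiers, and the annotation layout are all hypothetical.

```python
import math
from collections import Counter


def information_content(annotations, allowed=None):
    """IC(t) = -log(count(t) / total), optionally counting only allowed terms.

    `annotations` is an assumed mapping: protein id -> list of annotated term ids.
    """
    counts = Counter(
        term
        for terms in annotations.values()
        for term in terms
        if allowed is None or term in allowed
    )
    total = sum(counts.values())
    return {term: -math.log(n / total) for term, n in counts.items()}


# Toy annotation set with made-up identifiers (not real Swiss-Prot data).
toy_annotations = {
    "P1": ["T:a1", "T:b1"],
    "P2": ["T:a1"],
    "P3": ["T:a2", "T:b1"],
    "P4": ["T:b1"],
}
toy_subset = {"T:a1", "T:a2"}

# Option 1: compute IC on everything, then keep only the subset's terms.
late = {
    term: ic
    for term, ic in information_content(toy_annotations).items()
    if term in toy_subset
}

# Option 2: filter the annotations first, then compute IC on what is left.
early = information_content(toy_annotations, allowed=toy_subset)
```

In this toy case the same term gets a different value under each option (`late["T:a1"]` is `log(3)`, `early["T:a1"]` is `log(1.5)`), because the annotations of the excluded term still count towards the total in option 1. Whether the difference is large enough to change similarity rankings on real data is what the comparison test should tell us.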
We would need to talk to some users who have this problem to get a better idea about their needs.