Open tskir opened 4 years ago
In addition to investigating, it may help to document how ZOOMA is used
Hi, this is not a bug, it is just a side effect of the way that the ZOOMA queries have been set up at this point. Right now, we are skipping the "ClinVar" data source in ZOOMA queries, well, because this is the data source we are curating. That is why issue #74 needs to be resolved before going into production.
Here are the results in ZOOMA when skipping the ClinVar data source:
That makes sense. It is interesting to see that when ZOOMA uses the curated data sources it also retrieves partial matches. Just for comparison, the mapping algorithm we use in our OnToma tool first checks the curated mappings for exact matches only and, if none is found, uses ZOOMA to search in specific ontologies, so we would never get all those mappings for "Spastic paraplegia 48, autosomal recessive". Anyway, this is a manual annotation tool and the approach is fine as long as it is documented.
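To make the comparison concrete, here is a minimal sketch of the "exact curated match first, ontology search second" order described above. It is not OnToma's actual code: `normalise`, `map_trait`, `stub_search` and the `CURATED` table are hypothetical names invented for illustration, the case-insensitive comparison is an assumption, and the fallback is a stub standing in for a real ZOOMA lookup.

```python
from typing import Callable, Dict, List


def normalise(name: str) -> str:
    """Lower-case and collapse whitespace (assumed notion of an 'exact' match)."""
    return " ".join(name.lower().split())


def map_trait(
    trait_name: str,
    curated: Dict[str, str],
    fallback_search: Callable[[str], List[str]],
) -> List[str]:
    """Return the curated term on an exact match, else defer to the fallback."""
    curated_index = {normalise(k): v for k, v in curated.items()}
    key = normalise(trait_name)
    if key in curated_index:
        # Exact curated hit: return it alone, with no partial matches mixed in.
        return [curated_index[key]]
    # No curated hit: hand over to the search (ZOOMA in the real tool).
    return fallback_search(trait_name)


def stub_search(trait_name: str) -> List[str]:
    """Hypothetical stand-in for a ZOOMA query; returns a placeholder term."""
    return ["EXAMPLE_0000001"]


# Toy curated table; the single entry reuses the mapping mentioned in this issue.
CURATED = {"Spastic paraplegia 48, autosomal recessive": "Orphanet_306511"}

print(map_trait("Spastic paraplegia 48, autosomal recessive", CURATED, stub_search))
print(map_trait("Spastic paraplegia", CURATED, stub_search))
```

With this ordering a trait that has a curated exact match yields a single suggestion, while anything else goes to the search step, which is why such a pipeline would not produce a long list of partial matches for a trait that is already curated.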
Thank you both!
@joj0s Thanks a lot for clarifying why this happens!
@AsierGonzalez Yes, while I don't know the exact algorithm used by ZOOMA, it definitely does much more than just exact matches. Also, thank you for pointing out the OnToma tool; I don't think I had heard of it before. Perhaps we could eventually look into ways to integrate your tool and ours into the same platform.
That would be great. Feel free to have a look at OnToma, or I can give you an overview. Honestly, you can do everything it does by calling the ZOOMA and OLS APIs directly, but it's very convenient for us. Its development had been frozen until a few months ago, when I made some updates to it, so if you have any suggestions I am happy to look into them.
Reported by @AsierGonzalez via Slack
I have been exploring the mappings for Spastic paraplegia 48, autosomal recessive in http://trait-curation-1.duckdns.org/traits/204/
The tool has 16 suggestions, which is a lot compared with the single mapping that the ZOOMA web tool returns (Orphanet_306511, via EVA) and the 5 returned by the ZOOMA API. For the latter, I tried to reverse-engineer the query used by the tool, but I may have got it wrong.