Closed f-heeger closed 8 years ago
All mappings go through the open tree taxonomy. You can already map from OTT to source taxonomies via the API. So what's needed here is a TNRS-like service that maps from a source id to an OTT id.
One possibility to investigate would be to outsource this problem to Global Names...
So what's needed here is a TNRS-like service that maps from a source id to an OTT id.
This is what I'm trying to implement. The information is all there, but the API to do the mapping in that direction isn't, and that's what the implementation question is about.
I'm not sure how we would outsource to Global Names. Would we just ship our OTT-to-source mappings to them?
I don't know how the GN idea would work, either - it would require talking to them. I've been talking to them anyhow; I'll look into it.
Another possibility would be to put a SQL table in the OTI replacement we've been talking about.
But going through taxomachine would be more convenient and maybe more reliable, so the question remains. Do we have any way to estimate the amount of space needed by the new index(es)?
@pmidford you said "This is what I'm trying to implement." - did you make any progress on this, or even just formulate a plan?
@jar398 Let me check this - I remember it didn't seem that difficult, so maybe I misunderstood/missed something.
It's probably not hard, but if you know enough to know it's not hard, you know more than I do! For example, it seems like you'd need to create some kind of index to map the ids. But I don't know how to create indexes in neo4j and wouldn't know where to start.
Yes, I had to learn a fair amount of neo4j to do anything with taxomachine, and I think I had tried setting up an index, but this seemed a lower priority than cleaning up taxomachine tests, so I switched to that.
I took a quick look at the index setup code, and it would be straightforward to add an index on qualified ids, where a qualified id is a string containing a colon, e.g. 'ncbi:1234'. One would just follow the pattern used for OTT ids.
This could be exposed in the API through an additional argument to the taxon_info API call, say, 'qualified_id', mutually exclusive with 'ott_id'. The OTT id would be returned in the result (which needs to happen anyhow for #136).
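To make the proposal concrete, here is a rough Python sketch of the lookup logic, independent of neo4j. It is not taxomachine code: the function names, the in-memory dict standing in for the index, and the sample ids are all made up for illustration, and it assumes each qualified id belongs to exactly one OTT taxon.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def build_qualified_id_index(
    taxa: Iterable[Tuple[int, List[str]]]
) -> Dict[str, int]:
    """Invert (ott_id, [qualified ids]) records into qualified id -> ott_id.

    Hypothetical stand-in for the proposed index; assumes a qualified id
    maps to a single OTT id and raises if that assumption is violated.
    """
    index: Dict[str, int] = {}
    for ott_id, qualified_ids in taxa:
        for qid in qualified_ids:
            prefix, sep, local_id = qid.partition(":")
            if not (sep and prefix and local_id):
                raise ValueError(f"not a qualified id: {qid!r}")
            if index.get(qid, ott_id) != ott_id:
                raise ValueError(f"{qid!r} maps to more than one OTT id")
            index[qid] = ott_id
    return index


def resolve_ott_id(
    index: Dict[str, int],
    ott_id: Optional[int] = None,
    qualified_id: Optional[str] = None,
) -> Optional[int]:
    """Mimic the proposed argument handling: exactly one id may be given."""
    if (ott_id is None) == (qualified_id is None):
        raise ValueError("supply exactly one of 'ott_id' or 'qualified_id'")
    if ott_id is not None:
        return ott_id
    # Unknown qualified ids resolve to None rather than raising.
    return index.get(qualified_id)


# Made-up sample records, not real OTT data.
sample_taxa = [(111, ["ncbi:1234", "if:5678"]), (222, ["ncbi:42"])]
sample_index = build_qualified_id_index(sample_taxa)
```

With the sample data, `resolve_ott_id(sample_index, qualified_id="ncbi:1234")` gives `111`, and passing both arguments (or neither) raises a `ValueError`, which is the mutual exclusion described above.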
It would be really helpful to have an API call to map between IDs from different source taxonomies. For example: For Index Fungorum ID X, what is the corresponding NCBI taxonomy ID?
This feature request was discussed on the mailing list here (just for reference).
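For reference, the mapping asked for in the original request can be sketched as two hops through OTT: source id to OTT taxon, then OTT taxon to the ids it carries for the target source. The snippet below is only an illustration under assumptions: the function name, the `if` and `ncbi` prefixes, and the sample ids are hypothetical, and the linear scan stands in for whatever index the service would actually use.

```python
from typing import Dict, List


def map_between_sources(
    ott_to_sources: Dict[int, List[str]],
    qualified_id: str,
    target_prefix: str,
) -> List[str]:
    """Return the target-source ids that share an OTT taxon with qualified_id.

    ott_to_sources is a hypothetical OTT id -> [qualified ids] mapping.
    An empty list means the id is unknown or has no counterpart.
    """
    for ott_id, source_ids in ott_to_sources.items():
        if qualified_id in source_ids:
            # Second hop: keep only ids from the requested source taxonomy.
            return [
                sid for sid in source_ids
                if sid.split(":", 1)[0] == target_prefix
            ]
    return []


# Made-up sample data, not real OTT content.
sample_mapping = {111: ["ncbi:1234", "if:5678"], 222: ["ncbi:42"]}
ncbi_ids = map_between_sources(sample_mapping, "if:5678", "ncbi")
```

A list is returned rather than a single id because nothing in this thread guarantees that an OTT taxon carries at most one id per source.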