Closed AlexanderPico closed 6 years ago
There isn't documentation on the xrefsBatch command yet, so it's not clear. As I read it, you can mix source databases, but you need a second column to identify the source of each ID (which I don't think we have).
I guess we could guess the source on a line-by-line basis. For many databases the ID pattern is specific enough, but the HGNC pattern matches everything, so it would have to be the lowest-priority guess.
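To make the line-by-line guessing idea concrete, here is a minimal sketch. The regular expressions and the `guess_datasource` helper are illustrative assumptions, not BridgeDb's actual pattern registry; the only point is that patterns are tried in priority order, with the catch-all HGNC-style pattern last.

```python
import re

# Hypothetical, simplified patterns for illustration only (not taken from BridgeDb).
# List order is priority order: the most specific patterns come first.
PATTERNS = [
    ("Ensembl", re.compile(r"ENS[A-Z]*G\d{11}")),
    ("Entrez Gene", re.compile(r"\d+")),
    # Catch-all symbol pattern: matches nearly anything, so it is tried last.
    ("HGNC", re.compile(r"[A-Za-z0-9_-]+")),
]


def guess_datasource(identifier):
    """Return the first datasource whose pattern matches the whole ID, or None."""
    identifier = identifier.strip()
    for name, pattern in PATTERNS:
        if pattern.fullmatch(identifier):
            return name
    return None


guesses = [guess_datasource(x) for x in ["ENSG00000139618", "672", "BRCA1", ""]]
```

An ID such as `672` also matches the catch-all pattern, which is why the ordering matters: the more specific match wins and HGNC is only used as a fallback.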
There is a batch method for mixed queries: http://www.bridgedb.org/swagger/#/ (see POST /{organism}/xrefsBatch)
It requires knowing the datasource per ID, which for this use case we do!
Maybe if the user specifies "Mixed", they could be prompted to specify another column containing the datasources? For example, open any WikiPathways pathway in Cytoscape and you'll see the XrefDatasource column providing the datasource for each row of XrefId. Then it's just a matter of constructing the POST body with the info from these two columns, and a single column of the requested ID type should be returned.
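A rough sketch of building that POST body from the two columns follows. It assumes the endpoint accepts one `identifier<TAB>datasource` pair per line; that body format, and the `build_xrefs_batch_body` helper name, are assumptions to check against the swagger page rather than confirmed behavior.

```python
def build_xrefs_batch_body(rows):
    """Join (XrefId, XrefDatasource) pairs into a newline-separated text body.

    Assumed format: one "identifier<TAB>datasource" pair per line.
    Rows missing either value are skipped, since the source cannot be inferred.
    """
    lines = []
    for xref_id, datasource in rows:
        xref_id, datasource = xref_id.strip(), datasource.strip()
        if not xref_id or not datasource:
            continue
        lines.append(f"{xref_id}\t{datasource}")
    return "\n".join(lines)


# Example rows as they might appear in the XrefId / XrefDatasource columns.
rows = [
    ("ENSG00000139618", "Ensembl"),
    ("672", "Entrez Gene"),
    ("", "Ensembl"),  # skipped: no identifier
]
body = build_xrefs_batch_body(rows)
```

The resulting `body` string would then be sent as the payload of the POST to `/{organism}/xrefsBatch`; the request itself is left out here because the expected headers and response format are not described in this thread.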
Decided this was a use case specific to WikiPathways. Not going to implement it here.
The next feature I'd like to see (and would use on a weekly basis) would be support for a "mixed" or "unspecified" Map from datasource. That way I could map from a column with more than one ID type to a new column with a single type (i.e., ID unification). I could also map from a column of unknown type to a new type.
In terms of implementation, can we add "unspecified" and then send in a batch query without specifying the source type? Nuno does this in Goliath, so there must be a way...