Closed IgorRodchenkov closed 6 years ago
A PC search query, for entity type hits, returns an array of pathway URIs in the "pathway" field (JSON or XML). Currently, we cannot "search" by URI (or part of it) even if we know one, because the Lucene index field "uri" is a StoredField (not indexed). So, here is an idea:
If we replace StoredField with StringField on this line, and re-index the entire BioPAX model, then queries like
http://www.pathwaycommons.org/pc2/search?q=uri:"http://pathwaycommons.org/pc2/Protein_0d4308790e68d98cdb1ce80c706e2e0e"
will work and return at most one hit with all its pathway URIs (one could also submit a list of URIs as q=uri:"A" uri:"B" uri:"C").
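To make the query shape concrete, here is a minimal Python sketch of how a client might build such a request. The helper name `build_uri_search_url` is hypothetical (not part of any PC client), and the sketch assumes the "uri" field has already been made searchable as proposed above. It only builds the URL; it does not call the server.

```python
from urllib.parse import urlencode

# Base URL taken from the example query above.
PC2_SEARCH = "http://www.pathwaycommons.org/pc2/search"


def build_uri_search_url(uris, base=PC2_SEARCH):
    """Build a /search URL that looks up one or more objects by URI.

    Hypothetical helper: each URI becomes a quoted uri:"..." clause,
    and the clauses are joined with spaces, as in q=uri:"A" uri:"B".
    """
    clauses = ['uri:"{}"'.format(u) for u in uris]
    # urlencode percent-encodes the quotes and colons and turns spaces into '+'.
    return base + "?" + urlencode({"q": " ".join(clauses)})


if __name__ == "__main__":
    print(build_uri_search_url(
        ["http://pathwaycommons.org/pc2/Protein_0d4308790e68d98cdb1ce80c706e2e0e"]
    ))
```

Encoding the whole q value in one step avoids hand-escaping the double quotes and the "://" inside each URI.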
PS: However, traverse?uri=entityUri&uri=entityUri2... seems much easier to implement and does not require re-indexing.
@jvwong @d2fong @gbader
Alright, FYI: @d2fong, @ozgunbabur, @jvwong
The PC /search command (beta.pathwaycommons.org/pc2/ - PCv10 server) now understands queries like:
search?q=uri:"URI_of_biopax_object" - gets you at most one exact hit (so one can then extract names, parent pathway URIs, etc., from the result fields);
search?q=pathway:"a_Pathway_URI" - gets all the child objects of the given (by URI) parent pathway;
(search?q=pathway:name_expr was working even before the latest modifications, but it is fuzzy and somewhat confusing to use. E.g., http://beta.pathwaycommons.org/pc2/search?q=pathway:*insulin*&type=control finds all Control type interactions that belong to any pathway (or sub-pathway of such) whose name contains "insulin" - WOW, but...
Of course, it's still possible to submit boolean queries like ?q=pathway:"URI1" AND pathway:"URI2" (find something that belongs to both pathways) or ?q=uri:"URI1" OR uri:"URI2" (find either or both of the two things by known URI), etc. - go ahead and experiment...)
URI or ID query values in the uri: and pathway: fields are normally case-sensitive, but names are not.
You can also use only the "ID" (suffix) part of the URI(s) with these search fields, e.g., "R-HSA-201451" or "Protein_0d4308790e68d98cdb1ce80c706e2e0e" (just an example; it might not be in the PC10 db), etc. Double quotes are also important for these queries, because it does not make much sense to use tricky fuzzy search in the uri: and pathway: fields, and quotation cancels the special meaning of some symbols for the Lucene query parser.
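The quoting and boolean rules above can be sketched in a few lines of Python. All three helper names are hypothetical, and `id_suffix` assumes the "ID" part is simply whatever follows the last "/" in the URI, which may not hold for every data source. Backslash-escaping embedded quotes follows the usual Lucene query syntax convention.

```python
def quote_term(value):
    """Wrap a value in double quotes, escaping embedded backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def field_query(field, values, operator="OR"):
    """Join field:"value" clauses with a boolean operator (AND or OR)."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    return (" " + operator + " ").join(field + ":" + quote_term(v) for v in values)


def id_suffix(uri):
    """Return the last path segment of a URI (assumed here to be the ID part)."""
    return uri.rstrip("/").rsplit("/", 1)[-1]


if __name__ == "__main__":
    # Something that belongs to both pathways.
    print(field_query("pathway", ["URI1", "URI2"], "AND"))
    # Either or both of two objects, by known URI.
    print(field_query("uri", ["URI1", "URI2"], "OR"))
    print(id_suffix("http://pathwaycommons.org/pc2/Protein_0d4308790e68d98cdb1ce80c706e2e0e"))
```

The resulting string still has to be URL-encoded before it goes into the q parameter.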
This (web service query) could be implemented in one of the following ways, or both:
search?q=uri:"entityUri"&type=pathway (can search by multiple URIs using a q string like '"uri1" "uri2" ...')
traverse?uri=entityUri&uri=entityUri2... (i.e., default to returning all the relevant pathways when the 'path' parameter is missing, instead of returning )
(Or, we could add a new WS endpoint.)
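For the traverse variant, a client would only need to repeat the uri parameter. A minimal Python sketch follows; the helper name `build_traverse_url` and the base URL are assumptions, and omitting 'path' only makes sense if the server adopts the default behavior proposed above.

```python
from urllib.parse import urlencode

# Assumed base URL, by analogy with the /search examples in this thread.
PC2_TRAVERSE = "http://www.pathwaycommons.org/pc2/traverse"


def build_traverse_url(uris, path=None, base=PC2_TRAVERSE):
    """Build a /traverse URL with one uri=... parameter per entity URI.

    Hypothetical helper: 'path' is appended only when given, so that a
    server implementing the proposed default could return the parent pathways.
    """
    params = [("uri", u) for u in uris]
    if path is not None:
        params.append(("path", path))
    # A list of pairs keeps the repeated 'uri' keys in their original order.
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_traverse_url(["entityUri", "entityUri2"]))
```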