Closed kinlane closed 3 years ago
Makes sense. I will be looking more closely at the Search documentation soon, it is a super important part of the spec.
I'm recommending that /search be broken off from HSDA into its own specification. I'd like to break /organizations, /locations, /services into their own microservices, but I feel it's too far down the road for that. However, /search can be severed at this point into a separate concern. Core resources will be all about managing those resources, whereas search will be all about searching across everything. Similar to bulk loading, the separation of concerns will help move projects forward independently, as opposed to a monolith.
@NeilMcKechnie: @greggish suggests that we compare notes on search APIs. I have a design up for our implementation here https://github.com/pg-irc/pathways-backend/issues/163, and I would be very interested in any comments or suggestions you might have on that, as well as anything you can share about how you are building search in your implementation.
Thanks @rasmus-storjohann-PG . I like where you are heading with your concepts. Here's my originally articulated list of gaps with HSDA's Search definition: https://github.com/openreferral/api-specification/issues/62
Regarding your ideas, I would like to take some time to think about them. I tend to prefer putting parameters in the request body for a little more structure and segmentation versus putting them in querystring parameters. It has the added benefit of keeping them out of the URL: HTTPS encrypts both the URL and the body in transit, but URLs are routinely recorded in server logs and browser history, whereas request bodies usually are not. But that is a comment on just one small part of your thorough writeup. Let me try to find some time to dive in more deeply, or perhaps we connect by voice with others interested in advancing the Search definition in HSDA to get some progress on it.
Passing search arguments in the body to keep them confidential is an important point I had not considered. Can it still be GET then? Some say no. Using POST for search feels awkward. I had already seen your comments on #62 and incorporated quite a bit, it was very helpful.
@NeilMcKechnie I just added this to our search story in the pathways repo, duplicating it here:
There are arguments for passing the search parameters in the request body. The main benefit is that they would stay out of the URL, which under HTTPS is encrypted in transit but still tends to end up in server logs and browser history; keeping search terms out of those places is admittedly a very significant advantage. There may also be some benefit to having more structure to the search parameters represented in JSON than in the URL. Since passing search parameters in the body of a GET request is inconsistent with standard GET semantics, we would need to use POST for searches. This means we can't use /services/ as the endpoint for search, since POST to /services/ creates a new service entity. It seems decidedly more intuitive to search using a GET request to /services/ than a POST request to something like /services/search/. The POST approach will also render caching ineffective.
So what is more important, an intuitive and cached API or confidentiality of the search parameters? I'm torn. I guess it's always possible to do both, in which case I'd suggest the POST end point be something like /services/secure_search/ to make the motivation of the two end points clear.
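To make the trade-off concrete, here is a minimal sketch of how the two request shapes differ. It only builds the requests and does not send them. The host name is a placeholder, and the `query` parameter name and the JSON body shape are assumptions for illustration; only the /services/ and /services/secure_search/ paths come from the discussion above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.example.org"  # hypothetical host, for illustration only


def build_get_search(term: str) -> Request:
    """Search via GET: the term travels in the query string of the URL."""
    return Request(f"{BASE_URL}/services/?{urlencode({'query': term})}", method="GET")


def build_post_search(term: str) -> Request:
    """Search via POST: the term travels in a JSON request body instead."""
    body = json.dumps({"query": term}).encode("utf-8")
    return Request(
        f"{BASE_URL}/services/secure_search/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


get_request = build_get_search("food bank")
post_request = build_post_search("food bank")
print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

With GET the search term is visible in the URL itself; with POST the URL carries no search term at all, which is the property the secure search proposal relies on.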
@rasmus-storjohann-PG you raise some very interesting points! I don't have a strong suggestion on which way(s) to go, but lots to consider.
I have some questions in regard to the /search endpoint. As an API consumer, I am unsure what I would expect to get back from that end point. @kinlane says above that it would return organizations, locations, services and contacts. Seems to me that this would necessitate additional logic in the client to interpret the result from the call to figure out what it contains. I would like to see some discussion on how this approach is preferable to exposing search functionality on the /organizations, /locations, /services and /contacts end points. This is the route we have taken in our implementation, and it has been quite straightforward to build without code duplication. When using these end points, it is pretty unambiguous what data they return, so I think the API might be easier to use. I'd also avoid having to write separate response parsing logic for search results. Thoughts?
I have seen the idea of wanting to find anything that has a search term in it ("Google-style"), so that I can search for a term and get back all matching organizations, locations, services and contacts. I think this is a very important use case. However, I think this functionality can also be available on the /organizations, /locations, etc., end points, with the meaning that I want all the e.g. organizations that match the search term, including organizations whose related location, service or contact records contain the term. As a separate concern on the API we can have a mechanism for specifying that these related entities should be included in the response. I think this API would be more predictable and would give the client greater ability to retrieve the data they need and to specify exactly what details they want in the response.
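As a rough illustration of that idea, the sketch below searches organizations by a term and also matches organizations whose related services contain the term. The in-memory data, the function name, and the `include_services` flag are all hypothetical; they stand in for whatever storage and "include related entities" mechanism an implementation would actually use.

```python
# Toy data, invented for this example.
ORGANIZATIONS = [
    {"id": "org-1", "name": "Harbour Food Bank"},
    {"id": "org-2", "name": "Northside Clinic"},
]
SERVICES = [
    {"id": "svc-1", "organization_id": "org-2", "name": "Food voucher program"},
    {"id": "svc-2", "organization_id": "org-2", "name": "Walk-in care"},
]


def search_organizations(term, include_services=False):
    """Return organizations whose own name, or any related service name, contains term."""
    needle = term.lower()
    results = []
    for org in ORGANIZATIONS:
        related = [s for s in SERVICES if s["organization_id"] == org["id"]]
        own_match = needle in org["name"].lower()
        related_match = any(needle in s["name"].lower() for s in related)
        if own_match or related_match:
            record = dict(org)
            if include_services:
                # Related entities are attached only when the caller asks for them.
                record["services"] = related
            results.append(record)
    return results


matches = search_organizations("food", include_services=True)
print([m["id"] for m in matches])
```

The response is always a list of organizations, so the client never has to inspect the payload to find out what kind of record it received.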
Hi @rasmus-storjohann-PG , I don't think this is an either/or scenario. I like the idea of targeting specific object types for search via their endpoints. And I agree there should be the ability to search /all ( or did we decide on /complete ?) as you also point out. There are a lot of use cases that will benefit from either or both approaches.
This is somewhat related to another discussion about what fields are returned in a search, here: https://github.com/openreferral/api-specification/issues/21 In particular I think @klambacher 's answer was a good approach.
Then there's the additional topic of, when you return a particular object in a search, what related objects should also be returned? I'm not finding that one at the moment but all three of these topics are very important to work out and agree upon an implementation.
Hi @NeilMcKechnie . I agree that @klambacher raises some very good points. I think it is a very important observation that different API consumers rarely align on what fields they include in abbreviated responses. I'm worried that that invalidates the approach of requesting summary or complete records, since the summary response with hard-coded fields to include may not be of value to anyone. On the other hand, having to specify the fields to return in every call seems cumbersome, although I haven't worked with it myself so I don't really know.
Customizing this on the account level (or maybe the session) is an interesting approach. I'm wondering how that would work with caching: if two different clients with different default fields set on their accounts make identical GET calls, would caching not cause them both to get the same response? I guess not, since they'd have different tokens/cookies/API keys in the request.
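Whether two clients share a cached response depends on what the cache uses as its key. The sketch below assumes a cache that includes the caller's API key in the key, which is roughly what an HTTP cache does when the response carries `Vary: Authorization`. The function and key names are hypothetical.

```python
import hashlib


def cache_key(method, path, query_string, api_key):
    """Build a cache key that varies with the caller's credentials."""
    raw = "\n".join([method.upper(), path, query_string, api_key])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


key_a = cache_key("GET", "/services/", "query=food", "client-a-key")
key_b = cache_key("GET", "/services/", "query=food", "client-b-key")
key_a_again = cache_key("get", "/services/", "query=food", "client-a-key")
print(key_a != key_b, key_a == key_a_again)
```

If the credentials were left out of the key, the two clients would receive the same cached body regardless of their account-level field defaults.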
I'm definitely interested in seeing the details of how @klambacher's API is used to specify the fields to include in the responses, since I think we will need that in the standard one way or another.
Finally, let's really make sure that we don't just re-invent GraphQL.
I just found out about ALISS in Scotland, which apparently based their data model in part on Open Referral, and also has one of the best documented resource data APIs I've seen yet. Check out their search protocol: https://docs.aliss.org/#service-search
Adding from email from uniteus.com:
The search results should use a consistent hierarchy. My preference is that it will start with organizations as the top level and then services as the 2nd hierarchy level and locations are optionally the 3rd level (in case the organization provides the service at multiple locations). I would not like the added complexity of supporting various hierarchies.
The Taxonomy should be well defined and cannot be left for the implementation to decide. Practically this means to have the ability to retrieve the vocabulary attribute which can be one of several predefined values (e.g. AIRS, ICD10, OpenEligibility, …) and expect the search results to always use the same taxonomy. I do not expect providers to classify themselves using more than a single taxonomy and we, on our side, have to provide a uniform search. Mapping different taxonomies and converting between them is complicated and results in many inaccuracies.
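The hierarchy preference described in this email (organizations, then services, then optional locations) could be sketched as follows. The record shapes and the `build_hierarchy` name are assumptions made for the example, not part of any HSDA definition.

```python
def build_hierarchy(organizations, services, service_locations):
    """Nest services under organizations, and locations (optionally) under services."""
    tree = []
    for org in organizations:
        org_node = {"id": org["id"], "name": org["name"], "services": []}
        for svc in services:
            if svc["organization_id"] != org["id"]:
                continue
            svc_node = {"id": svc["id"], "name": svc["name"]}
            locations = service_locations.get(svc["id"], [])
            if locations:
                # Third level is present only when the service has locations.
                svc_node["locations"] = locations
            org_node["services"].append(svc_node)
        tree.append(org_node)
    return tree


# Toy data, invented for this example.
orgs = [{"id": "org-1", "name": "Harbour Food Bank"}]
svcs = [
    {"id": "svc-1", "organization_id": "org-1", "name": "Food hampers"},
    {"id": "svc-2", "organization_id": "org-1", "name": "Delivery"},
]
locs = {"svc-1": [{"id": "loc-1", "name": "Main site"}, {"id": "loc-2", "name": "East site"}]}

tree = build_hierarchy(orgs, svcs, locs)
print(tree)
```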
Ok, finally making time while traveling this week to load up HSDA + HSDA Search in my head (again), so I can intelligently approach this from the 250K level. I'm working on a more formal response, but wanted to address this thread properly.
First, thank you for the amazing conversation and feedback that has occurred here, and for patience as I'm hustling elsewhere. Lots to think about here.
Next, I'd say I should rebuild the base for this conversation, which will answer some of the questions raised.
While HSDA Search is a separate service, search does exist across core resources -- as @NeilMcKechnie said, this isn't an either/or conversation, and it is my intention to maintain parity between HSDA and HSDA Search, as well as HSDA taxonomy as much as I can.
From HSDA you can search using these paths:
/contacts/ /contacts/complete/
/locations/ /locations/complete/
/organizations/ /organizations/complete/
/services/ /services/complete/
While not present now, there was discussion around extending this to:
/phones/ /programs/
All of these endpoints allow searching via the following two query parameters:
This is the default resource-based search present in the HSDA guidance.
This discussion overlaps with that, but is focused on a separate project for HSDA Search. With the intention of keeping the basic search features available at each individual endpoint, and expanding more advanced search functionality within its own guidance, with just a single endpoint:
/search/
Which spans all the core resources:
Aggregating search across all resources, and something we should consider expanding to contacts, and maybe also phones, programs.
This addresses some of @rasmus-storjohann-PG questions above about where to conduct search.
I really think the discussions above about GET v POST for search were important. Excellent discussion and points made. I am a supporter of query-based search as the default, because it is simpler for the spreadsheet and non-coder audience. I also support a POST body search for its security, but it moves the accessibility of search a shelf or two higher, out of reach of the non-developer or tool user. So I support both: as Rasmus points out, apply /secure_search/, but across the spectrum of:
HSDA:
/contacts/secure_search/ /contacts/secure_search/complete/
/locations/secure_search/ /locations/secure_search/complete/
/organizations/secure_search/ /organizations/secure_search/complete/
/services/secure_search/ /services/secure_search/complete/
HSDA Search
/search/secure_search/
Providing simple, complete, and aggregate access to secure search, something I think will also play into future "saved search" conversations.
There are other unaddressed threads here which I will tackle separately -- I just wanted to process the platform level elements that already exist, and cherry pick the secure search element here.
I want to begin showing the potential of the OpenAPI contracts for HSDA and HSDA Search, helping demonstrate how OpenAPI can help make our conversations more precise.
I'm using @greggish's reference to ALISS above to begin showing the potential for more precise feedback on HSDA and sub-specifications. I took their documentation and generated three separate OpenAPI (fka Swagger) specifications:
I am going to use them as feedback against two of the HSDA OpenAPI contracts:
Using OpenAPI gives me an apples to apples comparison, when considering feedback. I can compare the following elements:
I can also get more precise on media types, enumerators, and other things, making our exchange more productive. Let's get to work on the three specifications above.
This would be a straight comparison with HSDA /services/, but provide considerations that can be applied to HSDA Search.
postcode parameter
- in: query
  required: true
  type: string
  name: postcode
  description: 'The postcode that you wish to find services relevant to.'
my thoughts: seems like this parameter could be added to the /services, /organizations, and /locations paths for HSDA, and to /search for HSDA Search.
q parameter
- in: query
  required: false
  type: string
  name: q
  description: 'This is the keywords with which to do a full text search of the services.'
my thoughts: We use query and queries. So not relevant to consider. Also we don't use acronyms as parameters, it helps reduce confusion.
category parameter
- in: query
  required: false
  type: string
  name: category
  description: 'The category slugs that you wish to filter the search by.'
my thoughts: we do not have the notion of a category anywhere in spec. I believe it used to be in schema? correct me if I'm wrong. Are we going to use taxonomy instead??
location_type parameter
- in: query
  required: false
  type: string
  name: location_type
  description: 'The location type of the resource, either local or national, default searches everything.'
  enum:
    - local
    - national
my thoughts: we do not have the notion of a location_type as part of the location schema. If added to the schema we can add to the search interface for HSDA and HSDA Search.
radius parameter
- in: query
  required: false
  type: string
  name: radius
  description: 'The radius from the postcode that you wish the search to cover, in meters.'
my thoughts: this seems to be part of other service area, and proximity search suggestions. Which seems like it should become a feature of HSDA search, but not HSDA? Will evaluate alongside other threads on this subject.
closing thoughts: The ALISS service definition does not match the HSDA service definition. This is where, if I had more time, I would write a diff script to show me what percentage of HSDA ALISS supports across paths and definitions.
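A diff script of that kind could start as small as the sketch below, which compares the key names under `paths` and `definitions` of two OpenAPI documents loaded as dictionaries. The two dictionaries here are toy placeholders, not the real HSDA or ALISS specifications, and the comparison is by name only; it says nothing about whether the schemas behind matching names agree.

```python
def coverage(reference, candidate, section):
    """Percentage of keys in reference[section] that also appear in candidate[section]."""
    wanted = set(reference.get(section, {}))
    if not wanted:
        return 100.0
    found = wanted & set(candidate.get(section, {}))
    return 100.0 * len(found) / len(wanted)


# Placeholder documents, invented for this example.
hsda = {
    "paths": {"/services/": {}, "/organizations/": {}, "/locations/": {}, "/contacts/": {}},
    "definitions": {"service": {}, "organization": {}, "location": {}, "service_area": {}},
}
other = {
    "paths": {"/services/": {}, "/categories/": {}},
    "definitions": {"service": {}, "service_area": {}, "category": {}},
}

print(coverage(hsda, other, "paths"), coverage(hsda, other, "definitions"))
```

In practice the dictionaries would come from parsing the published OpenAPI files, and a name match such as `service_area` would still need a field-by-field check, as the service area comparison in this thread shows.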
2. Categories
general thoughts: We do not have notion of categories in HSDA. Same as above on category parameter. Are we going to accomplish this with taxonomy?
general thoughts: This applies to /services/{service_id}/service-area/, /services/{service_id}/service-area/{service_area_id}/, /service-area/, /service-area/{service_area_id}/ paths, as well as the service_area definition. There really isn't a clear comparison, as their definition model for service area is:
- service_area:
    description: 'The subcategories'
    properties:
      code:
        description: 'The code of the category.'
        type: string
      type:
        description: 'The type of the category.'
        type: string
      name:
        description: 'The name of the category.'
        type: string
where HSDA is:
- service_area:
    description: 'Details of the geographic area for which a service is available.'
    properties:
      id:
        description: 'Each service area must have a unique identifier.'
        type: string
      service_id:
        description: 'The identifier of the service for which this entry describes the service area.'
        type: string
      service_area:
        description: 'The geographic area where a service is available. This is a free-text description, and so may be precise or indefinite as necessary.'
        type: string
      description:
        description: 'A more detailed description of this service area. Used to provide any additional information that cannot be communicated using the structured area and geometry fields.'
        type: string
    required:
      - id
I do not see anything to consider here. Correct me if I'm wrong.
Feedback Considered
HSDA V1.3
/organizations/
- in: query
  required: true
  type: string
  name: postal_code
  description: 'The postal code to find organizations for.'
/organizations/complete/
- in: query
  required: true
  type: string
  name: postal_code
  description: 'The postal code to find organizations for.'
/locations/
- in: query
  required: true
  type: string
  name: postal_code
  description: 'The postal code to find locations for.'
/locations/complete/
- in: query
  required: true
  type: string
  name: postal_code
  description: 'The postal code to find locations for.'
/services/
- in: query
  required: true
  type: string
  name: postal_code
  description: 'The postal code to find services for.'
/services/complete/
- in: query
  required: true
  type: string
  name: postal_code
  description: 'The postal code to find services for.'
HSDA Search V1.1
/search/
- in: query
  required: true
  type: string
  name: postal_code
  description: 'The postal code to find services for.'
Consider location definitions added to HSDS V1.2:
- in: query
  required: false
  type: string
  name: location_type
  description: 'The location type of the resource, either local or national, default searches everything.'
  enum:
    - local
    - national
Closing Thoughts
Using OpenAPI in this way allows me to make ALISS feedback machine readable. It is something I did with AIRS, ICarol, and Health Leads with the HSDA 1.0 release.
While I can't use everything, I did get a couple of things I think should be added into HSDA 1.3 and HSDA Search 1.1. I can take the parameters as defined as part of the ALISS OpenAPI, and add them to my HSDA 1.3 and HSDA Search 1.1 OpenAPI drafts.
Next, I'll move on to profiling @rasmus-storjohann-PG's Search API master story #163. As with ALISS, it would be nice if an OpenAPI (fka Swagger) definition were already present, but I'll take a crack at creating one based upon the design. Something that will get increasingly difficult with feedback from @NeilMcKechnie and others.
One more set of road map suggestions we might want to consider as part of the next release across the two HSDA definitions, which I extracted from this thread, is @rasmus-storjohann-PG's breakdown of secure search, based upon @NeilMcKechnie's preference for POST body powered search over parameters.
I think @rasmus-storjohann-PG's vision for /services/secure_search/ needed fleshing out and consideration, so I riffed on it in this way, adding a secure-search for the top 3 resources using a POST with body submission.
/organizations/secure-search/:
  post:
    summary: Organizations Secure Search
    description: Search for organizations using a POST of an organization definition using the body.
    operationId: secureSearchOrganizations
    parameters:
      - in: body
        name: body
        schema:
          $ref: '#/definitions/organization_complete'
    responses:
      '200':
        description: Service Response
        schema:
          type: array
          items:
            $ref: "#/definitions/organization_complete"
/locations/secure-search/:
  post:
    summary: Locations Secure Search
    description: Search for locations using a POST of a location definition using the body.
    operationId: secureSearchLocations
    parameters:
      - in: body
        name: body
        schema:
          $ref: '#/definitions/location_complete'
    responses:
      '200':
        description: Service Response
        schema:
          type: array
          items:
            $ref: "#/definitions/location_complete"
/services/secure-search/:
  post:
    summary: Services Secure Search
    description: Search for services using a POST of a service definition using the body.
    operationId: secureSearchServices
    parameters:
      - in: body
        name: body
        schema:
          $ref: '#/definitions/service_complete'
    responses:
      '200':
        description: Service Response
        schema:
          type: array
          items:
            $ref: "#/definitions/service_complete"
I used the "complete" representation for the request and response definitions, allowing for the most robust secure search possible and adding this secure search layer to HSDA.
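The proposal does not say how a server should interpret the posted definition, so the following is only one possible reading: treat every non-empty field in the body as an exact-match filter. The function name, the matching rule, and the sample records are all assumptions for illustration.

```python
def secure_search(records, criteria):
    """Return records whose fields equal every non-empty field supplied in the POST body."""
    # Empty values are treated as "not specified" (an assumption, not part of the proposal).
    wanted = {k: v for k, v in criteria.items() if v not in (None, "", [])}
    return [r for r in records if all(r.get(k) == v for k, v in wanted.items())]


# Toy records, invented for this example.
ORGANIZATION_RECORDS = [
    {"id": "org-1", "name": "Harbour Food Bank", "legal_status": "nonprofit"},
    {"id": "org-2", "name": "Northside Clinic", "legal_status": "nonprofit"},
]

by_status = secure_search(ORGANIZATION_RECORDS, {"legal_status": "nonprofit", "name": ""})
by_name = secure_search(ORGANIZATION_RECORDS, {"name": "Northside Clinic"})
print([r["id"] for r in by_status], [r["id"] for r in by_name])
```

Note that under this reading an empty body matches every record, so an implementation would probably want to reject or paginate that case.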
Here is an OpenAPI for this HSDA Secure Search road map suggestion.
I set out to create an OpenAPI representation of @rasmus-storjohann-PG's Search API master story, but I feel it covers a lot of what has already been covered here, except:
Let me know any other areas I missed. I think the current query, queries, and HSDA resource search (/organizations, /locations, /services) and HSDA Search cover most bases.
Borrowing from @rasmus-storjohann-PG Service search: sort by proximity #170:
We already have /locations/{location_id}/services/ and /locations/{location_id}/services/{service_id}/ which provide access to half of the services_at_location question in HSDA.
However, we were missing the other half of the coin in HSDA, so I propose we add /services/{service_id}/locations/ and /services/{service_id}/locations/{location_id}/ to complete the services_at_location circle of life - here is the OpenAPI for it.
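The two directions of the services_at_location relationship can be sketched as two lookups over the same link records. The link table contents and function names below are hypothetical; the comments name the paths from this post that each lookup would back.

```python
# Toy link records, invented for this example.
SERVICE_AT_LOCATION = [
    {"id": "sal-1", "service_id": "svc-1", "location_id": "loc-1"},
    {"id": "sal-2", "service_id": "svc-1", "location_id": "loc-2"},
    {"id": "sal-3", "service_id": "svc-2", "location_id": "loc-1"},
]


def services_at(location_id):
    """Backs /locations/{location_id}/services/: services offered at one location."""
    return [r["service_id"] for r in SERVICE_AT_LOCATION if r["location_id"] == location_id]


def locations_of(service_id):
    """Backs /services/{service_id}/locations/: locations where one service is offered."""
    return [r["location_id"] for r in SERVICE_AT_LOCATION if r["service_id"] == service_id]


print(services_at("loc-1"), locations_of("svc-1"))
```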
Borrowing from @rasmus-storjohann-PG Service search: sort by proximity #170 as well as @NeilMcKechnie regarding more precision regarding types, I recommend the following parameters be added to:
HSDA /organizations, /locations, /services and /organizations/complete/, /locations/complete/, /services/complete/ - as defined in this OpenAPI:
- in: query
  name: latitude
  required: true
  type: number
  format: float
  description: 'The latitude to search by.'
- in: query
  name: longitude
  required: true
  type: number
  format: float
  description: 'The longitude to search by.'
- in: query
  name: radius
  required: true
  type: integer
  format: int32
  description: 'The radius to search by.'
I liked the proximity search, but we could get more precise with the parameters and data types.
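One way a server might apply these three parameters is a great-circle distance filter. The sketch below uses the haversine formula and assumes the radius is given in meters, which the parameter description above does not state (the ALISS radius parameter is in meters). The function names and sample coordinates are made up for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(locations, latitude, longitude, radius):
    """Return locations within radius meters of the point, nearest first."""
    hits = [
        (haversine_m(latitude, longitude, loc["latitude"], loc["longitude"]), loc)
        for loc in locations
    ]
    return [loc for dist, loc in sorted(hits, key=lambda h: h[0]) if dist <= radius]


# Toy locations, invented for this example.
LOCATIONS = [
    {"id": "far", "latitude": 49.0, "longitude": -123.0},
    {"id": "near", "latitude": 49.29, "longitude": -123.1207},
]

nearby = within_radius(LOCATIONS, 49.2827, -123.1207, 1000)
print([loc["id"] for loc in nearby])
```

Sorting by distance before filtering also gives the "sort by proximity" ordering discussed in #170 without a separate pass.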
Borrowing from @rasmus-storjohann-PG Service search: sort by proximity #170 as well as @NeilMcKechnie regarding more precision regarding types, I recommend the following parameters be added to:
HSDA Search /search/ - as defined in this OpenAPI:
- in: query
  name: latitude
  required: true
  type: number
  format: float
  description: 'The latitude to search by.'
- in: query
  name: longitude
  required: true
  type: number
  format: float
  description: 'The longitude to search by.'
- in: query
  name: radius
  required: true
  type: integer
  format: int32
  description: 'The radius to search by.'
I liked the proximity search, but we could get more precise with the parameters and data types.
From @greggish reference to ALISS above I am adding postal_code to the search layer to HSDA for /organizations, /locations, /services and /organizations/complete, /locations/complete, /services/complete:
- in: query
  required: true
  type: string
  name: postal_code
  description: 'The postal code to search a specific resource for'
You can view the OpenAPI for it here - https://gist.github.com/kinlane/d29bcb807f9a4cf8c66487430845b754
From @greggish reference to ALISS above I am adding postal_code to the search layer to HSDA Search for /search:
- in: query
  required: true
  type: string
  name: postal_code
  description: 'The postal code to search a specific resource for'
You can view the OpenAPI for it here - https://gist.github.com/kinlane/f13133d71a27c0cfe322d07fa187ed0b
Adding programs and contacts to search collection
search:
  properties:
    organizations:
      type: "array"
      items:
        $ref: "#/definitions/organization"
    locations:
      type: "array"
      items:
        $ref: "#/definitions/location"
    services:
      type: "array"
      items:
        $ref: "#/definitions/service"
    contacts:
      type: "array"
      items:
        $ref: "#/definitions/contact"
    programs:
      type: "array"
      items:
        $ref: "#/definitions/program"
Then adding a parameter for selecting the types of resources to return.
- in: query
  type: string
  name: type
  description: The type of resource to return.
  default: all
  enum:
    - all
    - organization
    - location
    - service
    - contact
    - program
Allowing more control over the search results for any HSDA search - the OpenAPI for this feature addition is here - https://gist.github.com/kinlane/00765ecefb966e4fec799b4d66483f48
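A minimal sketch of how the `type` parameter could shape the search collection follows. It assumes that selecting a single type returns only that collection and that `all` returns every collection, empty or not; the proposal does not spell out either behavior, and the function name is hypothetical.

```python
# Maps each value of the type enum to its collection key in the search definition.
RESOURCE_KEYS = {
    "organization": "organizations",
    "location": "locations",
    "service": "services",
    "contact": "contacts",
    "program": "programs",
}


def filter_by_type(search_result, resource_type="all"):
    """Keep only the collection selected by the type parameter."""
    if resource_type == "all":
        return {key: search_result.get(key, []) for key in RESOURCE_KEYS.values()}
    if resource_type not in RESOURCE_KEYS:
        raise ValueError(f"unsupported type: {resource_type!r}")
    key = RESOURCE_KEYS[resource_type]
    return {key: search_result.get(key, [])}


# Toy result, invented for this example.
result = {"organizations": [{"id": "org-1"}], "services": [{"id": "svc-1"}]}
print(filter_by_type(result, "service"))
print(sorted(filter_by_type(result)))
```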
Now that we have HSDA taxonomies for managing the taxonomy, and the ability to add, view, and remove taxonomies from a service, we need to be able to search across core resources using the name of a taxonomy or the taxonomy id:
- in: query
  required: true
  type: string
  name: taxonomy_ids
  description: 'Comma separated list of taxonomy ids to search by.'
- in: query
  required: true
  type: string
  name: taxonomies
  description: 'Comma separated list of taxonomies to search by.'
This returns all organizations, locations, and services by the services they have taxonomies applied to. You can see the OpenAPI for this here - https://gist.github.com/kinlane/ba0de5614f3d2433cf77f0604b77e85a
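Here is a small sketch of how a comma separated `taxonomy_ids` value might be parsed and applied. It assumes "match any of the listed ids" semantics and a `taxonomy_ids` list on each service record; both are illustrative choices, since the parameter description only says the value is a comma separated list.

```python
def parse_csv(value):
    """Split a comma separated parameter value, dropping blanks and whitespace."""
    return [item.strip() for item in value.split(",") if item.strip()]


def services_with_taxonomy(services, taxonomy_ids):
    """Return services tagged with at least one of the requested taxonomy ids."""
    wanted = set(parse_csv(taxonomy_ids))
    return [s for s in services if wanted & set(s.get("taxonomy_ids", []))]


# Toy records, invented for this example.
SERVICE_RECORDS = [
    {"id": "svc-1", "organization_id": "org-1", "taxonomy_ids": ["tax-food"]},
    {"id": "svc-2", "organization_id": "org-2", "taxonomy_ids": ["tax-housing", "tax-legal"]},
    {"id": "svc-3", "organization_id": "org-2", "taxonomy_ids": []},
]

hits = services_with_taxonomy(SERVICE_RECORDS, "tax-food, tax-legal")
# Organizations are found through the services that carry the taxonomy.
organization_ids = sorted({s["organization_id"] for s in hits})
print([s["id"] for s in hits], organization_ids)
```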
Now that we have HSDA taxonomies for managing the taxonomy, the ability to add, view, and remove taxonomies from a service, and the ability to search across core resources using taxonomy, we need to add the same to HSDA Search.
- in: query
  required: true
  type: string
  name: taxonomy_ids
  description: 'Comma separated list of taxonomy ids to search by.'
- in: query
  required: true
  type: string
  name: taxonomies
  description: 'Comma separated list of taxonomies to search by.'
Allowing the search across all organizations, locations, services, and programs using taxonomies - https://gist.github.com/kinlane/2042c0b2e6a50cddf0039d4bb144cb50
Hi @kinlane , quite a flurry of mostly long posts today. Would it be possible to synthesize all of this into a newly revised document or draft specification for the many of us with continued interest? Or maybe these various subtopics need to each be broken out into their own discussion items?
Yes, each area has its own OpenAPI, and individual issue, with all aggregated here - https://github.com/openreferral/api-specification/issues/84
Working on draft OpenAPI for HSDA v1.3 and HSDA Search v1.1, but had to put down after sprint.
@kinlane can you point me to the issue or an example that has service_at_locations as a top level JSON object in the response? We're currently working on our search API and have found that service_at_location is most consistent with the data we have and semantics we're adapting from other systems. Right now we're planning to return a search response with service_at_locations collection as a top level object along with locations and services collections, which will contain the services and locations linked to from the service_at_locations. Does that make sense? Where's the best place to continue a conversation about that idea?
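The response shape described in this question could look something like the sketch below: the matched service_at_locations records at the top level, plus de-duplicated services and locations collections holding only the records those links point to. The helper name and the lookup dictionaries are hypothetical.

```python
def build_search_response(matches, services_by_id, locations_by_id):
    """Assemble a response keyed by service_at_locations, with linked records alongside."""
    service_ids = sorted({m["service_id"] for m in matches})
    location_ids = sorted({m["location_id"] for m in matches})
    return {
        "service_at_locations": matches,
        "services": [services_by_id[i] for i in service_ids],
        "locations": [locations_by_id[i] for i in location_ids],
    }


# Toy data, invented for this example.
matched_links = [
    {"id": "sal-1", "service_id": "svc-1", "location_id": "loc-1"},
    {"id": "sal-2", "service_id": "svc-1", "location_id": "loc-2"},
]
all_services = {"svc-1": {"id": "svc-1", "name": "Food hampers"}}
all_locations = {
    "loc-1": {"id": "loc-1", "name": "Main site"},
    "loc-2": {"id": "loc-2", "name": "East site"},
}

response = build_search_response(matched_links, all_services, all_locations)
print(response)
```

Each service and location appears once even when several links refer to it, so clients resolve the references by id rather than receiving repeated nested copies.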
@switzersc check out #87?
Ok. New search OpenAPI is ready for comment. We can start new thread - https://www.postman.com/api-evangelist/workspace/open-referral-human-services-data-api-hsda/documentation/35240-e8cdaa8c-5444-4722-a6fd-181f120d49f3
I will be moving the /search option beyond just returning locations, and giving it a collection of organizations, locations, services, and contacts.
I will also give /search an /everything, so that we can have both simple and expanded results.