openreferral / api-specification

This is the working repository for Open Referral's Human Services Data API protocols.
https://openreferral.readthedocs.io/en/latest/hsda/

/search #46

Closed kinlane closed 3 years ago

kinlane commented 7 years ago

I will be moving the /search option beyond just returning locations, and give it a collection of organizations, locations, services, and contacts.

I will also give /search an /everything, so that we can have both simple and expanded results.

NeilMcKLogic commented 7 years ago

Makes sense. I will be looking more closely at the Search documentation soon, it is a super important part of the spec.

kinlane commented 7 years ago

I'm recommending that /search be broken off from HSDA into its own specification. I'd like to break /organizations, /locations, /services into their own microservices, but I feel it's too far down the road for that. However, /search can be severed at this point into a separate concern. Core resources will be all about managing those resources, whereas search will be all about searching across everything. Similar to bulk loading, the separation of concerns will help move projects forward independently, as opposed to a monolith.

rasmus-storjohann-PG commented 6 years ago

@NeilMcKechnie: @greggish suggests that we compare notes on search APIs. I have a design up for our implementation here https://github.com/pg-irc/pathways-backend/issues/163, and I would be very interested in any comments or suggestions you might have on that, as well as anything you can share about how you are building search in your implementation.

NeilMcKLogic commented 6 years ago

Thanks @rasmus-storjohann-PG . I like where you are heading with your concepts. Here's my originally articulated list of gaps with HSDA's Search definition: https://github.com/openreferral/api-specification/issues/62

Regarding your ideas, I would like to take some time to think about them. I tend to prefer putting parameters in the request body for a little more structure and segmentation versus putting them in querystring parameters. It has the added benefit of being encrypted when using HTTPS, whereas URLs are not encrypted. But that is a comment on just one small part of your thorough writeup. Let me try to find some time to dive in more deeply, or perhaps we connect by voice with others interested in advancing the Search definition in HSDA to get some progress on it.

rasmus-storjohann-PG commented 6 years ago

Passing search arguments in the body so it's encrypted is an important point I had not considered. Can it still be GET then? Some say no. Using POST for search feels awkward. I had already seen your comments on #62 and incorporated quite a bit, it was very helpful.

rasmus-storjohann-PG commented 6 years ago

@NeilMcKechnie I just added this to our search story in the pathways repo, duplicating it here:

There are arguments for passing the search parameters in the request body. The main benefit is that they would be encrypted over HTTPS, which is admittedly a very significant advantage. There may also be some benefit to having more structure to the search parameters represented in JSON than in the URL. Since passing search parameters in the body of a GET request is inconsistent with standard GET semantics, we would need to use POST for searches. This means we can't use /services/ as the endpoint for search, since POST to /services/ creates a new service entity. It seems decidedly more intuitive to search using a GET request to /services/ than a POST request to something like /services/search/. The POST approach will also render caching ineffective.

So what is more important, an intuitive and cached API or confidentiality of the search parameters? I'm torn. I guess it's always possible to do both, in which case I'd suggest the POST end point be something like /services/secure_search/ to make the motivation of the two end points clear.
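
To make the trade-off concrete, here is a minimal Python sketch of the same search expressed both ways. The host `api.example.org` and the search term are illustrative assumptions, and `/services/secure_search/` is only the path floated in this thread, not something defined by HSDA. One caveat worth keeping in mind: HTTPS encrypts the path and query string in transit as well, so the practical exposure of GET parameters is mostly that full URLs tend to end up in server logs, proxies, and browser history, while request bodies usually do not.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE = "https://api.example.org"  # hypothetical host, for illustration only


def build_get_search(params: dict) -> Request:
    """Parameters travel in the URL: cache-friendly, but visible wherever URLs are recorded."""
    return Request(f"{BASE}/services/?{urlencode(params)}", method="GET")


def build_post_search(params: dict) -> Request:
    """Parameters travel in a JSON body: kept out of the URL, but not cached by default."""
    body = json.dumps(params).encode("utf-8")
    return Request(
        f"{BASE}/services/secure_search/",  # path suggested in this thread (assumption)
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


params = {"query": "food bank"}
get_req = build_get_search(params)
post_req = build_post_search(params)

# Nothing is sent over the network here; the requests are only constructed.
print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.full_url, post_req.data)
```

Neither request is actually sent, so the sketch only shows where the search term ends up in each style.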

NeilMcKLogic commented 6 years ago

@rasmus-storjohann-PG you raise some very interesting points! I don't have a strong suggestion on which way(s) to go, but lots to consider.

rasmus-storjohann-PG commented 6 years ago

I have some questions in regard to the /search endpoint. As an API consumer, I am unsure what I would expect to get back from that end point. @kinlane says above that it would return organizations, locations, services and contacts. Seems to me that this would necessitate additional logic in the client to interpret the result from the call to figure out what it contains. I would like to see some discussion on how this approach is preferable to exposing search functionality on the /organizations, /locations, /services and /contacts end points. This is the route we have taken in our implementation, and it has been quite straightforward to build without code duplication. When using these end points, it is pretty unambiguous what data they return, so I think the API might be easier to use. I'd also avoid having to write separate response parsing logic for search results. Thoughts?

I have seen the idea of wanting to find anything that has a search term in it ("Google-style"), so that I can search for a term and get back all matching organizations, locations, services and contacts. I think this is a very important use case. However, I think this functionality can also be available on the /organizations, /locations, etc., end points, with the meaning that I want all the e.g. organizations that match the search term, including organizations whose related location, service or contact records contain the term. As a separate concern on the API we can have a mechanism for specifying that these related entities should be included in the response. I think this API would be more predictable and would give the client greater ability to retrieve the data they need and to specify exactly what details they want in the response.
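
A rough Python sketch of that idea, assuming an in-memory data set: the `Organization` and `Service` classes, the `search_organizations` function, and the `include_services` flag are all hypothetical names for illustration, not part of HSDA.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    name: str


@dataclass
class Organization:
    name: str
    services: list = field(default_factory=list)


def search_organizations(orgs, term, include_services=False):
    """Return organizations whose own name, or a related service name, contains the term."""
    term = term.lower()
    results = []
    for org in orgs:
        own_match = term in org.name.lower()
        related_match = any(term in s.name.lower() for s in org.services)
        if own_match or related_match:
            record = {"name": org.name}
            if include_services:  # related entities are returned only on request
                record["services"] = [s.name for s in org.services]
            results.append(record)
    return results


orgs = [
    Organization("Helping Hands", [Service("Food bank")]),
    Organization("City Library", [Service("Book lending")]),
]
print(search_organizations(orgs, "food"))
print(search_organizations(orgs, "food", include_services=True))
```

The response is always a list of organizations, so the client never has to work out which kind of record it received.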

NeilMcKLogic commented 6 years ago

Hi @rasmus-storjohann-PG , I don't think this is an either/or scenario. I like the idea of targeting specific object types for search via their endpoints. And I agree there should be the ability to search /all ( or did we decide on /complete ?) as you also point out. There are a lot of use cases that will benefit from either or both approaches.

This is somewhat related to another discussion about what fields are returned in a search, here: https://github.com/openreferral/api-specification/issues/21 In particular I think @klambacher 's answer was a good approach.

Then there's the additional topic of, when you return a particular object in a search, what related objects should also be returned? I'm not finding that one at the moment but all three of these topics are very important to work out and agree upon an implementation.

rasmus-storjohann-PG commented 6 years ago

Hi @NeilMcKechnie . I agree that @klambacher raises some very good points. I think it is a very important observation that different API consumers rarely align on what fields they include in abbreviated responses. I'm worried that that invalidates the approach of requesting summary or complete records, since the summary response with hard-coded fields to include may not be of value to anyone. On the other hand, having to specify the fields to return in every call seems cumbersome, although I haven't worked with it myself so I don't really know.

Customizing this on the account level (or maybe the session) is an interesting approach. I'm wondering how that would work with caching: if two different clients with different default fields set on their accounts make identical GET calls, would caching not cause them both to get the same response? I guess not, since they'd have different tokens/cookies/API keys in the request.
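
One way to picture that: if the cache key includes the caller's credential, two accounts with different default fields never share a cached entry. This is a minimal sketch assuming a server-side cache keyed this way; `cache_key` is a hypothetical helper, and real HTTP caches rely on headers such as `Vary` and `Cache-Control`, which are not modeled here.

```python
import hashlib


def cache_key(method: str, url: str, credential: str) -> str:
    """Derive a cache key from the request line plus the caller's credential."""
    raw = "\n".join([method.upper(), url, credential])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


url = "https://api.example.org/services/?query=food"  # hypothetical URL
key_a = cache_key("GET", url, "token-for-account-a")
key_b = cache_key("GET", url, "token-for-account-b")

# Identical GET calls from different accounts map to different cache entries.
print(key_a == key_b)
```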

I'm definitely interested in seeing the details of how @klambacher's API is used to specify the fields to include in the responses, since I think we will need that in the standard one way or another.

Finally, let's really make sure that we don't just re-invent GraphQL.

greggish commented 6 years ago

I just found out about ALISS in Scotland, which apparently based their data model in part on Open Referral, and also has one of the best documented resource data APIs I've seen yet. Check out their search protocol: https://docs.aliss.org/#service-search

kinlane commented 6 years ago

Adding from email from uniteus.com:

  1. The search results should use a consistent hierarchy. My preference is that it will start with organizations as the top level and then services as the 2nd hierarchy level and locations are optionally the 3rd level (in case the organization provides the service at multiple locations). I would not like the added complexity of supporting various hierarchies.

  2. The Taxonomy should be well defined and cannot be left for the implementation to decide. Practically this means to have the ability to retrieve the vocabulary attribute which can be one of several predefined values (e.g. AIRS, ICD10, OpenEligibility, …) and expect the search results to always use the same taxonomy. I do not expect providers to classify themselves using more than a single taxonomy and we, on our side, have to provide a uniform search. Mapping different taxonomies and converting between them is complicated and results in many inaccuracies.
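
The fixed hierarchy in point 1 can be sketched in Python as a grouping step over flat result rows. This is only an illustration: the `nest_results` helper and the sample names are hypothetical, and a service with no location is assumed to be represented by `None`.

```python
from collections import defaultdict


def nest_results(rows):
    """Group flat (organization, service, location) rows as organization -> service -> locations."""
    tree = defaultdict(lambda: defaultdict(list))
    for org, service, location in rows:
        locations = tree[org][service]  # creates the service entry even without a location
        if location is not None and location not in locations:
            locations.append(location)
    return {org: dict(services) for org, services in tree.items()}


rows = [
    ("Food Bank Society", "Hamper pickup", "Main St"),
    ("Food Bank Society", "Hamper pickup", "Oak Ave"),
    ("Food Bank Society", "Delivery", None),  # optional third level is absent
]
nested = nest_results(rows)
print(nested)
```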

kinlane commented 5 years ago

Ok, finally making time while traveling this week to load up HSDA + HSDA Search in my head (again), so I can intelligently approach this from the 250K level. I'm working on a more formal response, but wanted to address this thread properly.

First, thank you for the amazing conversation and feedback that has occurred here, and for patience as I'm hustling elsewhere. Lots to think about here.

Next, I'd say I should rebuild the base for this conversation, which will answer some of the questions raised.

While HSDA Search is a separate service, search does exist across core resources -- as @NeilMcKechnie said, this isn't an either/or conversation, and it is my intention to maintain parity between HSDA and HSDA Search, as well as HSDA taxonomy as much as I can.

From HSDA you can search using these paths:

/contacts/ /contacts/complete/

/locations/ /locations/complete/

/organizations/ /organizations/complete/

/services/ /services/complete/

While not present now, there was discussion around extending this to:

/phones/ /programs/

All of these endpoints allow searching via the following two query parameters: query and queries.

This is the default resource-based search present in the HSDA guidance.

This discussion overlaps with that, but is focused on a separate project for HSDA Search, with the intention of keeping the basic search features available at each individual endpoint and expanding more advanced search functionality within its own guidance, with just a single endpoint:

/search/

Which spans all the core resources:

Aggregating search across all resources, and something we should consider expanding to contacts, and maybe also phones, programs.

This addresses some of @rasmus-storjohann-PG questions above about where to conduct search.

I really think the discussions above about GET vs. POST for search were important. Excellent discussion and points made. I am a supporter of query-based search as the default, because it is simpler for the spreadsheet and non-coder audience. I also support a POST body search for its security, but it moves the accessibility of search a shelf or two higher, out of reach of the non-developer or tool user. So I support both: as @rasmus-storjohann-PG points out, apply /secure_search/, but across the spectrum of:

HSDA:

/contacts/secure_search/ /contacts/secure_search/complete/

/locations/secure_search/ /locations/secure_search/complete/

/organizations/secure_search/ /organizations/secure_search/complete/

/services/secure_search/ /services/secure_search/complete/

HSDA Search

/search/secure_search/

Providing simple, complete, and aggregate access to secure search, something I think will also play into future "saved search" conversations.

There are other unaddressed threads here which I will tackle separately -- I just wanted to process the platform level elements that already exist, and cherry pick the secure search element here.

kinlane commented 5 years ago

I want to begin showing the potential of the OpenAPI contracts for HSDA and HSDA Search, helping demonstrate how OpenAPI can help make our conversations more precise.

I'm using @greggish's reference to ALISS above to begin showing the potential for more precise feedback on HSDA and sub-specifications. I took their documentation and generated three separate OpenAPI (fka Swagger) specifications:

  1. Service Search
  2. Categories
  3. Service Areas

I am going to use them as feedback against two of the HSDA OpenAPI contracts:

Using OpenAPI gives me an apples to apples comparison, when considering feedback. I can compare the following elements:

I can also get more precise on media types, enumerators, and other things, making our exchange more productive. Let's get to work on the three specifications above.

1. Service Search

This would be a straight comparison with HSDA /services/, but provide considerations that can be applied to HSDA Search.

postcode parameter

    - in: query
      required: true
      type: string
      name: postcode
      description: 'The postcode that you wish to find services relevant to.'

my thoughts: seems like this parameter could be added to the /services, /organizations, and /locations paths for HSDA, and to /search for HSDA Search.

q parameter

    - in: query
      required: false
      type: string
      name: q
      description: 'This is the keywords with which to do a full text search of the services.'   

my thoughts: We use query and queries. So not relevant to consider. Also we don't use acronyms as parameters, it helps reduce confusion.

category parameter

    - in: query
      required: false
      type: string
      name: category
      description: 'The category slugs that you wish to filter the search by.' 

my thoughts: we do not have the notion of a category anywhere in spec. I believe it used to be in schema? correct me if I'm wrong. Are we going to use taxonomy instead??

location_type parameter

    - in: query
      required: false
      type: string
      name: location_type
      description: 'The location type of the resource, either local or national, default searches everything.' 
      enum:
        - local
        - national

my thoughts: we do not have the notion of a location_type as part of the location schema. If added to the schema we can add to the search interface for HSDA and HSDA Search.

radius parameter

    - in: query
      required: false
      type: string
      name: radius
      description: 'The radius from the postcode that you wish the search to cover, in meters.'   

my thoughts: this seems to be part of other service area, and proximity search suggestions. Which seems like it should become a feature of HSDA search, but not HSDA? Will evaluate alongside other threads on this subject.

closing thoughts: The ALISS service definition does not match the HSDA service definition. This is where, if I had more time, I would write a diff script to show me what percentage of HSDA ALISS supports across paths and definitions.

2. Categories

general thoughts: We do not have the notion of categories in HSDA. Same as above on the category parameter. Are we going to accomplish this with taxonomy?

3. Service Areas

general thoughts: This applies to /services/{service_id}/service-area/, /services/{service_id}/service-area/{service_area_id}/, /service-area/, /service-area/{service_area_id}/ paths, as well as the service_area definition. There really isn't a clear comparison, as their definition model for service area is:

- service_area:
    description: 'The subcategories'
    properties:
      code:
        description: 'The code of the category.'
        type: string
      type:
        description: 'The type of the category.'
        type: string
      name:
        description: 'The name of the category.'
        type: string

where HSDA is:

- service_area:
    description: 'Details of the geographic area for which a service is available.'
    properties:
      id:
        description: 'Each service area must have a unique identifier.'
        type: string
      service_id:
        description: 'The identifier of the service for which this entry describes the service area.'
        type: string
      service_area:
        description: 'The geographic area where a service is available. This is a free-text description, and so may be precise or indefinite as necessary.'
        type: string
      description:
        description: 'A more detailed description of this service area. Used to provide any additional information that cannot be communicated using the structured area and geometry fields.'
        type: string
    required:
      - id

I do not see anything to consider here. Correct me if I'm wrong.
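
For what it's worth, the diff-script idea mentioned earlier could start as small as this Python sketch, which compares only the property names of the two service_area definitions shown above. The `property_overlap` helper is hypothetical, and a name-only comparison ignores types and meaning.

```python
# Property names copied from the two service_area definitions above.
ALISS_SERVICE_AREA = {"code", "type", "name"}
HSDA_SERVICE_AREA = {"id", "service_id", "service_area", "description"}


def property_overlap(theirs: set, ours: set):
    """Return the shared property names and the fraction of our properties they cover."""
    shared = theirs & ours
    coverage = len(shared) / len(ours) if ours else 0.0
    return shared, coverage


shared, coverage = property_overlap(ALISS_SERVICE_AREA, HSDA_SERVICE_AREA)
print(sorted(shared), f"{coverage:.0%}")
```

The two definitions share no property names, which matches the conclusion that there is no clear comparison.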

Feedback Considered

HSDA V1.3

/organizations/

    - in: query
      required: true
      type: string
      name: postal_code
      description: 'The postal code to find organizations for.'    

/organizations/complete/

    - in: query
      required: true
      type: string
      name: postal_code
      description: 'The postal code to find organizations for.'    

/locations/

    - in: query
      required: true
      type: string
      name: postal_code
      description: 'The postal code to find locations for.'    

/locations/complete/

    - in: query
      required: true
      type: string
      name: postal_code
      description: 'The postal code to find locations for.'  

/services/

    - in: query
      required: true
      type: string
      name: postal_code
      description: 'The postal code to find services for.'    

/services/complete/

    - in: query
      required: true
      type: string
      name: postal_code
      description: 'The postal code to find services for.'    

HSDA Search V1.1

/search/

    - in: query
      required: true
      type: string
      name: postal_code
      description: 'The postal code to find services for.'    

Consider location definitions added to HSDS V1.2:

    - in: query
      required: false
      type: string
      name: location_type
      description: 'The location type of the resource, either local or national, default searches everything.' 
      enum:
        - local
        - national

Closing Thoughts

Using OpenAPI in this way allows me to make ALISS feedback machine readable. It is something I did with AIRS, ICarol, and Health Leads with the HSDA 1.0 release.

While I can't use everything, I did get a couple of things I think should be added into HSDA 1.3 and HSDA Search 1.1. I can take the parameters as defined as part of the ALISS OpenAPI, and add them to my HSDA 1.3 and HSDA Search 1.1 OpenAPI drafts.

Next, I'll move on to profiling @rasmus-storjohann-PG's Search API master story #163. As with ALISS, it would be nice if there were an OpenAPI (fka Swagger) definition present already, but I'll take a crack at creating one based upon the design. This will get increasingly difficult with feedback from @NeilMcKechnie and others.

kinlane commented 5 years ago

One more set of road map suggestions we might want to consider as part of the next release across the two HSDA definitions, which I extracted from this thread, is @rasmus-storjohann-PG's breakdown of the secure search, based upon @NeilMcKechnie's preference for the POST > Body powered search over parameters.

I think @rasmus-storjohann-PG's vision for /services/secure_search/ needed fleshing out and consideration, so I riffed on it in this way, adding a secure-search for the top 3 resources using a POST w/ body submission.

/organizations/secure-search/:
 post:
    summary: Organizations Secure Search
    description: Search for organizations using a POST of an organization definition using the body.
    operationId: secureSearchOrganizations
    parameters:
      - in: body
        name: body
        schema:
          $ref: '#/definitions/organization_complete'
    responses:
      '200':
        description: Service Response
        schema:
          type: array
          items:
            $ref: "#/definitions/organization_complete"   
/locations/secure-search/:
 post:
    summary: Locations Secure Search
    description: Search for locations using a POST of a location definition using the body.
    operationId: secureSearchLocations
    parameters:
      - in: body
        name: body
        schema:
          $ref: '#/definitions/location_complete'
    responses:
      '200':
        description: Service Response
        schema:
          type: array
          items:
            $ref: "#/definitions/location_complete"              
/services/secure-search/:
 post:
    summary: Services Secure Search
    description: Search for services using a POST of a service definition using the body.
    operationId: secureSearchServices
    parameters:
      - in: body
        name: body
        schema:
          $ref: '#/definitions/service_complete'
    responses:
      '200':
        description: Service Response
        schema:
          type: array
          items:
            $ref: "#/definitions/service_complete"

I used the "complete" representation for the request and response definition, allowing for the most robust secure search possible and adding this secure search layer to HSDA.

Here is an OpenAPI for this HSDA Secure Search road map suggestion.
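
How the server would interpret the posted definition is not spelled out above. As one possible reading, here is a minimal Python sketch that treats the body as a query by example: every non-empty field in the posted definition must appear, case-insensitively, in the stored record. The function names and the matching rule are assumptions for illustration only.

```python
def matches_example(record: dict, example: dict) -> bool:
    """True when every non-empty example field is contained (case-insensitively) in the record."""
    for key, wanted in example.items():
        if wanted in (None, ""):
            continue  # unset fields do not constrain the search
        actual = record.get(key)
        if actual is None or str(wanted).lower() not in str(actual).lower():
            return False
    return True


def secure_search(records: list, example: dict) -> list:
    """Filter records against the partial definition that was posted in the body."""
    return [r for r in records if matches_example(r, example)]


organizations = [
    {"name": "Helping Hands", "description": "Food bank and meals"},
    {"name": "City Library", "description": "Book lending"},
]
print(secure_search(organizations, {"description": "food"}))
```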

kinlane commented 5 years ago

I set out to create an OpenAPI representation of @rasmus-storjohann-PG's Search API master story, but I feel it covers a lot of what has already been covered here, except:

Let me know any other areas I missed. I think the current query, queries, and HSDA resource search (/organizations, /locations, /services) and HSDA Search cover most bases.

kinlane commented 5 years ago

Borrowing from @rasmus-storjohann-PG Service search: sort by proximity #170:

We already have /locations/{location_id}/services/ and /locations/{location_id}/services/{service_id}/ which provide access to half of the services_at_location question in HSDA.

However, we were missing the other half of the coin in HSDA, so I propose we add /services/{service_id}/locations/ and /services/{service_id}/locations/{location_id}/ to complete the services_at_location circle of life - here is the OpenAPI for it.
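
As a small illustration of why both directions fall out of the same data, here is a Python sketch over a service_at_location style link table. The sample ids and the two helper functions are hypothetical.

```python
# Each row links one service to one location (sample data, for illustration only).
SERVICE_AT_LOCATION = [
    {"service_id": "svc-1", "location_id": "loc-1"},
    {"service_id": "svc-1", "location_id": "loc-2"},
    {"service_id": "svc-2", "location_id": "loc-1"},
]


def services_at(location_id, links=SERVICE_AT_LOCATION):
    """Back /locations/{location_id}/services/: services offered at one location."""
    return [link["service_id"] for link in links if link["location_id"] == location_id]


def locations_of(service_id, links=SERVICE_AT_LOCATION):
    """Back /services/{service_id}/locations/: locations where one service is offered."""
    return [link["location_id"] for link in links if link["service_id"] == service_id]


print(services_at("loc-1"))
print(locations_of("svc-1"))
```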

kinlane commented 5 years ago

Borrowing from @rasmus-storjohann-PG Service search: sort by proximity #170 as well as @NeilMcKechnie regarding more precision regarding types, I recommend the following parameters be added to:

HSDA /organizations, /locations, /services and /organizations/complete/, /locations/complete/, /services/complete/ - as defined in this OpenAPI:

    - in: query
      name: latitude
      required: true
      type: number
      format: float
      description: 'The latitude to search by.'
    - in: query
      name: longitude
      required: true
      type: number
      format: float
      description: 'The longitude to search by.'   
    - in: query
      name: radius        
      required: true
      type: integer
      format: int32
      description: 'The radius to search by.'   

I liked the proximity search, but we could get more precise with more specific parameters and data types.
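
A rough sketch of how a server might apply these three parameters, using the haversine great-circle formula. The unit of `radius` is not stated above, so meters are assumed here (as in the ALISS radius parameter); the helper names and sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(locations, latitude, longitude, radius):
    """Keep locations whose distance from the search point is at most radius (meters, assumed)."""
    return [
        loc for loc in locations
        if haversine_m(latitude, longitude, loc["latitude"], loc["longitude"]) <= radius
    ]


locations = [
    {"name": "Near", "latitude": 49.285, "longitude": -123.12},
    {"name": "Far", "latitude": 49.28, "longitude": -122.0},
]
print(within_radius(locations, latitude=49.28, longitude=-123.12, radius=5000))
```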

kinlane commented 5 years ago

Borrowing from @rasmus-storjohann-PG Service search: sort by proximity #170 as well as @NeilMcKechnie regarding more precision regarding types, I recommend the following parameters be added to:

HSDA Search /search/ - as defined in this OpenAPI:

    - in: query
      name: latitude
      required: true
      type: number
      format: float
      description: 'The latitude to search by.'
    - in: query
      name: longitude
      required: true
      type: number
      format: float
      description: 'The longitude to search by.'   
    - in: query
      name: radius        
      required: true
      type: integer
      format: int32
      description: 'The radius to search by.'   

I liked the proximity search, but we could get more precise with more specific parameters and data types.

kinlane commented 5 years ago

From @greggish reference to ALISS above I am adding postal_code to the search layer to HSDA for /organizations, /locations, /services and /organizations/complete, /locations/complete, /services/complete:

    - in: query
      required: true
      type: string
      name: postal_code
      description: 'The postal code to search a specific resource for'

You can view the OpenAPI for it here - https://gist.github.com/kinlane/d29bcb807f9a4cf8c66487430845b754

kinlane commented 5 years ago

From @greggish reference to ALISS above I am adding postal_code to the search layer to HSDA Search for /search:

    - in: query
      required: true
      type: string
      name: postal_code
      description: 'The postal code to search a specific resource for'

You can view the OpenAPI for it here - https://gist.github.com/kinlane/f13133d71a27c0cfe322d07fa187ed0b https://gist.github.com/kinlane/d29bcb807f9a4cf8c66487430845b754

kinlane commented 5 years ago

Adding programs and contacts to search collection

search:
  properties:
    organizations:
      type: "array"
      items:
        $ref: "#/definitions/organization"
    locations:
      type: "array"
      items:
        $ref: "#/definitions/location"
    services:
      type: "array"
      items:
        $ref: "#/definitions/service"
    contacts:
      type: "array"
      items:
        $ref: "#/definitions/contact"
    programs:
      type: "array"
      items:
        $ref: "#/definitions/program"  

Then adding a parameter for selecting the types of resources to return.

    - in: query
      type: string
      name: type
      description: The type of resource to return.
      default: all
      enum:
        - all
        - organization
        - location
        - service
        - contact
        - program   

Allowing more control over the search results for any HSDA search - the OpenAPI for this feature addition is here - https://gist.github.com/kinlane/00765ecefb966e4fec799b4d66483f48
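
A minimal Python sketch of how the `type` parameter could narrow the search collection. Whether unselected collections are omitted or returned empty is not specified above; this sketch omits them, and `filter_by_type` is a hypothetical helper.

```python
# Maps each enum value of the type parameter to its key in the search collection.
RESOURCE_KEYS = {
    "organization": "organizations",
    "location": "locations",
    "service": "services",
    "contact": "contacts",
    "program": "programs",
}


def filter_by_type(search_result: dict, type_: str = "all") -> dict:
    """Return the whole collection for 'all', otherwise only the requested resource list."""
    if type_ == "all":
        return search_result
    if type_ not in RESOURCE_KEYS:
        raise ValueError(f"unsupported type: {type_!r}")
    key = RESOURCE_KEYS[type_]
    return {key: search_result.get(key, [])}


result = {
    "organizations": [{"name": "Helping Hands"}],
    "services": [{"name": "Food bank"}],
}
print(filter_by_type(result, "service"))
```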

kinlane commented 5 years ago

Now that we have HSDA taxonomies for managing the taxonomy, and the ability to add, view, and remove taxonomies from a service, we need to be able to search across core resources using the name of the taxonomy or the taxonomy id:

    - in: query
      required: true
      type: string
      name: taxonomy_ids
      description: 'Comma separated list of taxonomy ids to search by.'
    - in: query
      required: true
      type: string
      name: taxonomies
      description: 'Comma separated list of taxonomies to search by.'  

This returns all organizations, locations, and services based on the services that have those taxonomies applied. You can see the OpenAPI for this here - https://gist.github.com/kinlane/ba0de5614f3d2433cf77f0604b77e85a
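
A small Python sketch of handling the comma-separated `taxonomy_ids` value. It assumes a service matches when it carries any of the requested ids (match-all would be an equally plausible reading), and the `taxonomy_ids` field on the sample records is a hypothetical shape, not the HSDA schema.

```python
def parse_csv_param(value: str) -> list:
    """Split a comma-separated query parameter, dropping blanks and surrounding whitespace."""
    return [item.strip() for item in value.split(",") if item.strip()]


def services_with_taxonomies(services: list, taxonomy_ids: str) -> list:
    """Keep services tagged with at least one of the requested taxonomy ids (assumed rule)."""
    wanted = set(parse_csv_param(taxonomy_ids))
    return [s for s in services if wanted & set(s.get("taxonomy_ids", []))]


services = [
    {"name": "Food bank", "taxonomy_ids": ["t1", "t2"]},
    {"name": "Book lending", "taxonomy_ids": ["t3"]},
    {"name": "Untagged"},
]
print(services_with_taxonomies(services, "t1, t9"))
```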

kinlane commented 5 years ago

Now that we have HSDA taxonomies for managing the taxonomy, the ability to add, view, and remove taxonomies from a service, and the ability to search across core resources using taxonomy, we need to add the same to HSDA Search.

    - in: query
      required: true
      type: string
      name: taxonomy_ids
      description: 'Comma separated list of taxonomy ids to search by.'
    - in: query
      required: true
      type: string
      name: taxonomies
      description: 'Comma separated list of taxonomies to search by.'  

Allowing the search across all organizations, locations, services, and programs using taxonomies - https://gist.github.com/kinlane/2042c0b2e6a50cddf0039d4bb144cb50

NeilMcKLogic commented 5 years ago

Hi @kinlane , quite a flurry of mostly long posts today. Would it be possible to synthesize all of this into a newly revised document or draft specification for the many of us with continued interest? Or maybe these various subtopics need to each be broken out into their own discussion items?

kinlane commented 5 years ago

Yes, each area has its own OpenAPI, and individual issue, with all aggregated here - https://github.com/openreferral/api-specification/issues/84

Working on draft OpenAPI for HSDA v1.3 and HSDA Search v1.1, but had to put down after sprint.

switzersc commented 5 years ago

@kinlane can you point me to the issue or an example that has service_at_locations as a top level JSON object in the response? We're currently working on our search API and have found that service_at_location is most consistent with the data we have and semantics we're adapting from other systems. Right now we're planning to return a search response with service_at_locations collection as a top level object along with locations and services collections, which will contain the services and locations linked to from the service_at_locations. Does that make sense? Where's the best place to continue a conversation about that idea?

greggish commented 5 years ago

@switzersc check out #87?

kinlane commented 3 years ago

Ok. New search OpenAPI is ready for comment. We can start new thread - https://www.postman.com/api-evangelist/workspace/open-referral-human-services-data-api-hsda/documentation/35240-e8cdaa8c-5444-4722-a6fd-181f120d49f3