openreferral / api-specification

This is the working repository for Open Referral's Human Services Data API protocols.
https://openreferral.readthedocs.io/en/latest/hsda/

Search parameters - type ambiguity, incomplete parameter list and other concerns #62

Closed NeilMcKLogic closed 3 years ago

NeilMcKLogic commented 6 years ago

Hey @kinlane, this is related to but different from https://github.com/openreferral/api-specification/issues/46

I have some major concerns with the current definition of the Search parameters found here: https://openreferral.github.io/api-specification/definition/yaml/ and here: https://openreferral.github.io/api-specification/definition/

Some are being addressed in other topics, like including more record types than just Services and being able to control which subrecords and fields are returned.

Here are my other issues:

  1. The Response should be an array of objects (not a single object) and also needs to include a Total Count of found results.
  2. The search parameters are currently typed as "string". This is a recipe for frustration and inoperability. Can we please strengthen the typing on at least these record types, and can you borrow from the HSDS spec, which has more strongly typed some of these: email, lat_lng, radius (decimal plus unit type), service_area (HSDS desperately needs some standard for this too).
  3. It is not clear whether the parameters should be ANDed or ORed together in searching. AND is probably most logical, but the spec ought to either assert which method to use or provide a mechanism for the search to choose.
  4. Need the ability to "search all fields", Google-style. In other words, the existing parameters look just like filters. What is one actually searching for? There should be a parameter such as "search_term" that would be equivalent to what one would type into the single box on the Google home page.
  5. One should be able to express that the search from #4 above is looking just in specified record types, and be able to express multiple such types.
  6. No way to differentiate whether you are looking for results that are physically within a specified location (e.g. "Dade County") or that serve the specified location.
  7. No way to express how you want the results sorted (alphabetical, proximity)
  8. Adding to item 2 above, rather than just having a "location" parameter that is a string, why not allow a proper HSDS Address object to be passed in, which would be a lot more precise? A lot of thought went into that object definition by the people on the standards team.
  9. Adding to 4 and 5 above, one should be able to confine the search to just the names of designated record types (Organization, Service, Location, Contact come to mind), searching in one or more of these record types but only in their "name" fields, which is very common.
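
To make the list above concrete, here is a rough sketch of what a more strongly typed search request could look like. Everything in it is hypothetical: the names `SearchRequest`, `MatchMode`, `RecordType`, `Radius`, `names_only`, and `sort_by` are illustrations, not part of HSDA or HSDS.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class MatchMode(Enum):
    """How multiple filters combine (item 3). Hypothetical names."""
    AND = "and"
    OR = "or"


class RecordType(Enum):
    """Record types a search may target (items 5 and 9)."""
    ORGANIZATION = "organization"
    SERVICE = "service"
    LOCATION = "location"
    CONTACT = "contact"


@dataclass
class Radius:
    """A decimal distance plus a unit, instead of a bare string (item 2)."""
    value: float
    unit: str  # assumed vocabulary: "km" or "mi"

    def __post_init__(self) -> None:
        if self.value <= 0:
            raise ValueError("radius must be positive")
        if self.unit not in ("km", "mi"):
            raise ValueError("unit must be 'km' or 'mi'")


@dataclass
class SearchRequest:
    """Illustrative request shape; none of these fields are defined by HSDA."""
    search_term: Optional[str] = None            # item 4: the single search box
    record_types: List[RecordType] = field(
        default_factory=lambda: [RecordType.SERVICE]
    )                                            # item 5: one or more types
    names_only: bool = False                     # item 9: restrict to "name" fields
    match_mode: MatchMode = MatchMode.AND        # item 3: explicit AND/OR
    radius: Optional[Radius] = None              # item 2: typed radius
    sort_by: str = "name"                        # item 7: "name" or "proximity"

    def __post_init__(self) -> None:
        if self.sort_by not in ("name", "proximity"):
            raise ValueError("sort_by must be 'name' or 'proximity'")
```

A caller would then write something like `SearchRequest(search_term="food pantry", record_types=[RecordType.ORGANIZATION, RecordType.SERVICE], radius=Radius(5.0, "mi"))`, and a malformed radius or sort key fails before any search runs.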

Thoughts?

kinlane commented 6 years ago

I feel good about everything here. I'm going to propose that search gets broken off into its own specification. My belief is that core resources (organizations, locations, services) should be HSDA, then we have HSDA search. As I've expressed before, I think search should be its own first-class resource, but one that spans all the core resources. I will be breaking this list down and seeing what I can work into a v1.0 of the HSDA Search specification. I'd like to introduce HAL or JSON API into the search specification, which would allow me to not break HSDA support for HSDS plain (CSV, XML, JSON), and give a robust envelope to support and return everything you list.
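
As an illustration of the envelope idea, here is a minimal sketch of a loosely HAL-shaped search response that carries an array of results plus a total count. `_links` and `_embedded` are HAL's reserved properties; the function name `build_envelope`, the `total_count` key, the `/search` path, and the `page`/`per_page` parameters are assumptions made for this example, not anything defined by HSDA.

```python
from typing import Any, Dict, List
from urllib.parse import urlencode


def build_envelope(
    results: List[Dict[str, Any]],
    total_count: int,
    page: int,
    per_page: int,
    base_url: str = "/search",  # hypothetical endpoint
) -> Dict[str, Any]:
    """Wrap one page of results in a HAL-style envelope with a total count."""

    def link(page_number: int) -> Dict[str, str]:
        query = urlencode({"page": page_number, "per_page": per_page})
        return {"href": f"{base_url}?{query}"}

    links = {"self": link(page)}
    if page * per_page < total_count:
        links["next"] = link(page + 1)  # more results remain after this page
    if page > 1:
        links["prev"] = link(page - 1)

    return {
        "_links": links,
        "total_count": total_count,          # total matches, not just this page
        "_embedded": {"results": results},   # always an array, even when empty
    }
```

The plain HSDS records sit unchanged inside `_embedded`, so a client that only wants the records can ignore the rest of the envelope.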

NeilMcKLogic commented 6 years ago

Sounds good, search is a big topic.

rasmus-storjohann-PG commented 6 years ago

@NeilMcKechnie could you expand on your point number 8? How would we pass in an Address to the API? What about the spec of the Address entity makes it particularly suitable for this use?

NeilMcKLogic commented 6 years ago

Hi @rasmus-storjohann-PG - we could pass in a well-structured Address object as, say, JSON in the request body. This object's structure, with different fields broken out, is superior to just a free-text "location" string because it can more easily be parsed and therefore acted upon programmatically.

As a silly example, imagine the difference between getting these two as search parameters, if you're trying to find all resource records that serve a location:

"location" : "123 West Second Street Apartment 4 Baton Rouge LA 10101"

or

"address1" : "123 West Second Street"
"address2" : "Apartment 4"
"city" : "Baton Rouge"
"county" : "Independence"
"stateprovince" : "LA"
"postalcode" : "10101"

rasmus-storjohann-PG commented 6 years ago

So if I understand correctly, if I wanted to find services either at a given address or close to the address, I could pass in the full address as you suggest? I can see how doing so in a JSON body might be preferable to passing it in the URL, but it seems like it might be fragile either way, since I think differences in white space, punctuation and abbreviations could easily throw the search off. I would have thought that passing in the location_id would be the way to go.
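
To illustrate both sides of this, here is a small sketch of field-by-field matching on a structured address. It uses the field names from the example earlier in the thread (`address1`, `city`, `stateprovince`, and so on), which may not match the actual HSDS column names; `normalize` and `address_matches` are hypothetical helpers. Normalization absorbs differences in case, white space and punctuation, but not abbreviations.

```python
import re
from typing import Dict, Optional

# Field names taken from the example in this thread, not from the HSDS schema.
ADDRESS_FIELDS = (
    "address1", "address2", "city", "county", "stateprovince", "postalcode",
)


def normalize(value: Optional[str]) -> str:
    """Lowercase, drop punctuation, and collapse runs of white space."""
    text = re.sub(r"[^\w\s]", "", value or "")
    return " ".join(text.lower().split())


def address_matches(record: Dict[str, str], query: Dict[str, str]) -> bool:
    """True when every field present in the query equals the record's field."""
    for name in ADDRESS_FIELDS:
        wanted = query.get(name)
        if wanted is None:
            continue  # fields left out of the query do not constrain the match
        if normalize(record.get(name)) != normalize(wanted):
            return False
    return True
```

With this, a query for `{"city": " baton  rouge ", "stateprovince": "la."}` matches a record stored as "Baton Rouge" / "LA", while `{"address1": "123 W. 2nd St."}` still fails against "123 West Second Street". That is the abbreviation problem raised above, and it is why coarse fields (city, county, postal code) or an identifier are safer match keys than the street line.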

This has also got me thinking that the latitude/longitude attributes might rightly belong on the address entities, rather than on the location, where they currently are. For instance, if a service provider moved from one address to another, both their address and lat/long values would change, so if the lat/long was on the address, this would naturally happen in one change.
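
Wherever the coordinates end up living, the "close to the address" case reduces to a distance filter over latitude/longitude pairs. Here is a minimal sketch using the haversine great-circle formula; the function names and the record shape (dicts with `latitude` and `longitude` keys) are assumptions for illustration, and a real implementation would more likely push this into a spatial index or the database.

```python
import math
from typing import Any, Dict, List, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(
    origin: Tuple[float, float],
    locations: List[Dict[str, Any]],
    radius_km: float,
) -> List[Dict[str, Any]]:
    """Return locations within radius_km of origin, nearest first."""
    hits = []
    for loc in locations:
        distance = haversine_km(
            origin[0], origin[1], loc["latitude"], loc["longitude"]
        )
        if distance <= radius_km:
            hits.append((distance, loc))
    hits.sort(key=lambda pair: pair[0])  # sort by distance only
    return [loc for _, loc in hits]
```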

rasmus-storjohann-PG commented 6 years ago

@NeilMcKechnie in planning the proximity search, we came up with this. It ended up quite different from what I thought we'd get, and I'd be very interested in your thoughts on it.

kinlane commented 5 years ago

Considering @NeilMcKechnie's original list again:

- The Response should be an array of objects (not a single object) and also needs to include a Total Count of found results.

Hard to know exactly which response is being referenced here -- sadly the links above have been moved.

- The search parameters are currently typed as "string". This is a recipe for frustration and inoperability. Can we please strengthen the typing on at least these record types, and can you borrow from the HSDS spec, which has more strongly typed some of these: email, lat_lng, radius (decimal plus unit type), service_area (HSDS desperately needs some standard for this too).

Yes, always the goal of refining versions. I think we can bring many relevant fields out and begin getting more precise with the type -- OpenAPI allows for type: string, and the format: to further refine. Let's talk general query parameters for HSDA and HSDA Search as a separate thread. Let's work to bring it out in the search interface and make it as precise as possible.
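
As a sketch of what "type plus format" could buy us, the snippet below writes a few query parameters as OpenAPI-style schema fragments (expressed as Python dicts) and checks raw query-string values against them. The parameter names, the `minimum`/`maximum` bounds, and the `coerce` helper are assumptions for illustration; `format: email` comes from JSON Schema rather than OpenAPI's own short list of formats, and the "@" check is a deliberately crude stand-in for real email validation.

```python
from typing import Any, Dict, Union

# Hypothetical parameter definitions in the shape of OpenAPI 3 parameter objects.
PARAMETERS: Dict[str, Dict[str, Any]] = {
    "email": {"in": "query", "schema": {"type": "string", "format": "email"}},
    "latitude": {
        "in": "query",
        "schema": {"type": "number", "format": "double", "minimum": -90, "maximum": 90},
    },
    "longitude": {
        "in": "query",
        "schema": {"type": "number", "format": "double", "minimum": -180, "maximum": 180},
    },
    "radius": {
        "in": "query",
        "schema": {"type": "number", "format": "double", "minimum": 0},
    },
    "radius_unit": {
        "in": "query",
        "schema": {"type": "string", "enum": ["km", "mi"]},
    },
}


def coerce(name: str, raw: str) -> Union[str, float]:
    """Convert a raw query-string value per its schema, or raise ValueError."""
    schema = PARAMETERS[name]["schema"]
    if schema["type"] == "number":
        value = float(raw)  # raises ValueError for non-numeric input
        if "minimum" in schema and value < schema["minimum"]:
            raise ValueError(f"{name} is below {schema['minimum']}")
        if "maximum" in schema and value > schema["maximum"]:
            raise ValueError(f"{name} is above {schema['maximum']}")
        return value
    if "enum" in schema and raw not in schema["enum"]:
        raise ValueError(f"{name} must be one of {schema['enum']}")
    if schema.get("format") == "email" and "@" not in raw:
        raise ValueError(f"{name} does not look like an email address")
    return raw
```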

Also I feel that the Secure Search POST > Body helps provide one possible path to consider for search, beyond query parameters, and paths made available.

- It is not clear if the parameters should be ANDed or ORed together in searching. AND is probably most logical but the spec ought to assert either which method to use or give a mechanism for the search to choose.

This is where we get into GraphQL and other query language territory -- more research warranted. We don't want to reinvent the wheel, and we need to draw the line between a simple search and a professional query language layer for serious consumers. I'm adding this as an added feature, but right now the queries parameter allows multiple fields to be searched, and only with AND.

- Need the ability to "search all fields", Google-style. In other words, the existing parameters look just like filters. What is one actually searching for? there should be a parameter for "search_term" that would be equivalent to what one would type into the single box on the Google home page.

I believe the queries parameter for /organizations, /locations, and /services added in HSDA v1.2 allows for designating comma delimited search key=value terms.
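
For illustration only, here is how a comma-delimited list of key=value terms could be parsed and applied with AND semantics. The exact syntax of the HSDA v1.2 queries parameter is not confirmed in this thread, so the format handled here (`name=food,description=pantry`), the case-insensitive substring matching, and the helper names are all assumptions.

```python
from typing import Any, Dict, List, Tuple


def parse_queries(raw: str) -> List[Tuple[str, str]]:
    """Split 'key=value,key=value' into (key, value) pairs."""
    terms = []
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray or trailing commas
        key, sep, value = part.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected key=value, got {part!r}")
        terms.append((key.strip(), value.strip()))
    return terms


def matches_all(record: Dict[str, Any], terms: List[Tuple[str, str]]) -> bool:
    """AND semantics: every term must appear in its field (case-insensitive)."""
    return all(
        value.lower() in str(record.get(key, "")).lower()
        for key, value in terms
    )
```

One limitation of this simple format is that a value containing a comma cannot be expressed without some escaping rule, which is one reason a POST body may be a better fit for richer searches.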

- One should be able to express that the search from #4 above is looking just in specified record types, and be able to express multiple such types.

Feeling like this one spans several areas here. We can address by bringing out other specific fields from across organizations, locations, services, and sub-resources, while also better defining their data types. Feel like this can be handled by targeting specific fields, and better defining the data structure within OpenAPI definition -- correct me if I'm missing anything.

- No way to differentiate whether you are looking for results that are physically within a specified location (e.g. "Dade County") or that serve the specified location.

We can address this separately, as part of a suggested HSDA Geo layer.

- No way to express how you want the results sorted (alphabetical, proximity)

I would put this under Sorting #12, not search (but obviously applies here, and to geo)

- Adding to item 2 above, rather than just having a "location" parameter that is a string, why not allow a proper HSDS Address object to be passed in, which would be a lot more precise? A lot of thought went into that object definition by the people on the standards team.

Move this to an HSDA Geo layer -- GeoJSON? More research is needed here, as a separate project.

- Adding to 4 and 5 above, one should be able to confine the search to just the names of designated record types (Organization, Service, Location, Contact come to mind, searching in one or more of these record types but just their "name" fields, which is very common).

We have this as basic query search for /organizations, /locations, /services.

kinlane commented 3 years ago

I am proposing we adopt an approach similar to GitHub's -- the current model addresses most of what is here -- and we can start the conversation anew.