RESOStandards / transport

RESO Transport Workgroup - Specifications and Change Proposals
https://transport.reso.org

RCP-039 - Web API Core 2.1.0 Specification #22

Closed darnjo closed 2 months ago

darnjo commented 2 years ago

The RESO Certification Subgroup has requested a new Web API Core specification be created to include certain features like support for OData Expand and Server-Driven paging.


Support for Expanded Data Elements


Providers MUST Support Server-Driven Paging

Providers MUST support server-driven paging using @odata.nextLink.

This functionality is needed so that data consumers can reliably consume data when only a partial result is returned. It is classified as a MUST in the OData Minimal Conformance Requirements (item 3) for version 4.0 and above.

Basic tests will be added in the Web API Core 2.1.0 tests, but the majority of nextLink testing will be done in the Payloads 2.0 Specification, which could result in a failure of nextLink-based operations even if the provider passed Web API Core 2.1.0 testing.


String Comparison Operators for Single- and Multi-Valued Enumerations

With the current Web API Core tests, both single- and multi-valued enumerations are tested for those using OData Edm.EnumType enumerations and either Collection(Edm.EnumType) or Edm.EnumType with IsFlags=true.

String-based enumerations were added to Data Dictionary 1.7+ using the Lookup resource. There is currently no way to test this case in Web API Core 2.0.0.

The following tests will be added to support this case:

EnFinlay commented 2 years ago

An authoritative list of top-level items will be created for Data Dictionary 1.7, and each subsequent version, starting with Property, Member, Office, and OpenHouse. Other items TBD.

Will this requirement be "at least one of these resources is required, and if any are present, they must be available at the top level"?

darnjo commented 2 years ago

Yes, that's correct. I'll make sure to clarify that.

We test this in Web API Core currently and at least one of Property, Member, Office, or Media is required at the top level. We'd be removing Media and perhaps adding some other top level items. Maybe OpenHouse, for example. We probably need a white list for DD 2.0/availability testing too.

gr33neggs commented 2 years ago

I know there is a pain point for @odata.nextLink involving $top

The issue revolves around the definition of $top in section 11.2.6.3, System Query Option $top: "The $top system query option specifies a non-negative integer n that limits the number of items returned from a collection. The service returns the number of available items up to but not greater than the specified value n."

Many expect that the @odata.nextLink should continue to appear and provide the links to obtain additional records after the number of records specified in the $top have been returned.

Example: a search includes $top=10 and the server's page size is 1000. Should a @odata.nextLink be included? I am under the impression that it should not. There are no other pages required in order to obtain the 10 records requested. It should not be assumed that the user wants to pull additional matching records. Even if a search includes $top=1000 and the server's page size is 1000, the @odata.nextLink should not be included because only a single page was needed to return the requested records.

Another example: a search includes $top=1500 and the server's page size is 1000. How would @odata.nextLink function? Assuming that there are 1500 records available that match the given query: the first page will include the first 1000 of the 1500, and the @odata.nextLink should include $skip=1000 and $top=500 to obtain the final 500 of the 1500 requested records.
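The examples above can be sketched as a small Python function. This models only the reading argued in this comment (that $top caps the total number of records across all pages), not settled behavior. The function name and parameters are hypothetical, and a real service might encode the same state in an opaque $skiptoken instead of $skip and $top.

```python
from typing import Dict, Optional


def next_link_options(top: int, page_size: int, matching: int, skip: int = 0) -> Optional[Dict[str, int]]:
    """Return the $skip/$top a next link would carry, or None if no next link is needed.

    Models the interpretation where $top caps the total number of records returned.
    """
    # Records the client is entitled to: limited by $top and by what exists.
    requested = min(top, max(matching - skip, 0))
    # Records that fit on the current page.
    returned = min(requested, page_size)
    remaining = requested - returned
    if remaining <= 0:
        return None
    return {"$skip": skip + returned, "$top": remaining}
```

Under this reading, `next_link_options(10, 1000, 5000)` and `next_link_options(1000, 1000, 5000)` both yield no next link, while `next_link_options(1500, 1000, 1500)` yields a next link carrying $skip=1000 and $top=500.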

darnjo commented 2 years ago

Good question regarding $top, something to look into further. As a side note, if $top is combined with $skip, client-driven paging is being used and no next link would be present?

From the OData specification:

Responses that include only a partial set of the items identified by the request URL MUST contain a link that allows retrieving the next partial set of items. This link is called a next link; its representation is format-specific. The final partial set of items MUST NOT contain a next link.

The client can request a maximum page size through the maxpagesize preference. The service may apply this requested page size or implement a page size different than, or in the absence of, this preference.

OData clients MUST treat the URL of the next link as opaque, and MUST NOT append system query options to the URL of a next link. Services may not allow a change of format on requests for subsequent pages using the next link. Clients therefore SHOULD request the same format on subsequent page requests using a compatible Accept header. OData services may use the reserved system query option $skiptoken when building next links. Its content is opaque, service-specific, and must only follow the rules for URL query parts.

OData clients MUST NOT use the system query option $skiptoken when constructing requests.

I believe that when $top is used, a next link would only be present if the total number of records exceeded the requested page size. So if there are 15 records and the client asks for a top of 10, then there would be a next link for the following page, but not once the next page is fetched.

If the client wanted to specify a requested page size with next link, it seems maxpagesize, as outlined above, might be more appropriate.


On another note, we removed string queries from the list of items above in the May 2022 Certification Subgroup meeting, but need a way to reconcile the situation and add them back in. cc: @psftc @SergioDelRioT4Bi @EnFinlay

Web API Core currently tests single- and multi-enumerations, meaning that we can't currently certify anyone using the Lookup resource for enumerations until this has been addressed. This is the reason it was to be added to Core 2.1.0.

One problem I see with case insensitivity is that if there were more than one variation of a value, say CA and ca, the client couldn't filter for one or the other specifically without consuming all the records and filtering on the client side. This essentially just offloads onto clients the work the server is supposed to do. Why make more work for them if the server is already supposed to behave correctly in this case?

In terms of what's in the payload, for any given standard enumeration, the case MUST match. If the server had both CA and ca in StateOrProvince, tests would fail. Being able to query using CA, cA, or ca makes things potentially more convenient in terms of queries, but only one value would ever be returned in this case, if it existed: CA.

The same is true for lookup names: the server MUST have StateOrProvince, not StateorProvince or stateorprovince, as a lookup name. Clients are supposed to use /Lookup?$filter=tolower(LookupName) eq 'stateorprovince' in this case if they want case-insensitive searches. Perhaps that's all we test for if we decide to override the OData specification for some reason.

Depending on the server collation, the potentially more "challenging" string queries are contains, startswith, or endswith. However, OData dropped support for collations in OData 2.0 in favor of making 4.0+ case sensitive. See the related Commander issue for more information. Clients are supposed to use tolower or toupper as appropriate if they want case-insensitive behavior, for example: /Lookup?$filter=startswith(tolower(LookupName), 'state').
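For anyone building these queries in code, here is a minimal Python sketch that percent-encodes the two tolower filters mentioned above. The helper name and the example.test base URL are hypothetical; only the filter expressions come from this thread.

```python
from urllib.parse import quote


def lookup_filter_url(base_url: str, filter_expr: str) -> str:
    """Build a /Lookup request URL with a percent-encoded $filter expression."""
    # Keep OData punctuation readable; spaces still become %20.
    encoded = quote(filter_expr, safe="(),'")
    return f"{base_url}/Lookup?$filter={encoded}"


# Case-insensitive exact match on the lookup name.
exact = lookup_filter_url("https://example.test", "tolower(LookupName) eq 'stateorprovince'")

# Case-insensitive prefix match on the lookup name.
prefix = lookup_filter_url("https://example.test", "startswith(tolower(LookupName), 'state')")
```

The literal on the right-hand side has to be lowercased by the client as well, since the comparison itself stays case sensitive.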

A question I have would be: what do the current server implementations of OData expect out of the box? For example the .Net OData server? I don't work with Microsoft technologies personally, and can't answer the question. But, we don't want to get into a scenario where we relax a restriction in the OData specification that breaks existing servers or clients. I believe the Olingo server is lax in this regard.

As an aside, we wouldn't change other open standards such as HTTP or OAuth2 because we didn't like something about them? What makes OData any different? It's not our standard to make changes to.

The appropriate path in this case would be that someone from RESO would surface the issue with the OASIS OData team and ask for a change proposal. Perhaps @SergioDelRioT4Bi can take that on.

gr33neggs commented 2 years ago

a next link would only be present if the total number of records exceeded the requested page size. So if there are 15 records and the client asks for a top of 10, then there would be a next link for the following page, but not once the next page is fetched.

If the client asks for a top of 10, then only 10 items should be returned. If 10 records can fit on a single page, then no next link should be provided. This example seems to be using top in place of the maxpagesize preference. If the page size is 10, and there are 15 records that match the query, then yes a next page link should be provided.

The service should never return greater than the specified top:

11.2.6.3 System Query Option $top

The service returns the number of available items up to but not greater than the specified value n.

If the service already returned n (10), why would it give a next link? It's not supposed to return any further records.

11.2.6.7 Server-Driven Paging

Responses that include only a partial set of the items identified by the request URL MUST contain a link that allows retrieving the next partial set of items.

We can't ignore the $top portion of the request URL. The request URL specified n (10) records to return, and the response contained n (10) records. Therefore it's not a partial response. If $top is to be ignored for a specific function, then the OData spec explicitly states so. For example, $count explicitly states:

The $count system query option ignores any $top, $skip, or $expand query options

clbaxter commented 2 years ago

reading through the notes, I'm not sure if this was covered...

darnjo commented 2 years ago

Thank you for commenting and for the suggestion.

There are some interesting facets to this:

There are a couple of ways we could approach the tests here. We could decide not to trust the counts, page as far as we can, and then verify that on the "last" page, there was no next link. We will need to set a page size small enough that we wouldn't consume the entire record set in one page.

Another alternative is to choose a number of records and page size that every vendor should be able to support on every resource, and ensure that the next link isn't there in that case.

The rules above take the latter approach:

A request will be made using $top=1. This should NOT contain @odata.nextLink, since one record should be available, and we’ve reached the end of the set.

Requesting one record for any resource that is being tested should always work, and we should never have more than one page in this case. People could potentially "cheat" the test by coding their servers to never return a next link with one record, so we may want to add additional coverage. For instance, we could page as far as we can and, wherever the pull stops, ensure that the last page didn't have data.
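To make the two approaches concrete, here is a rough Python sketch of both checks: the $top=1 rule, and paging to the end to confirm the last page carries no next link. This is not the RESO testing tool. The function names, the `fetch_json` callable, and the `max_pages` guard are hypothetical.

```python
from typing import Callable, Dict, List, Optional


def top_one_has_no_next_link(response: Dict) -> bool:
    """A $top=1 response should hold at most one record and no next link."""
    return len(response.get("value", [])) <= 1 and "@odata.nextLink" not in response


def page_to_end(first_url: str, fetch_json: Callable[[str], Dict], max_pages: int = 100) -> List[Dict]:
    """Follow next links and return the pages seen, stopping at the first page without one."""
    pages: List[Dict] = []
    url: Optional[str] = first_url
    # max_pages keeps a misbehaving server from looping the test forever.
    while url is not None and len(pages) < max_pages:
        page = fetch_json(url)
        pages.append(page)
        url = page.get("@odata.nextLink")
    return pages


def last_page_has_no_next_link(pages: List[Dict]) -> bool:
    """True when paging ended on a page without a next link."""
    return bool(pages) and "@odata.nextLink" not in pages[-1]
```

If `max_pages` is reached first, the last page collected still has a next link and the second check fails, which is the conservative outcome for a test.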

Curious to hear any thoughts.

cobogeri commented 2 years ago

There is a mention in the OASIS documentation for System Query Option $count that I found helpful:

Clients should be aware that the count returned inline may not exactly equal the actual number of items returned, due to latency between calculating the count and enumerating the last value or due to inexact calculations on the service.

darnjo commented 2 years ago

Motion to Move Specification Forward Proposed by Eric Finlay. Seconded by Paul Hethmon.

Yes: 12, No: 0, Abstentions: 3

darnjo commented 2 years ago

Even though we didn't vote for it today, I think due to the discussion about 429 requirements in https://github.com/RESOStandards/transport/discussions/31 we could add some language to the Core 2.1.0 and Payloads 2.0 specs to require providers to use a 429 when rate limiting their consumers?

We don't necessarily have to test for it, and it wouldn't necessarily be straightforward to test anyway, but we could put it in the spec?
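For context on what that would mean for consumers, here is a minimal Python sketch of how a client might react to a 429. The Retry-After handling is an assumption on my part and is not something proposed in this thread; the function name and the default delay are hypothetical.

```python
from typing import Mapping, Optional


def rate_limit_delay(status: int, headers: Mapping[str, str], default: float = 1.0) -> Optional[float]:
    """Return seconds to wait before retrying, or None if the response is not a 429."""
    if status != 429:
        return None
    # Assumes header names are already normalized to this casing.
    retry_after = headers.get("Retry-After", "").strip()
    # Retry-After may also be an HTTP-date; this sketch only handles delta-seconds.
    if retry_after.isdigit():
        return float(retry_after)
    return default
```

A consumer would sleep for the returned number of seconds and then repeat the same request.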

cc: @psftc @EnFinlay @SergioDelRioT4Bi

darnjo commented 1 year ago

TODO: from 9/15 Cert meeting:

clbaxter commented 1 year ago

To the first point ("I know there is a pain point for @odata.nextLink involving $top"): any $top that is under the page size should return at most that number of records and NOT provide a next link. This adheres to the basic definition of $top and assumes the end of record delivery. To the second point, we have looked at this extensively as well. When a $top is greater than the server batch size, we have seen many comments on OData usage, all leaning toward $top being ignored in this case.

darnjo commented 1 year ago

Thanks. The testing rules above were approved by the Workgroups and we're in the process of creating testing tools.

gr33neggs commented 1 year ago

Since ModificationTimestamp is the only required field at this point, it will be used.

I have been digging through specifications, but I only found ModificationTimestamp listed as a MUST for Lookup resource testing in DD 1.7 and 2.0

EDIT: Never mind, the Payloads Testing Specification has it:

Required Fields: All RESO Data Dictionary resources sampled MUST have a Key and ModificationTimestamp or the tests will fail.
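A minimal Python sketch of that required-fields rule, for illustration only. The function name is hypothetical, and the key field name is passed in because it differs per resource (ListingKey for Property is used here as an example).

```python
from typing import Dict, List, Tuple


def missing_required_fields(records: List[Dict], key_field: str) -> List[Tuple[int, str]]:
    """Return (record index, field name) pairs where the key or ModificationTimestamp is absent or empty."""
    required = (key_field, "ModificationTimestamp")
    return [
        (index, field)
        for index, record in enumerate(records)
        for field in required
        if record.get(field) in (None, "")
    ]


# Hypothetical sample: the second record has no ModificationTimestamp.
sample = [
    {"ListingKey": "a1", "ModificationTimestamp": "2022-08-01T00:00:00Z"},
    {"ListingKey": "b2"},
]
problems = missing_required_fields(sample, "ListingKey")
```

An empty result means every sampled record has both fields.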

darnjo commented 1 year ago

The key and timestamp requirements are part of the Payloads spec - RCP-038 - at the moment as part of Data Dictionary testing.

OData also requires that every resource has a key. So not having the key field on all resources will fail the metadata validation tests anywhere they're done, which includes Web API and Data Dictionary + Payloads.

darnjo commented 1 year ago

@gr33neggs - we've been considering merging DD and Payloads but they're two separate proposals and stages at the moment.

When DD 1.7 was first released, Payloads wasn't required. We added it for DD testing in August of 2022, so they could potentially be combined.

That said, there may be reasons to keep them separate. We may want to validate Payloads beyond DD testing and RESO Common Format is more of a payloads test than DD.