darnjo commented 2 years ago

The RESO Certification Subgroup has requested a new Web API Core specification be created to include certain features like support for OData Expand and Server-Driven paging.

Support for Expanded Data Elements

An authoritative list of top-level items will be created for Data Dictionary 1.7, and each subsequent version, starting with Property, Member, Office, Field, and Lookup Resources.
These items MAY be expanded into other resources, expanding Member into Property as ListAgent for example, but MUST be available at the top level.
Other items, such as Media or HistoryTransactional MAY be available at the top level but also MAY be available as expansions (or both).
Web API Core testing only requires one resource to be tested, though providers may test more than one.
Providers who support expand should specify at least one navigation property to test for a given resource.
Additional queries on expanded resources WILL NOT be tested at this time. Providers MAY support them as they see fit.
Assume the resource being tested is Property, with a navigation property of Media:
- A GET request will be made to /Property?$expand=Media and the Property payload will be checked for a collection-based property called Media, and each result in the collection will be validated against the OData EntityType advertised in the metadata for Media.
- Keys will be collected from the first request to Property, and a GET request will be made to /Property('XXXXX')/Media where XXXXX is the key.
As of Data Dictionary 1.7, standard names have been added for related RESO standard data elements. For example, here are those in the Property Resource. Providers MUST use standard names for standard expansions, when present, but MAY create their own local expansions outside of this as long as they are OData compliant.
Expanded properties MUST use the definitions in the RESO Data Dictionary, when applicable. These items have been added to the Data Dictionary Wiki (e.g. Media) and DD reference sheets, as well as generated in the reference metadata. Providers MAY add their own local expansions as well.
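
The two expand checks described above for Property and Media can be sketched roughly as follows. This is a simplified illustration, not the actual testing tool: the helper names are hypothetical, and "validation against the EntityType" is reduced here to checking field names against a set that a real test would derive from the service metadata.

```python
def expand_url(resource: str, navigation_property: str) -> str:
    """Build the collection request that expands one navigation property."""
    return f"/{resource}?$expand={navigation_property}"


def entity_navigation_url(resource: str, key: str, navigation_property: str) -> str:
    """Build the follow-up request for one record's navigation property."""
    return f"/{resource}('{key}')/{navigation_property}"


def check_expanded_payload(payload: dict, navigation_property: str,
                           allowed_fields: set) -> list:
    """Return a list of problems found in an expanded payload.

    allowed_fields is an assumption: a stand-in for the field names of the
    EntityType advertised in the metadata for the expanded resource.
    """
    errors = []
    for index, record in enumerate(payload.get("value", [])):
        expanded = record.get(navigation_property)
        if not isinstance(expanded, list):
            errors.append(f"record {index}: {navigation_property} is not a collection")
            continue
        for item in expanded:
            # Ignore OData control information such as @odata.id.
            fields = {name for name in item if "@" not in name}
            unknown = sorted(fields - allowed_fields)
            if unknown:
                errors.append(
                    f"record {index}: unknown {navigation_property} fields {unknown}")
    return errors
```

Keys collected from the first response would then be passed to entity_navigation_url to build the second request.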

Providers MUST Support Server-Driven Paging

Providers MUST support server-driven paging using @odata.nextLink.

This functionality is needed so that data consumers can reliably consume data when only a partial result is returned. It is classified as a MUST in the OData Minimal Conformance Requirements (item 3) for version 4.0 and above.

Basic tests will be added in the Web API Core 2.1.0 tests, but the majority of nextLink testing will be done in the Payloads 2.0 Specification, which could result in a failure of nextLink-based operations even if the provider passed Web API Core 2.1.0 testing.

A count will be made on the given resource being tested to ensure there are records.
A request will be made using $top=1. This should NOT contain @odata.nextLink, since one record should be available, and we’ve reached the end of the set.
Several pages (10+) of records will be fetched using @odata.nextLink without any additional $filter parameters to ensure that the data returned with each response doesn't match any of the previous responses.
Next links with $filter will also be tested. Since ModificationTimestamp is the only required field at this point, it will be used.
- A $filter request will be made for records with a timestamp greater than one year ago, using the timestamp field being tested in the Web API Core tests, and several pages (10+) will be fetched to ensure that each page has an @odata.nextLink, when applicable. If all records are fetched during this process, the last page will be checked to ensure there is no nextLink. This testing will be done using the greater than (gt) operator.
- The same test will be performed using the less than (lt) operator for records prior to the current timestamp.
- Each page of data fetched will be validated to ensure that the timestamp field value is greater or less than what was requested.
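
As a rough sketch of the paging checks listed above, the following follows @odata.nextLink for a bounded number of pages, looks for repeated records, and checks timestamps against the requested comparison. The fetch callable, the key field name, and the result shape are all assumptions made for illustration. Duplicate detection is done per record key, which is only one possible reading of "doesn't match any of the previous responses".

```python
from datetime import datetime
from typing import Callable


def follow_next_links(fetch: Callable[[str], dict], first_url: str,
                      key_field: str, max_pages: int = 10) -> dict:
    """Follow next links as opaque URLs and record any repeated keys."""
    seen_keys = set()
    duplicates = []
    pages = 0
    url = first_url
    while url is not None and pages < max_pages:
        page = fetch(url)
        pages += 1
        for record in page.get("value", []):
            key = record[key_field]
            if key in seen_keys:
                duplicates.append(key)
            seen_keys.add(key)
        # The final page MUST NOT carry a next link, so this becomes None.
        url = page.get("@odata.nextLink")
    return {"pages": pages, "records": len(seen_keys),
            "duplicates": duplicates, "exhausted": url is None}


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing Z for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def filter_violations(records: list, field: str, operator: str,
                      threshold: datetime) -> list:
    """Return records whose timestamp does not satisfy gt or lt."""
    compare = {"gt": lambda a, b: a > b, "lt": lambda a, b: a < b}[operator]
    return [r for r in records
            if not compare(parse_timestamp(r[field]), threshold)]
```

Passing fetch in as a parameter keeps the sketch independent of any HTTP library; a real harness would wrap an authenticated client there.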

String Comparison Operators for Single- and Multi-Valued Enumerations

With the current Web API Core tests, both single- and multi-valued enumerations are tested for those using OData Edm.EnumType enumerations and either Collection(Edm.EnumType) or Edm.EnumType with IsFlags=true.

String-based enumerations were added to Data Dictionary 1.7+ using the Lookup resource. There is currently no way to test this case in Web API Core 2.0.0.

The following tests will be added to support this case:

Providers will supply a LookupName to test for single- and multi-valued enumerations, as well as sample values for each case. For single-valued enumerations, there will be one sample value, and for multi-valued enumerations there will be two.
Data will be consumed from the Lookup Resource and validated. The provided LookupName will be checked to ensure it’s present in the Lookup Resource data, as will the lookup values.
For single-valued enumeration tests, assume that the LookupName provided is “StandardStatus” with a sample value of “Active” - the following requests will be made, and resulting data validated to ensure it matches the given queries:
- GET /Property?$filter=StandardStatus eq 'Active'
- GET /Property?$filter=StandardStatus ne 'Active'
For multi-valued enumeration tests, assume that the LookupName provided is “AccessibilityFeatures” and the values are “Accessible Entrance” and “Visitable” - the following requests will be made and resulting data validated to ensure it matches the given queries:
- GET /Property?$filter=AccessibilityFeatures/any(enum:enum eq 'Accessible Entrance' or enum eq 'Visitable')
- GET /Property?$filter=AccessibilityFeatures/all(enum:enum eq 'Accessible Entrance' or enum eq 'Visitable')
As of OData “4.01” the in operator was introduced to determine whether a given value is in a set of values. This is more convenient than writing Field1 eq 'value1' or Field1 eq 'value2' or.... If the response header indicates that the OData version is "4.01", then the in operator will be tested using a query similar to the following:
- GET /Property?$filter=StandardStatus in ('Active', 'Pending', 'Sold')
- TBD if we want to support in for multi-valued enumerations (using any and all)
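
For illustration, the string-enumeration queries above could be generated and checked along these lines. The function names are hypothetical. The sketch writes the logical operator as lowercase or, which is how the OData 4.0 grammar spells it, and doubles single quotes inside values, which is the OData escaping rule for string literals.

```python
def quote(value: str) -> str:
    """Render an OData string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def single_value_filters(field: str, value: str) -> list:
    """Build the eq and ne filters for a single-valued enumeration."""
    return [f"{field} eq {quote(value)}", f"{field} ne {quote(value)}"]


def multi_value_filter(field: str, lambda_operator: str, values: list) -> str:
    """Build an any/all lambda filter for a multi-valued enumeration."""
    clause = " or ".join(f"enum eq {quote(v)}" for v in values)
    return f"{field}/{lambda_operator}(enum:{clause})"


def in_filter(field: str, values: list) -> str:
    """Build an OData 4.01 in filter."""
    return f"{field} in ({', '.join(quote(v) for v in values)})"


def record_matches(record_values: list, lambda_operator: str, values: list) -> bool:
    """Check a returned record against the any/all query that produced it."""
    test = any if lambda_operator == "any" else all
    return test(v in values for v in record_values)
```

Note that all is vacuously true for an empty collection, so a record with no AccessibilityFeatures values would satisfy the all query in this sketch.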

EnFinlay commented 2 years ago

An authoritative list of top-level items will be created for Data Dictionary 1.7, and each subsequent version, starting with Property, Member, Office, and OpenHouse. Other items TBD.

Will this requirement be "at least one of these resources is required, and if any are present, they must be available at the top level"?

darnjo commented 2 years ago

Yes, that's correct. I'll make sure to clarify that.

We test this in Web API Core currently and at least one of Property, Member, Office, or Media is required at the top level. We'd be removing Media and perhaps adding some other top level items. Maybe OpenHouse, for example. We probably need a white list for DD 2.0/availability testing too.

gr33neggs commented 2 years ago

I know there is a pain point for @odata.nextLink involving $top

The issue revolves around the definition of $top in section 11.2.6.3, System Query Option $top: "The $top system query option specifies a non-negative integer n that limits the number of items returned from a collection. The service returns the number of available items up to but not greater than the specified value n."

Many expect that the @odata.nextLink should continue to appear and provide the links to obtain additional records after the number of records specified in the $top have been returned.

Example: a search includes $top=10 and the server's page size is 1000. Should a @odata.nextLink be included? I am under the impression that it should not. There are no other pages required in order to obtain the 10 records requested. It should not be assumed that the user wants to pull additional matching records. Even if a search includes $top=1000 and the server's page size is 1000, the @odata.nextLink should not be included because only a single page was needed to return the requested records.

Another example: a search includes $top=1500 and the server's page size is 1000. How would @odata.nextLink function? Assuming that there are 1500 records available that match the given query: the first page will include the first 1000 of the 1500, and the @odata.nextLink should include $skip=1000 and $top=500 to obtain the final 500 of the 1500 requested records.
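
To make the two examples concrete, here is a small model of that reading of $top, in which $top caps the total number of records delivered across pages. It is only a sketch of the interpretation described in this comment, not settled behavior. The function name is hypothetical, and the $skip/$top next link format is purely illustrative, since clients are expected to treat next links as opaque.

```python
from typing import Optional


def page_response(total_matching: int, top: int, page_size: int,
                  skip: int = 0) -> dict:
    """Model one response where $top limits the overall result size."""
    # Records the client is still entitled to, given $top and what exists.
    remaining_requested = min(top, max(total_matching - skip, 0))
    returned = min(remaining_requested, page_size)
    left = remaining_requested - returned
    next_link: Optional[str] = None
    if left > 0:
        # Only a partial result of the request was returned: link the rest.
        next_link = f"?$skip={skip + returned}&$top={left}"
    return {"returned": returned, "next_link": next_link}
```

With a page size of 1000, page_response(5000, 10, 1000) and page_response(5000, 1000, 1000) both yield no next link, while page_response(1500, 1500, 1000) returns 1000 records and a link for the remaining 500.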

darnjo commented 2 years ago

Good question regarding $top, something to look into further. As a side note, if $top is combined with $skip, client-driven paging is being used and no next link would be present?

From the OData specification:

Responses that include only a partial set of the items identified by the request URL MUST contain a link that allows retrieving the next partial set of items. This link is called a next link; its representation is format-specific. The final partial set of items MUST NOT contain a next link.

The client can request a maximum page size through the maxpagesize preference. The service may apply this requested page size or implement a page size different than, or in the absence of, this preference.

OData clients MUST treat the URL of the next link as opaque, and MUST NOT append system query options to the URL of a next link. Services may not allow a change of format on requests for subsequent pages using the next link. Clients therefore SHOULD request the same format on subsequent page requests using a compatible Accept header. OData services may use the reserved system query option $skiptoken when building next links. Its content is opaque, service-specific, and must only follow the rules for URL query parts.

OData clients MUST NOT use the system query option $skiptoken when constructing requests.

I believe that when $top is used, a next link would only be present if the total number of records exceeded the requested page size. So if there are 15 records and the client asks for a top of 10, then there would be a next link for the following page, but not once the next page is fetched.

If the client wanted to specify a requested page size with next link, it seems maxpagesize, as outlined above, might be more appropriate.

On another note, we removed string queries from the list of items above in the May 2022 Certification Subgroup meeting, but need a way to reconcile the situation and add them back in. cc: @psftc @SergioDelRioT4Bi @EnFinlay

Web API Core currently only tests single- and multi-valued enumerations based on Edm.EnumType, meaning that we can't certify anyone using the Lookup resource for enumerations until this has been addressed. This is the reason it was to be added to Core 2.1.0.

One problem I see with case insensitivity is that it would create a scenario where if there were more than one variation of something, say CA and ca, the client wouldn't be able to specifically filter for one or the other without consuming all the records and doing so on the client side. This essentially just offloads the work the server is supposed to do in that case on clients. Why make more work for them if the server is already supposed to behave correctly in this case?

In terms of what's in the payload, for any given standard enumeration, the case MUST match. If the server had both CA and ca in StateOrProvince, tests would fail. Being able to query using CA, cA, or ca makes things potentially more convenient in terms of queries, but only one value would ever be returned in this case, if it existed: CA.

Similar is true for lookup names, the server MUST have StateOrProvince, not StateorProvince or stateorprovince as a lookup name. Clients are supposed to use /Lookup?$filter=tolower(LookupName) eq 'stateorprovince' in this case if they want case insensitive searches. Perhaps that's all we test for if we decide to override the OData specification for some reason.

Depending on the server collation, the potentially more "challenging" string queries are contains, startswith, or endswith. However, OData dropped support for collations in OData 2.0 in favor of making 4.0+ case sensitive. See related Commander issue for more information. Clients are supposed to use tolower or toupper as appropriate if they want case insensitive behavior, for example: /Lookup?$filter=startswith(tolower(LookupName), 'state').
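
A client-side helper for the tolower approach might look something like the sketch below. The function names are made up for illustration; the only OData features used are the tolower and startswith functions already mentioned above.

```python
def case_insensitive_eq(field: str, value: str) -> str:
    """Build an equality filter that ignores case via tolower."""
    return f"tolower({field}) eq '{value.lower()}'"


def case_insensitive_startswith(field: str, prefix: str) -> str:
    """Build a startswith filter that ignores case via tolower."""
    return f"startswith(tolower({field}), '{prefix.lower()}')"


def case_sensitive_matches(values: list, wanted: str) -> list:
    """Model plain eq: only exact-case matches are returned."""
    return [v for v in values if v == wanted]
```

The last function shows why case-sensitive matching matters in the CA versus ca scenario: a client can still select exactly one variant, which a case-insensitive server would not allow.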

A question I have would be: what do the current server implementations of OData expect out of the box? For example the .Net OData server? I don't work with Microsoft technologies personally, and can't answer the question. But, we don't want to get into a scenario where we relax a restriction in the OData specification that breaks existing servers or clients. I believe the Olingo server is lax in this regard.

As an aside, we wouldn't change other open standards such as HTTP or OAuth2 because we didn't like something about them? What makes OData any different? It's not our standard to make changes to.

The appropriate path in this case would be that someone from RESO would surface the issue with the OASIS OData team and ask for a change proposal. Perhaps @SergioDelRioT4Bi can take that on.

gr33neggs commented 2 years ago

a next link would only be present if the total number of records exceeded the requested page size. So if there are 15 records and the client asks for a top of 10, then there would be a next link for the following page, but not once the next page is fetched.

If the client asks for a top of 10, then only 10 items should be returned. If 10 records can fit on a single page, then no next link should be provided. This example seems to be using top in place of the maxpagesize preference. If the page size is 10, and there are 15 records that match the query, then yes a next page link should be provided.

The service should never return greater than the specified top:

11.2.6.3 System Query Option $top

The service returns the number of available items up to but not greater than the specified value n.

If the service already returned n (10), why would it give a next link? It's not supposed to return any further records.

11.2.6.7 Server-Driven Paging

Responses that include only a partial set of the items identified by the request URL MUST contain a link that allows retrieving the next partial set of items.

We can't ignore the $top portion of the request URL. The request URL specified n (10) records to return, and the response contained n (10) records. Therefore it's not a partial response. If $top is to be ignored for a specific function, then the OData spec explicitly states so. For example, $count explicitly states:

The $count system query option ignores any $top, $skip, or $expand query options

clbaxter commented 2 years ago

reading through the notes, I'm not sure if this was covered...

- search set has > 10k records
- search requests a $top=1000
- max page size is set to 500

I have read 2 ways to handle this:
1) return the pages (500 records at a time) with next links until the 'top' count of total records are consumed (2 batches).
2) since the top count request is over the page size count, the top parameter is ignored and batches of the given default page size are handed out with nextlinks until the search is exhausted.
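
The difference between the two readings can be seen by counting pages and records for each. This is a toy model with hypothetical function names, and the total of 12,000 matching records in the usage note is an assumed figure standing in for "> 10k".

```python
import math


def honor_top(total: int, top: int, page_size: int) -> dict:
    """Reading 1: page until $top records have been delivered in total."""
    delivered = min(top, total)
    return {"pages": math.ceil(delivered / page_size), "records": delivered}


def ignore_top(total: int, top: int, page_size: int) -> dict:
    """Reading 2: a $top above the page size is ignored entirely."""
    if top <= page_size:
        return honor_top(total, top, page_size)
    return {"pages": math.ceil(total / page_size), "records": total}
```

With 12,000 matching records, $top=1000, and a page size of 500, the first reading delivers 1,000 records in 2 pages and the second delivers all 12,000 in 24 pages.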

darnjo commented 2 years ago

Thank you for commenting and for the suggestion.

There are some interesting facets to this:

In practice, we're finding that the "count" that providers return can be unreliable, meaning that we don't actually know how many records to expect from a given resource or what the "last" page might be beforehand. The advertised count might be 10,000 but we can only fetch 500 records, for example.
The "$top" query option doesn't require that the provider return $top=n records, but up to n (but not more). So, a client could request $top=100 but only get 10 records per page. If 200 records came back in this case, then that would be invalid.
The Web API testing tools are meant to test any resource, and in some cases we may only have a small number of records to work with. So, in some cases this might be hundreds of thousands of records, and the server might support page sizes of 1000, but in other cases it might be 30 records and the server only supports a page size of 10.

There are a couple of ways we could approach the tests here. We could decide not to trust the counts, page as far as we can, and then verify that on the "last" page, there was no next link. We will need to set a page size small enough that we wouldn't consume the entire record set in one page.

Another alternative is to choose a number of records and page size that every vendor should be able to support on every resource, and ensure that the next link isn't there in that case.

The rules above take the latter approach:

A request will be made using $top=1. This should NOT contain @odata.nextLink, since one record should be available, and we’ve reached the end of the set.

Requesting one record for any resource that is being tested should always work, and we should never have more than one page in this case. People could potentially "cheat" the test by coding their servers to never return next link with one record, so we may want to add additional coverage, for instance we can try and go as far as we can, and wherever the pull stops, ensure that the last page didn't have data.

Curious to hear any thoughts.

cobogeri commented 2 years ago

There is mention in the OASIS documents System Query Option $count I found helpful:

Clients should be aware that the count returned inline may not exactly equal the actual number of items returned, due to latency between calculating the count and enumerating the last value or due to inexact calculations on the service.

darnjo commented 2 years ago

Motion to Move Specification Forward Proposed by Eric Finlay. Seconded by Paul Hethmon.

Yes: 12, No: 0, Abstentions: 3

darnjo commented 2 years ago

Even though we didn't vote for it today, I think due to the discussion about 429 requirements in https://github.com/RESOStandards/transport/discussions/31 we could add some language to the Core 2.1.0 and Payloads 2.0 specs to require providers to use a 429 when rate limiting their consumers?

We don't necessarily have to test for it, and it wouldn't necessarily be straightforward to test anyway, but we could put it in the spec?

cc: @psftc @EnFinlay @SergioDelRioT4Bi

darnjo commented 1 year ago

TODO: from 9/15 Cert meeting:

Add a MUST to the spec for 429 response when the consumer is being rate limited. The provider MAY use Retry-After but it's not required.
Add a warning if the ResourceRecordKey of an expanded item doesn't match the primary key of the record it was expanded into. For example, if Media is expanded into Property, then the Media ResourceRecordKey field should match the ListingKey in Property.
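
Both TODO items could be sketched as below. The helper names, the default delay, and the warning text are assumptions. The Retry-After handling only covers the delay-in-seconds form of the header and falls back to the default for the HTTP-date form.

```python
from typing import Optional


def retry_delay_seconds(status: int, headers: dict,
                        default: float = 1.0) -> Optional[float]:
    """Return how long a consumer should wait, or None if not rate limited."""
    if status != 429:
        return None
    # Header names are case-insensitive in HTTP.
    normalized = {name.lower(): value for name, value in headers.items()}
    value = normalized.get("retry-after", "").strip()
    # Retry-After is optional for providers, so fall back to a default.
    return float(value) if value.isdigit() else default


def expansion_key_warnings(record: dict, parent_key_field: str,
                           navigation_property: str) -> list:
    """Warn when an expanded item's ResourceRecordKey differs from its parent key."""
    parent_key = record[parent_key_field]
    warnings = []
    for item in record.get(navigation_property) or []:
        if item.get("ResourceRecordKey") != parent_key:
            warnings.append(
                f"{navigation_property} item has ResourceRecordKey "
                f"{item.get('ResourceRecordKey')!r}, expected {parent_key!r}")
    return warnings
```

For the Media-in-Property example above, the call would be expansion_key_warnings(property_record, "ListingKey", "Media").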

clbaxter commented 1 year ago

To the first point ("I know there is a pain point for @odata.nextLink involving $top"): any $top that is under the page size should return at most that number of records and NOT provide a next link. This adheres to the basic $top definition and assumes end of record delivery.

To the second point, we have looked at this extensively as well. When a $top is indicated that is greater than the server batch size, we have seen many OData usage comments all leaning toward $top being ignored in this case.

darnjo commented 1 year ago

Thanks. The testing rules above were approved by the Workgroups and we're in the process of creating testing tools.

gr33neggs commented 1 year ago

Since ModificationTimestamp is the only required field at this point, it will be used.

I have been digging through specifications, but I only found ModificationTimestamp listed as a MUST for Lookup resource testing in DD 1.7 and 2.0

EDIT: Never mind, the Payloads Testing Specification has it:

Required Fields: All RESO Data Dictionary resources sampled MUST have a Key and ModificationTimestamp or the tests will fail.

darnjo commented 1 year ago

The key and timestamp requirements are part of the Payloads spec - RCP-038 - at the moment as part of Data Dictionary testing.

OData also requires that every resource has a key. So not having the key field on all resources will fail the metadata validation tests anywhere they're done, which includes Web API and Data Dictionary + Payloads.
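
A minimal version of that required-fields check, assuming the caller knows the key field for the resource being sampled (ListingKey for Property, for instance), might be:

```python
def missing_required_fields(records: list, key_field: str) -> list:
    """Return (index, missing field names) for records lacking a key or timestamp."""
    required = (key_field, "ModificationTimestamp")
    problems = []
    for index, record in enumerate(records):
        # Treat absent, null, and empty values as missing.
        missing = [name for name in required if record.get(name) in (None, "")]
        if missing:
            problems.append((index, missing))
    return problems
```

This only inspects payload records; the metadata-level requirement that every resource declares a key would be checked separately.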

darnjo commented 1 year ago

@gr33neggs - we've been considering merging DD and Payloads but they're two separate proposals and stages at the moment.

When DD 1.7 was first released, Payloads wasn't required. We added it for DD testing in August of 2022, so they could potentially be combined.

That said, there may be reasons to keep them separate. We may want to validate Payloads beyond DD testing and RESO Common Format is more of a payloads test than DD.

RESOStandards / transport

RCP-039 - Web API Core 2.1.0 Specification #22
