oData v4, server side paging not properly implemented

OpenUI5 version: 1.100.0

I have looked into the source code of ui5 and came to the conclusion that server side paging is not implemented correctly. All Fiori apps with the grows parameter are potentially showing wrong data. S/4Hana, everything.

How UI5 should work: Controls should not send $skip and $top=100 but set the http header Prefer: odata.maxpagesize=100. The server then returns the requested number of records plus a @odata.nextLink property with which the client can fetch the next set of records. If the nextLink returns an error, e.g. it was executed several hours later and the server no longer knows the state of this execution, it should reload all data or return an error in the control.
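The proposed client behaviour can be sketched roughly as follows. This is a minimal illustration and not UI5 code: the `get` transport, the `fake_service` stand-in and the `/Employees` URL are hypothetical, and the fake service encodes a plain offset in its skiptoken only to keep the example short (a real skiptoken is opaque to the client).

```python
from typing import Callable, Dict, List, Optional

# Hypothetical transport: takes (url, headers) and returns the parsed JSON body.
Transport = Callable[[str, Dict[str, str]], dict]


def read_with_server_paging(get: Transport, url: str, page_size: int,
                            max_rows: int) -> List[dict]:
    """Follow @odata.nextLink until max_rows are read or the result ends."""
    headers = {"Prefer": f"odata.maxpagesize={page_size}"}
    rows: List[dict] = []
    next_url: Optional[str] = url
    while next_url and len(rows) < max_rows:
        body = get(next_url, headers)
        rows.extend(body.get("value", []))
        # No nextLink in the response means the result is complete.
        next_url = body.get("@odata.nextLink")
    return rows[:max_rows]


def fake_service(url: str, headers: Dict[str, str]) -> dict:
    """Stand-in for an OData service with 250 rows (assumption for the demo)."""
    size = int(headers["Prefer"].split("=")[1])
    start = int(url.split("$skiptoken=")[1]) if "$skiptoken=" in url else 0
    data = [{"ID": i} for i in range(250)]
    body: dict = {"value": data[start:start + size]}
    if start + size < len(data):
        body["@odata.nextLink"] = f"/Employees?$skiptoken={start + size}"
    return body


rows = read_with_server_paging(fake_service, "/Employees", 100, 1000)
```

Note that the request never contains $skip or $top; the page size travels in the Prefer header and the position travels in the nextLink.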

How UI5 does work: While it honors the nextLink, it neither sets the maxpagesize nor omits the $skip/$top parameters. So if the control requests the first set of data it sets $skip=0 and $top=100 and if the server's default pagesize happens to be less than 100 rows, UI5 does use the nextLink to get the next rows until it fetched all 100 rows. So in a sense this is a mixture of client and server side paging. Client side paging to fill the control page-wise with data and server side within one client page - except that no server will have a default maxpagesize as low as 100. Hence server side paging is essentially not used.

What's wrong with $skip/$top? When setting these parameters you ask the server to return this subset and then the query is completed. So server side paging will never kick in and the overall result shown in the control might be wrong! Imagine the oData service reads from a database. Per the SQL standard, the result set has no guaranteed order unless an order by clause is used, and it returns the data as of the time the query was started. Okay, so let's assume we always require an order by clause for client side paging to work. It is not enforced and not documented, but let's assume that for now.

Scenario 1:

Request 1: select * from employees order by lastname offset 0 limit 100;
A user executes insert into employees (lastname) values ('AAAA');
Request 2: select * from employees order by lastname offset 100 limit 100;

Request 2 returns the last record of request 1 (row 100) again, because the ordered dataset gained one more record in the meantime.
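Scenario 1 can be reproduced with a small script. This is only a sketch using an in-memory SQLite database; the table content (250 generated names) is made up, and it assumes the second page starts at offset 100, directly after the first 100 rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table employees (lastname text)")
# 250 hypothetical employees: Name000 .. Name249
conn.executemany("insert into employees (lastname) values (?)",
                 [(f"Name{i:03d}",) for i in range(250)])


def page(offset: int, limit: int = 100) -> list:
    """Client-driven paging: each call is an independent query."""
    cursor = conn.execute(
        "select lastname from employees order by lastname limit ? offset ?",
        (limit, offset))
    return [row[0] for row in cursor]


page1 = page(0)
# Another user inserts a record that sorts before everything read so far.
conn.execute("insert into employees (lastname) values ('AAAA')")
page2 = page(100)

# The last row of page 1 shows up again as the first row of page 2.
duplicates = set(page1) & set(page2)
```

A delete between the two requests has the opposite effect: one record is skipped and never shown.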

That is the reason the nextLink exists. The server would have kept the query open and would simply fetch the next rows from the still open query.

Now you might say that with $top/$skip the server could keep the query open as well, just like with the nextLink. But...

Scenario 2 with two parallel screens, display A and display B:

Request 1A: select * from employees order by lastname offset 0 limit 100;
A user executes insert into employees (lastname) values ('AAAA');
Request 1B: select * from employees order by lastname offset 0 limit 100;
Request 2B: select * from employees order by lastname offset 100 limit 100;
A user executes insert into employees (lastname) values ('BBBB');
Request 2A: select * from employees order by lastname offset 100 limit 100;

As there is no state, 2A and 2B do not know to which statement they belong. It could be 1A or 1B. Or it could be a request executed many hours ago. Only the nextLink provides such state. 1A would have a nextLink=http://...$skiptoken=1A and 2A would request the next set with this URL.

Hello @wernerdaehn

I've created an internal incident 2270047198. The status of the issue will be updated here in GitHub.

Best Regards, Tsanislav

Hi @wernerdaehn,

let me answer this in a slightly different order.

What's wrong with $skip/$top? When setting these parameters you ask the server to return this subset and then the query is completed. So server side paging will never kick in and the overall result shown in the control might be wrong! Imagine the oData service reads from a database. Per the SQL standard, the result set has no guaranteed order unless an order by clause is used, and it returns the data as of the time the query was started. Okay, so let's assume we always require an order by clause for client side paging to work. It is not enforced and not documented, but let's assume that for now.

Please see 11.2.6.3 System Query Option $top!

If no unique ordering is imposed through an $orderby query option, the service MUST impose a stable ordering across requests that include $top.

There is a similar statement also in the OData V2 specification.

If the data service URI contains a $top query option, but does not contain an $orderby option, then the entities in the set MUST first be fully ordered by the data service. Such a full order SHOULD be obtained by sorting the entities based on their EntityKey values. While no ordering semantics are mandated, a data service MUST always use the same semantics to obtain a full ordering across requests.

While the OData spec does not tell you how the stable sort order is imposed, it typically means that the order by clause you mention has to be added to SQL statements.
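One common way to impose such a stable ordering is to append the key columns to whatever ordering the client requested. The helper below is a hypothetical sketch of that idea; the function name and the `id` key column are assumptions and are not taken from any SAP backend.

```python
from typing import List, Sequence


def stable_order_by(requested: Sequence[str], key_columns: Sequence[str]) -> str:
    """Build an ORDER BY clause that is unique by appending missing key columns."""
    columns: List[str] = list(requested)
    # Column names already used for ordering, ignoring "asc"/"desc" suffixes.
    named = {column.split()[0].lower() for column in columns}
    for key in key_columns:
        if key.lower() not in named:
            columns.append(key)
    return "order by " + ", ".join(columns)


with_orderby = stable_order_by(["lastname desc"], ["id"])
without_orderby = stable_order_by([], ["id"])
```

With the key as tiebreaker, two employees with the same lastname keep the same relative position across requests. It does not protect against inserts or deletes between requests.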

Scenario 2 with two parallel screens, display A and display B:

Request 1A: select * from employees order by lastname offset 0 limit 100;

A user executes insert into employees (lastname) values ('AAAA');

Request 1B: select * from employees order by lastname offset 0 limit 100;

Request 2B: select * from employees order by lastname offset 100 limit 100;

A user executes insert into employees (lastname) values ('BBBB');

Request 2A: select * from employees order by lastname offset 100 limit 100;

As there is no state, 2A and 2B do not know to which statement they belong. It could be 1A or 1B. Or it could be a request executed many hours ago. Only the nextLink provides such state. 1A would have a nextLink=http://...$skiptoken=1A and 2A would request the next set with this URL.

As OData is a stateless protocol, it is to be expected that there is no state information available. 11.2.6.7 Server-Driven Paging does not look to me as if the introduction of state is the purpose of server-driven paging.

Scenario 1:

Request 1: select * from employees order by lastname offset 0 limit 100;

A user executes insert into employees (lastname) values ('AAAA');

Request 2: select * from employees order by lastname offset 100 limit 100;

Request 2 returns the last record of request 1 (row 100) again, because the ordered dataset gained one more record in the meantime.

That is the reason the nextLink exists. The server would have kept the query open and would simply fetch the next rows from the still open query.

Now you might say that with $top/$skip the server could keep the query open as well, just like with the nextLink. But...

My understanding is that SAP provided backends use server-driven paging as a protection against bad requests that would cause too much load. I would also not assume that the database transaction is kept open. And if it is kept open, the timeframe for which it is may not be long. After all, the behavior you envision here only works if the database keeps a snapshot for the open transaction alive.

Having said that, yes uncoordinated editing of the same data by different persons is a challenge. The scenario you lay out here has also been discussed internally. However, I am not aware that we ever got a report of this happening (or being an issue) in productive usage.

How UI5 should work: Controls should not send $skip and $top=100 but set the http header Prefer: odata.maxpagesize=100. The server then returns the requested number of records plus a @odata.nextLink property with which the client can fetch the next set of records. If the nextLink returns an error, e.g. it was executed several hours later and the server no longer knows the state of this execution, it should reload all data or return an error in the control.

How UI5 does work: While it honors the nextLink, it neither sets the maxpagesize nor omits the $skip/$top parameters. So if the control requests the first set of data it sets $skip=0 and $top=100 and if the server's default pagesize happens to be less than 100 rows, UI5 does use the nextLink to get the next rows until it fetched all 100 rows. So in a sense this is a mixture of client and server side paging. Client side paging to fill the control page-wise with data and server side within one client page - except that no server will have a default maxpagesize as low as 100. Hence server side paging is essentially not used.

The advantage of using $skip and $top is that the client requests exactly the data it needs. In your proposal, the backend would decide how much data is read from the database and sent to the client. This may be too much (which has a small performance penalty in the form of the end-to-end request time but also leads to unnecessary load on the database) or too little (which means that the client needs to raise several sequential requests to get the data required. This has a huge impact on the response time.). Our implementation of server-driven paging ensures that our applications do not break if the server sends less data than requested. It is, at least for me, not thought to replace the client-driven paging.

Also having read your examples my impression is that you are more interested in moving to a stateful protocol. There is a mechanism with a SAP-ContextId header implemented in the OData V4 Model, see https://github.com/SAP/openui5/blob/master/src/sap.ui.core/src/sap/ui/model/odata/v4/lib/_Requestor.js#L598 and other places in _Requestor. This is used for SAP applications that really need to work with the same ABAP session. If you need to introduce state, I propose to use a custom header for your requests through https://sapui5.hana.ondemand.com/#/api/sap.ui.model.odata.v4.ODataModel%23methods/changeHttpHeaders.

Best regards Mathias.

Thanks Mathias, definitely food for thought. Regarding order by and top - understood. I missed that before.

Regarding stateless or not, the terminology and how it is implemented is secondary. I have a different impression, however. The server could read all data, cache it using the skiptoken as key, and each subsequent call reads the data from that cache. This odata server implementation must have failsafes so it does not read billions of rows and keep them in the cache forever. But in the context of UI5 I do not see that as a problem. You will never be able to display that many records in the UI. Caching up to 10'000 rows of data just in case for a limited time would avoid keeping the session open for a longer amount of time.
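To make the idea concrete, here is a minimal sketch of such a server-side cache. Everything in it is hypothetical: the `SnapshotPager` class, the skiptoken layout (`token:offset:size`), the 10'000 row cap and the five minute lifetime are assumptions for illustration only.

```python
import time
import uuid
from typing import Callable, Dict, List, Optional, Tuple

Page = Tuple[list, Optional[str]]  # (rows, skiptoken for the next page or None)


class SnapshotPager:
    """Caches one query result per skiptoken, capped in size and lifetime."""

    def __init__(self, run_query: Callable[[int], list], max_rows: int = 10_000,
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._run_query = run_query
        self._max_rows = max_rows
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, list]] = {}

    def first_page(self, page_size: int) -> Page:
        # Failsafe: never read more than max_rows into the cache.
        rows = self._run_query(self._max_rows)
        token = uuid.uuid4().hex
        self._cache[token] = (self._clock() + self._ttl, rows)
        return self._slice(token, 0, page_size)

    def next_page(self, skiptoken: str) -> Page:
        token, offset, size = skiptoken.split(":")
        entry = self._cache.get(token)
        if entry is None or entry[0] < self._clock():
            self._cache.pop(token, None)
            raise KeyError("snapshot expired, client must start over")
        return self._slice(token, int(offset), int(size))

    def _slice(self, token: str, offset: int, size: int) -> Page:
        rows = self._cache[token][1]
        end = offset + size
        if end < len(rows):
            return rows[offset:end], f"{token}:{end}:{size}"
        self._cache.pop(token, None)  # last page delivered, free the snapshot
        return rows[offset:end], None


table = [f"Name{i:03d}" for i in range(250)]
now = [0.0]  # fake clock so the expiry can be simulated
pager = SnapshotPager(lambda limit: sorted(table)[:limit], clock=lambda: now[0])

page1, token = pager.first_page(100)
table.append("AAAA")  # concurrent insert, not visible in the snapshot
page2, token2 = pager.next_page(token)
```

Here page 2 continues exactly where page 1 ended, regardless of the insert. The price is the memory held per open snapshot and the error the client has to handle once the snapshot has expired.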

In regards to "advantage of $skip/$top" I would say there is no difference to the nextLink. Instead of specifying $top, you would provide the same information via the maxpagesize. See 8.2.8.5 Preference odata.maxpagesize in http://docs.oasis-open.org/odata/odata/v4.0/errata03/os/complete/part1-protocol/odata-v4.0-errata03-os-part1-protocol-complete.html#_Toc453752234. And the $skip is replaced by the skiptoken.

Your idea of using SAP-ContextId headers, stick to $skip/$top and use that to know who requested what, is interesting. Will investigate that.

The main differences between the two options are

1) Is there a danger of getting duplicates, missing data or inconsistent states?
2) The amount of time between fetches.

With skip/top, 1) is a problem but 2) is not. You can even request the data 8 hours later and it will be there. But 1) might be worked around using the SAP-ContextId header. With maxpagesize, 1) is solved but 2) is a problem, as you cannot keep the data or connection for a long time just in case. The only option to solve that is to send an error, and the ODataModel then reads all data it had read again to make sure the data is consistent.

Looking forward to your reply, greatly appreciate your time, and I will try the SAP-ContextId approach.

Hi @wernerdaehn ,

Regarding stateless or not, the terminology and how it is implemented is secondary. I have a different impression, however. The server could read all data, cache it using the skiptoken as key, and each subsequent call reads the data from that cache. This odata server implementation must have failsafes so it does not read billions of rows and keep them in the cache forever. But in the context of UI5 I do not see that as a problem. You will never be able to display that many records in the UI. Caching up to 10'000 rows of data just in case for a limited time would avoid keeping the session open for a longer amount of time.

An application may decide to bind a collection for which millions of records exist in the database. You are right that the user will never want to see all of them in the UI. However, using a scrollbar (e.g. in the sap.ui.table.Table) the user may go to any arbitrary place in the collection. And then we need to fetch those records in the new viewport (using $skip). Getting back to the use case: The AnalyticalTable with the visual grouping is exactly aiming at use cases where millions of records need to be accessible to the end user.

In regards to "advantage of $skip/$top" I would say there is no difference to the nextLink. Instead of specifying $top, you would provide the same information via the maxpagesize. See 8.2.8.5 Preference odata.maxpagesize in http://docs.oasis-open.org/odata/odata/v4.0/errata03/os/complete/part1-protocol/odata-v4.0-errata03-os-part1-protocol-complete.html#_Toc453752234. And the $skip is replaced by the skiptoken.

Your idea of using SAP-ContextId headers, stick to $skip/$top and use that to know who requested what, is interesting. Will investigate that.

The main differences between the two options are

1) Is there a danger of getting duplicates, missing data or inconsistent states?

2) The amount of time between fetches.

With skip/top, 1) is a problem but 2) is not. You can even request the data 8 hours later and it will be there. But 1) might be worked around using the SAP-ContextId header. With maxpagesize, 1) is solved but 2) is a problem, as you cannot keep the data or connection for a long time just in case. The only option to solve that is to send an error, and the ODataModel then reads all data it had read again to make sure the data is consistent.

Looking forward to your reply, greatly appreciate your time, and I will try the SAP-ContextId approach.

There is only a difference if the server can hold the state of the request with server-driven paging. And there are many practical issues around that. Let me just name two.

1) Holding the state would typically mean keeping the database cursor / database transaction open. This costs resources on the database side and would hence not be possible for more than a few minutes, at most, in productive scenarios.
2) Being able to scale with server load typically means that there is more than one server instance. The database transaction would be held by a specific server instance, though. Getting back to that server instance requires adding state to a stateless protocol.

In reality the consequence is that the amount of time between fetches has to be so small that the client would typically need to fetch all the data at once. If that is acceptable, then I propose to just fetch the few hundred records in one request. Once the data volume is bigger and fetching everything in one request is no longer sensible, there will be no difference between client- and server-driven paging.

Let me just add one thing about state: I am not an expert for cloud operations. But it seems that state is in the way of effective load management in the cloud as stateful user requests need to end up always at the same OData server instance.

Best regards Mathias.

Okay, so what is the conclusion? It seems to me that everything I requested is possible today. A stateless approach is what we have today. A stateful approach can be implemented in the server via the SAP-ContextId HTTP header, if the UI is made aware of it. Whether skip/top or the nextLink is used for paging is then secondary.

Only one point is obvious to me: You do not support server side paging in UI5. It is overshadowed by the client side paging. The UI5 default grows parameter is 100 and no server will produce a nextLink unless at least 1000 rows are requested in one call.

The server obviously has problems when implementing a stateful service as it does not know how much data the UI will consume ultimately. If the UI would request up to 1000 rows (via ten pages) it would know more. It obviously is no problem if the query itself returns less than 1000 rows.

With my proposal of using the nextLink as paging option over the skip/top, the server would know. The UI would tell a top=1000 and read up to that number of rows with ten pages each 100 rows. But not all servers do support server side paging - and for a good reason.

Hence I would say we have the following options:

1) It's the server's problem. There might be data inconsistencies, but that can't be helped no matter what we do. No change in the UI5 lib.
2) The $metadata has the information that a service provides a nextLink for paging, and the UI uses it for paging in preference to skip/top. The UI5 lib picks which one to use. With nextLink, the $top tells the maximum number of rows to read; without nextLink, no stateful paging.
3) The UI sets the HTTP header and has the option to tell how many rows are requested at most. A model bound to the List control would need a few hundred, an AnalyticalTable millions. The UI provides that information, e.g. via yet another HTTP header in parallel to the maxpagesize. No change in the UI5 lib, it is set in the ODataModel.

I would prefer some automatism in the UI5 lib. But 3) is fine as well.

If you concur with 3) we can close this case. It was just an idea on how to make UI5 even better.

PS: In regards to cloud operations, that might not be a problem. A consumer must authenticate itself against a service anyhow, so there is some state. This can be a JWT token or an HTTPS session. Load balancers can use that information to prefer one application server instance over the other and trigger a session migration if they must. But even without that, there could be logic in the UI5 library such as: prefer server side paging, and when the nextLink returns an error, request the data again using a $skip from the next server.

Hi @wernerdaehn ,

Hence I would say we have the following options:

1) It's the server's problem. There might be data inconsistencies, but that can't be helped no matter what we do. No change in the UI5 lib.

2) The $metadata has the information that a service provides a nextLink for paging, and the UI uses it for paging in preference to skip/top. The UI5 lib picks which one to use. With nextLink, the $top tells the maximum number of rows to read; without nextLink, no stateful paging.

3) The UI sets the HTTP header and has the option to tell how many rows are requested at most. A model bound to the List control would need a few hundred, an AnalyticalTable millions. The UI provides that information, e.g. via yet another HTTP header in parallel to the maxpagesize. No change in the UI5 lib, it is set in the ODataModel.

I would prefer some automatism in the UI5 lib. But 3) is fine as well.

If you concur with 3) we can close this case. It was just an idea on how to make UI5 even better.

I am not sure 3) is really the answer. I fear the answer is state and state has drawbacks. The question is really in which scenario the paging issue becomes relevant. In SAP applications we typically have list reports and object pages. In the list reports the issue of records being created or deleted by other users and thus some records may be skipped and not displayed while others may be shown twice can clearly occur*. But is it problematic? If the user does not find a specific record, the user could use the search (and thus also start anew with the list binding content). The actual editing happens typically in object pages. The case that two users edit the same document at the same time requires a collaboration model going beyond what is discussed in this issue.

* - If the $count is requested as well, the application may use a changing $count to trigger a refresh of the list binding using sap.ui.model.odata.v4.Context#requestSideEffects. This will not get 100 % of the cases, though. Another option could be to use a WebSocket connection to get notified by the server if the list changes. But I have no clue how much effort it would be to provide this on the server side.
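The limitation mentioned in this footnote can be shown with a few lines. This is only a sketch of the comparison logic; `needs_refresh` is a hypothetical helper and not a UI5 API, and in a real application the refresh itself would be triggered on the binding.

```python
from typing import Optional


def needs_refresh(previous_count: Optional[int], current_count: int) -> bool:
    """Signal a refresh when the $count differs from the one seen before."""
    return previous_count is not None and previous_count != current_count


# One insert between two page requests: the count changes, so it is detected.
detected = needs_refresh(250, 251)

# One insert plus one delete: the count is unchanged, so the shift is missed.
missed = needs_refresh(250, 250)
```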

PS: In regards to cloud operations, that might not be a problem. A consumer must authenticate itself against a service anyhow, so there is some state. This can be a JWT token or an HTTPS session. Load balancers can use that information to prefer one application server instance over the other and trigger a session migration if they must. But even without that, there could be logic in the UI5 library such as: prefer server side paging, and when the nextLink returns an error, request the data again using a $skip from the next server.

Having a state on the server also prevents you from shutting down the application server instance and starting another instance on another (virtual or physical) server with more or less hardware available. State costs flexibility when handling the load.

As proposed I am also closing the issue.

Best regards Mathias.

Thanks Mathias, was a pleasure talking to you!

SAP / openui5

oData v4, server side paging not properly implemented #3487