SoftInstigate / restheart

Rapid API Development with MongoDB
https://restheart.org
GNU Affero General Public License v3.0

paging stops after 6 pages with max page size #375

Closed sterry1 closed 4 years ago

sterry1 commented 4 years ago

Expected Behavior

I expect subsequent page requests to complete. A plain Java client with the same driver version, using skip and limit, does not show this issue.

Current Behavior

The paging with "filter"&page=7&pagesize=1000 hangs. A single request for page="N" and same pagesize hangs as well.

Previous testing did not reveal this problem until we began paging through large collections (~300M documents).

Context

We are evaluating RH ver. 4.1.6 for use in our microservice architecture, and paging large result sets from queries is a mandatory requirement.

We are using the RH server running in Docker containers

Environment

Host OS varies, Docker 18, RH OSS ver. 4.1.6, MongoDB ver. 4.2

Steps to Reproduce

  1. initialize a db and a collection with more than 100K documents
  2. index "fieldname" in the documents of this collection, then iterate pages 1 - N with curl against http://host/db/rearrangement?filter={"fieldname":"value"}&page=1&pagesize=1000
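
The reproduction steps above can be sketched as a small Python client. This is only an illustration, not the client used in the report: the helper names `page_url`, `fetch_json` and `iterate_pages` are hypothetical, and the code assumes the server answers with a JSON array of documents and an empty array past the last page. The filter is URL-encoded, and the request carries a timeout so that a hanging page raises an error instead of blocking forever.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def page_url(base_url, filter_doc, page, pagesize=1000):
    """Build a paged GET URL with a URL-encoded JSON filter."""
    query = urlencode({
        "filter": json.dumps(filter_doc, separators=(",", ":")),
        "page": page,
        "pagesize": pagesize,
    })
    return f"{base_url}?{query}"


def fetch_json(url, timeout=30):
    """GET a URL and decode the JSON body; a hang ends in a timeout error."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def iterate_pages(base_url, filter_doc, pagesize=1000, fetch=fetch_json):
    """Yield (page, documents) for pages 1..N until a page comes back empty.

    Assumes the response body is a JSON array of documents.
    """
    page = 1
    while True:
        documents = fetch(page_url(base_url, filter_doc, page, pagesize))
        if not documents:
            break
        yield page, documents
        page += 1
```

Looping over `iterate_pages("http://host/db/rearrangement", {"fieldname": "value"})` and printing the page number shows which page is the first to time out.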

Possible Implementation

Based on this behavior, I assume this is a bug that prevents paging past 70,000 documents in a result set.

ujibang commented 4 years ago

Hi @sterry1

while we try to replicate the issue, please try the following:

0) check who is hanging, restheart or mongodb

after sending the GET request at page=7 that hangs, try to connect with mongo shell and send a find command. Is mongodb responsive?

1) check pagesize configuration options

in the configuration file check this section and make sure that max-pagesize and cursor-batch-size are equal.

## Read Performance

# default-pagesize is the number of documents returned when the pagesize query
# parameter is not specified
# see https://restheart.org/docs/read-docs#paging
default-pagesize: 100

# max-pagesize sets the maximum allowed value of the pagesize query parameter
# generally, the greater the pagesize, the more json serialization overhead occurs
# the rule of thumb is not exceeding 1000
max-pagesize: 1000

# cursor-batch-size sets the mongodb cursor batchSize
# see https://docs.mongodb.com/manual/reference/method/cursor.batchSize/
# cursor-batch-size should be smaller or equal to the max-pagesize
# the rule of thumb is setting cursor-batch-size equal to max-pagesize
# a small cursor-batch-size (e.g. 101, the default mongodb batchSize)
# speeds up requests with small pagesize
cursor-batch-size: 1000
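
The rules of thumb stated in these comments can be checked mechanically. The following is a minimal sketch, assuming the configuration has already been parsed into a plain dictionary; the function name `check_paging_config` is hypothetical, and the fallback values are simply the ones shown in the snippet above.

```python
def check_paging_config(config):
    """Return warnings for paging settings that depart from the rules of thumb."""
    warnings = []
    # Fallbacks mirror the values shown in the configuration snippet.
    max_pagesize = config.get("max-pagesize", 1000)
    batch_size = config.get("cursor-batch-size", 1000)

    if max_pagesize > 1000:
        warnings.append("max-pagesize exceeds the suggested limit of 1000")
    if batch_size > max_pagesize:
        warnings.append("cursor-batch-size is larger than max-pagesize")
    elif batch_size != max_pagesize:
        warnings.append("cursor-batch-size differs from max-pagesize")
    return warnings
```

An empty list means both values follow the suggestions; for example, `check_paging_config({"max-pagesize": 1000, "cursor-batch-size": 101})` reports that the two settings differ.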

2) default sorting

as described in the documentation https://restheart.org/docs/read-docs/#default-sorting

The default sorting of the documents is by the _id descending.

This can impact performance in some use cases. Either specify the ?sort={} query parameter to disable default sorting, or create a compound index {"fieldname": 1, "_id": -1}.
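
Both options can be written down in Python. This is a sketch under assumptions: `unsorted_page_url` is a hypothetical helper, not part of RESTHeart, and the index is only expressed as a key list in the (field, direction) form that pymongo's `create_index` accepts, without connecting to a database.

```python
import json
from urllib.parse import urlencode


def unsorted_page_url(base_url, filter_doc, page, pagesize=1000):
    """Build a paged GET URL that disables default sorting with sort={}."""
    query = urlencode({
        "filter": json.dumps(filter_doc, separators=(",", ":")),
        "sort": "{}",  # empty sort document: skip the default _id descending sort
        "page": page,
        "pagesize": pagesize,
    })
    return f"{base_url}?{query}"


# Key list for the compound index {"fieldname": 1, "_id": -1}:
# ascending on the filtered field, descending on _id to match the default sort.
COMPOUND_INDEX_KEYS = [("fieldname", 1), ("_id", -1)]
```

With a pymongo collection object at hand, `collection.create_index(COMPOUND_INDEX_KEYS)` would create the index; "fieldname" stands in for whatever field the filter actually uses.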

3) disable cursor pool preallocation

cursor pool preallocation is described at https://restheart.org/docs/speedup-requests-with-cursor-pools/

It speeds up requests in most use cases. In some cases, for instance if you are sending GET requests for subsequent pages at a very high rate, it can have a negative effect, since preallocation occurs in parallel with the requests.

To disable it set the following in the cursor pool configuration options section:

## Eager DB Cursor Preallocation Policy

# In big collections, reading a far page involves skipping the db cursor for many documents resulting in a performance bottleneck
# For instance, with default pagesize of 100, a GET with page=50.000 involves 500.000 skips on the db cursor.
# The eager db cursor preallocation engine boosts performance (in some use cases, up to 1000%). The following options control its behavior.

eager-cursor-allocation-pool-size: 0

eager-cursor-allocation-linear-slice-width: 1000
eager-cursor-allocation-linear-slice-delta: 100
eager-cursor-allocation-linear-slice-heights: [0]
eager-cursor-allocation-random-max-cursors: 0
eager-cursor-allocation-random-slice-min-width: 0
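
To make the cost of reading a far page concrete, the skip count behind a page request can be computed directly. This is a minimal sketch, assuming pages are 1-based and that page N translates to skipping (N - 1) * pagesize documents; that mapping is an assumption for illustration and has not been checked against the RESTHeart source. The function name `skip_and_limit` is hypothetical.

```python
def skip_and_limit(page, pagesize):
    """Translate a 1-based page number into the (skip, limit) pair of a cursor."""
    if page < 1 or pagesize < 1:
        raise ValueError("page and pagesize must be positive")
    return (page - 1) * pagesize, pagesize
```

For the request in this report, `skip_and_limit(7, 1000)` gives a skip of 6000 and a limit of 1000, the same values a plain driver client would pass to skip and limit.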

sterry1 commented 4 years ago

Hi Andrea, Thanks for the quick response.

0) It is definitely RH. Mongodb is responding

1) I double- and triple-checked my configuration settings and they match the documentation and your suggestions.

2) I ran a new subset of tests with the sort query parameter set to "{}". This allowed paging to proceed without hanging. :) I am in the process of modifying the other tests to include the sort query parameter and gathering new response times and data sizes.

3) I have not disabled the eager cursor preallocation yet, but I will add that to my tests to determine any change in behavior.

The documentation didn't make any suggestions for sizing the JVM for optimal performance but I assume bigger is always better.

Thanks again for the quick response.