Modified works browse page and TextAPI to use search_after for pagination

damisul commented 3 months ago

Updated website search: I've removed the slider used to navigate between pages and added 'next page' / 'previous page' / 'first page' buttons instead. I've also reimplemented navigation by first letter; it now works as an additional filter, 'Title starts with X'.

As usual, there are some i18n strings which need to be translated to Hebrew.
Updated TextAPI::search endpoint to use search_after
Also fixed an error that occurred when we ran a fulltext search on works whose length exceeds 1000000 characters (added the max_analyzed_offset: 1000000 option to the fulltext query)

abartov commented 3 months ago

Could you write up a couple of paragraphs (in English, of course) of explanation for our API users of the change, with an example or two (where before you did X, now you would need to do Y), to help them migrate? I want to translate and send this to our API users and give them a chance to prepare before deploying this into production.

damisul commented 3 months ago

Ok, sure.

As our works database has grown, we've run into technical problems affecting our SearchAPI (the /api/v1/search endpoint). The main issue was the way we did pagination. All works matching the filter are split into pages of 25 works each (or fewer on the last page).

In the previous version, our API had a parameter named page, so we could easily get the Nth page by issuing a request like:

{
  "key": "API_KEY",
  "view": "basic",
  "file_format": "html",
  "snippet": false,
  "page": 5,
  "sort_by": "alphabetical",
  "sort_dir": "default",
  "genres": [ "poetry" ],
  < Other filters >
}

Such a request would return the works with indexes 101 to 125.
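
The index range quoted above follows from the fixed page size of 25. As a quick check, here is a minimal Python sketch; the helper name `page_bounds` is hypothetical and is not part of the API:

```python
PAGE_SIZE = 25  # page size stated in this explanation


def page_bounds(page, page_size=PAGE_SIZE):
    """Return the 1-based (first, last) work indexes covered by a page number.

    Hypothetical helper for illustration only; pages are assumed to start at 1.
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    first = (page - 1) * page_size + 1
    last = page * page_size
    return first, last


print(page_bounds(5))  # (101, 125)
```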

Unfortunately, this logic puts a heavy load on our database and we cannot support it anymore.

Instead, in v1.1 of the API we've introduced a new parameter, search_after, to replace page. It works as follows:

To get the first page of a query, you don't need to specify search_after at all (or you can pass null) and simply specify the filtering and sorting:

{
  "key": "API_KEY",
  "view": "basic",
  "file_format": "html",
  "snippet": false,
  "search_after": null,
  "sort_by": "alphabetical",
  "sort_dir": "default",
  "genres": [ "poetry" ],
  < Other filters >
}
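
For clients migrating existing code, the change to the request body amounts to dropping page and adding search_after. A minimal Python sketch, assuming the request body is held as a plain dict before being serialized to JSON (the helper name `to_first_page_request` is hypothetical):

```python
def to_first_page_request(old_request):
    """Convert an old page-based request body into a first-page request.

    Hypothetical helper: copies every field except "page" and sets
    "search_after" to None, which serializes to JSON null.
    """
    new_request = {k: v for k, v in old_request.items() if k != "page"}
    new_request["search_after"] = None
    return new_request


old = {"key": "API_KEY", "view": "basic", "page": 5, "genres": ["poetry"]}
print(to_first_page_request(old))
```

Note that this only produces the request for the first page; the search_after value for any later page has to come from a previous response.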

It will return an object with the following structure:

{
  "total_count": <Total number of works matching criteria>,
  "next_page_search_after": ["XXX", "YYY"],
  "data": [ <First 25 works matching filter> ]
}

Here you can see a new response attribute, next_page_search_after, of Array type. If it is present, more records exist. To get the next page, repeat the search request with the same parameters, but pass the value returned in response.next_page_search_after as request.search_after. Continue this process until you receive a response with next_page_search_after == null, which means the last page has been reached.
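
The loop described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `iter_works` and `fetch_page` are hypothetical names, and `fetch_page` stands for whatever function your client already uses to send a request body to the search endpoint and decode the JSON response into a dict.

```python
def iter_works(fetch_page, params):
    """Yield every work matching params, one page at a time.

    fetch_page: hypothetical callable that takes a request dict, sends it
    to the search endpoint and returns the decoded response as a dict.
    params: filtering and sorting parameters, kept identical for all pages.
    """
    # First page: search_after is null (None).
    request = dict(params, search_after=None)
    while True:
        response = fetch_page(request)
        yield from response["data"]
        cursor = response.get("next_page_search_after")
        if cursor is None:
            break  # last page reached
        # Same parameters, with the cursor from the previous response.
        request = dict(request, search_after=cursor)
```

A caller would typically write something like `for work in iter_works(fetch_page, {"key": "API_KEY", "sort_by": "alphabetical"})`. Because each cursor comes from the previous response, pages have to be read in order.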

abartov commented 3 months ago

(deployed on staging)

damisul commented 2 months ago

Hey @abartov, I've just rebased this branch on master (to include linters).

I've tuned the linter settings a bit and resolved all the warnings they produced in this branch.

abartov / bybeconv

Modified works browse page and TextAPI to use search_after for pagination #298