laravel / scout

Laravel Scout provides a driver based solution to searching your Eloquent models.
https://laravel.com/docs/scout
MIT License
1.54k stars 327 forks source link

Incorrect Total Count in Laravel Scout Pagination #824

Closed Evenerik closed 3 months ago

Evenerik commented 4 months ago

Scout Version

10.8.6

Scout Driver

Typesense

Laravel Version

10.48.3

PHP Version

8.3.0

Database Driver & Version

MySQL 8.0.36

SDK Version

No response

Meilisearch CLI Version

No response

Description

This is pretty much a reopening of @razvaniacob's issue, who has encountered the same problem I am currently facing. I won't go into too much detail since he described the bug perfectly well in his issue, but Scout's LengthAwarePaginator shows wrong total when using Typesense's engine

Steps To Reproduce

See steps in https://github.com/laravel/scout/issues/819

driesvints commented 4 months ago

@karakhanyans do you have any thoughts here?

karakhanyans commented 4 months ago

@driesvints will check this and get back with more info 👌

razvaniacob commented 4 months ago

I will be happy to see this solved guys. Good luck!

driesvints commented 4 months ago

@karakhanyans any news here?

AbdullahFaqeir commented 4 months ago

I'll be working on this

tamakoma1129 commented 4 months ago

Hello,

I encountered the same issue reported here. In my case, I was using a model with a ULID as the primary key but initially forgot to set protected $keyType = 'string'; in my model. After updating this setting, I can confirm that the pagination now works correctly.

Here are the specifics of my development environment:

Apologies if this solution has already been mentioned. I hope this helps someone else experiencing the same problem.

Thank you.

driesvints commented 3 months ago

Ping @jasonbosco

jasonbosco commented 3 months ago

@AbdullahFaqeir will be working on this going forward.

AbdullahFaqeir commented 3 months ago

Hi @Evenerik!

Can you please dump the collection schema, and sample record in typesense and the same record in database.

Below you can see a dump of my env showing a working fresh sample, I've seeded 10K users.

User model settings.

screenshot 7

Search performed.

screenshot 8

Paginated search results.

screenshot 6
driesvints commented 3 months ago

Closing this issue because it's inactive, already solved, old or not relevant anymore. Feel to open up a new issue if you're still experiencing this.

razvaniacob commented 2 months ago

Is it possible to reopen this because the problem persists. I tracked it down to the method getTotalCount on Builder class https://github.com/laravel/scout/blob/10.x/src/Builder.php#L486

the $totalCount is correct but because of $this->queryCallback not being null, the total returned is the perpage number.

I don't really understand what is happening after that if statment `if (is_null($this->queryCallback)) { return $totalCount; } @driesvints Do I need to open a new issue about this?

jasonbosco commented 2 months ago

Could you share the information that @AbdullahFaqeir asked for here to debug this further: https://github.com/laravel/scout/issues/824#issuecomment-2131359416

razvaniacob commented 2 months ago

I'm happy to share more information. I've notice that things stop working as they should when using a callback query indicated in the documentation here: https://laravel.com/docs/11.x/scout#customizing-the-eloquent-results-query

The ecpected number of results should be 512 but they are limited to the number per page.

City model settings

Screenshot 2024-06-13 at 6 49 13 PM

Search performed.

Route::get('test', function () {
    return City::search('buzau')
        ->query(fn (Builder $query) => $query->with(['county']))
        ->paginate(10)->onEachSide(1)->withQueryString()
        ->through(fn ($obj) => [
            'name' => $obj->name,
            'city' => $obj->county?->name ?? '',
        ]);
});

Paginated search results.

{
  "current_page": 1,
  "data": [
    {
      "name": "Buzau",
      "city": "Judetul Buzau"
    },
    {
      "name": "Comuna Munteni Buzau",
      "city": "Judetul Ialomita"
    },
    {
      "name": "Dealul Viei",
      "city": "Judetul Buzau"
    },
    {
      "name": "Dealul Viei",
      "city": "Judetul Buzau"
    },
    {
      "name": "Valcele",
      "city": "Judetul Buzau"
    },
    {
      "name": "Trestioara",
      "city": "Judetul Buzau"
    },
    {
      "name": "Suditi",
      "city": "Judetul Buzau"
    },
    {
      "name": "Satu Nou",
      "city": "Judetul Buzau"
    },
    {
      "name": "Recea",
      "city": "Judetul Buzau"
    },
    {
      "name": "Posta",
      "city": "Judetul Buzau"
    }
  ],
  "first_page_url": "https://bliss-spa.test/test?query=buzau&page=1",
  "from": 1,
  "last_page": 1,
  "last_page_url": "https://bliss-spa.test/test?query=buzau&page=1",
  "links": [
    {
      "url": null,
      "label": "« Previous",
      "active": false
    },
    {
      "url": "https://bliss-spa.test/test?query=buzau&page=1",
      "label": "1",
      "active": true
    },
    {
      "url": null,
      "label": "Next »",
      "active": false
    }
  ],
  "next_page_url": null,
  "path": "https://bliss-spa.test/test",
  "per_page": 10,
  "prev_page_url": null,
  "to": 10,
  "total": 10
}

I verified the total number in the Typesense Dashboard

Screenshot 2024-06-13 at 6 56 59 PM

So I think the problem lies somewhere in the getTotalCount method in the Builder class https://github.com/laravel/scout/blob/10.x/src/Builder.php#L486

Please let me know how else I can help with this.

Thank you!

Raz

AbdullahFaqeir commented 2 months ago

Hey @jasonbosco @razvaniacob

I don't know how I've seen the whole execution within my head to figure this out 😅 I literally didn't even go through the code.

What's actually causing the issue is what happens when you apply a customisation to the query (db) which happens after the (search) query.

When @razvaniacob indexed his data, he didn't index the (id) of the city, that's first.

After the search results is returned normally with the correct total number count, while not just typesense, even other engines, generate a list of the plucked id's from the (search) query, using this list of id's the fetch the actual entities inside the database on order to apply the (db) query customisation on them.

In this case, there was no id's being returned from the (search) query, thus, when trying to execute the (db) query customisation, it returns null because, you know! 😅.

So I believe the only way for this to be fixed is to force index the models id's.

razvaniacob commented 2 months ago

Good observation. But actually the id is indexed to typesense and you can see the id in the typesense dashboard screenshot, but even with that I get the same result.

This is what I have in my City model:

Screenshot 2024-06-14 at 8 52 11 AM
AbdullahFaqeir commented 2 months ago

Good observation. But actually the id is indexed to typesense and you can see the id in the typesense dashboard screenshot, but even with that I get the same result.

This is what I have in my City model:

Screenshot 2024-06-14 at 8 52 11 AM

Tho is it defined in the typesense scheme?, check your previous comment

razvaniacob commented 2 months ago

I added the id to the typesense scheme and reindexed and still no change in the results. I get 10 results (the per page total) instead of 512

Screenshot 2024-06-14 at 9 40 34 AM
razvaniacob commented 2 months ago

@AbdullahFaqeir do you need anything else from me in order to finally find a fix for this problem?

jasonbosco commented 2 months ago

Mostly for @AbdullahFaqeir: Quick note that the id field is automatically generated by Typesense if a document sent to Typesense does not have a field called id defined in the document (even if the id field is not defined in the collection schema).

But if an id field is explicitly set in the document, then the specified value is used as the id of the document.

Unless exclude_fields or include_fields search parameters are used to exclude the id field from the Typesense search API response, this field will always be returned by Typesense.

jackkitley commented 2 weeks ago

Have the same issue when using ->query()

Issue is over 2 years old now.

        return $this->model->search($search)
            ->query(fn(Builder $query) => $query->with(['mainListingImage']))
            ->options([
                'query_by' => 'vehicle_description, transmission, fuel_type, colour, year'
            ])
            ->paginate()
            ->withQueryString();
tharropoulos commented 1 week ago

After spending some time debugging, I've come to the following findings:

When using the paginate function in Laravel Scout with the Typesense engine, the number of records returned is always limited to the perPage value, regardless of the actual total number of matches on the Typesense server. This behavior stems from the interaction between various methods within the Builder and TypesenseEngine classes.

Details

  1. Hits on the Current Page:
    The line here https://github.com/laravel/scout/blob/991056204a410fc7a9c22b1ec9bbbae6bb675241/src/Builder.php#L496 returns the number of hits on the current page, which will always be capped by the limit (default is Eloquent's perPage 15, I go into more detail about this below). This is evident here https://github.com/laravel/scout/blob/991056204a410fc7a9c22b1ec9bbbae6bb675241/src/Engines/TypesenseEngine.php#L357-L362 As mapIdsFrom just returns the mapIds function: https://github.com/laravel/scout/blob/991056204a410fc7a9c22b1ec9bbbae6bb675241/src/Engines/Engine.php#L111-L114

  2. Mutation of the ids Variable:
    The ids variable, as a result, is always less than the total matches, and it is then mutated by the keys method, which uses the tap function to return a cloned builder object: https://github.com/laravel/scout/blob/991056204a410fc7a9c22b1ec9bbbae6bb675241/src/Engines/Engine.php#L122-L125 This, in turn, will just return the same number as the previous call to the mapIds function.

  3. Limit Setting Logic:
    The logic here: https://github.com/laravel/scout/blob/991056204a410fc7a9c22b1ec9bbbae6bb675241/src/Builder.php#L500-L502 attempts to set the limit, but since the limit is initialized and always less than the totalAmount variable, this condition never triggers. Thus, the number of total records in TypesenseEngine defaults to the perPage value.

  4. Why: The paginate function on TypesenseEngine https://github.com/laravel/scout/blob/991056204a410fc7a9c22b1ec9bbbae6bb675241/src/Engines/TypesenseEngine.php#L206-L214 Accesses the builder's take function, which sets the limit attribute to the perPage parameter, which is never null. This way, the aforementioned check will always result in what perPage is equal to, and in turn, will result in the behavior you encountered.

Fix

Remove the assignment of perPage's value to the builder's limit attribute. I'll create a relevant PR shortly, so you can check it out yourselves.

razvaniacob commented 1 week ago

Wow.. good job @tharropoulos I was able to track down the issue you found but didn't have time to come up with a fix. I wasn't sure if removing the assignment of perPage to the builder's limit attribute won't affect other search engines, because I found it odd that it was there to begin with.

I'm happy you're on this problem!

tharropoulos commented 1 week ago

Wow.. good job @tharropoulos I was able to track down the issue you found but didn't have time to come up with a fix. I wasn't sure if removing the assignment of perPage to the builder's limit attribute won't affect other search engines, because I found it odd that it was there to begin with.

I'm happy you're on this problem!

I've manually tested it but haven't created any new tests to verify. If need be, I'll be sure to include some! Rest of the tests seem to pass as expected