aarondfrancis / fast-paginate

A fast implementation of offset/limit pagination for Laravel.
MIT License
1.21k stars 56 forks source link

Specify table for whereInRaw query #1

Closed joecampo closed 2 years ago

joecampo commented 2 years ago

Hey,

Thanks again for this awesome macro :)

This ensures that the column data returned is for the table being paginated through and not any merged in from any joins. For example, if you're using hasManyThrough, and both tables have a uuid column, the uuid column data is overridden by the joined in table's uuid column.

Here's an example:

class Account extends Model
{
    public function comments(): HasManyThrough
    {
        return $this->hasManyThrough(Comment::class, Post::class);
    }
}

If both Comment and Post has a uuid the column data returned through this macro would return the uuid for the Post instead of the anticipated uuid of the Comment

By specifying the table selects from table. instead of it'll make sure we get the correct data back.

aarondfrancis commented 2 years ago

So I saw that in your original one and then removed it because it assumes that the developer was originally selecting *, which may not be the case 😕

Also, since this doesn't actually use a join like the article, but rather a whereIn, I'm not fully understanding where the stuff is being overwritten.

Can you dump the query with and without your proposed modification and post them here? Thanks!

joecampo commented 2 years ago

Good point! You're totally right. The best way would be to grab whatever columns being selected and wrap each with table.column 🤔

To explain the issue:

So here's what happens with the current macro (if you're not specify columns):

$account->comments()->deferredPaginate();

select * from `comments` inner join `posts` on `posts`.`id` = `comments`.`comment_id` where `posts`.`account_id` = ? and `comments`.`id` in (12136, 12137, 12138, 12139, 12140, 12141, 12142, 12143, 12144, 12145, 12146, 12147, 12148, 12149, 12150, 12151, 12152)

So what'll happen if the posts table has the same columns as the comments, the returned column data will be from posts relevant S/O. In my use case, both tables have a UUID and I noticed that all the UUID's for the comments were actually the parent post's UUID.

after PR:

$account->comments()->deferredPaginate()

select `comments`.* from `comments` inner join `posts` on `posts`.`id` = `comments`.`product_id` where `posts`.`account_id` = ? and `comments`.`id` in (12136, 12137, 12138, 12139, 12140, 12141, 12142, 12143, 12144, 12145, 12146, 12147, 12148, 12149, 12150, 12151, 12152)
aarondfrancis commented 2 years ago

Ah ha, I see. Unfortunately we can grab the select and prefix it with a table, because they could be select things that aren't real columns! Subqueries, case statements, or whatever.

Here's my other question... what happens when you call $account->comments()->paginate()?

I'm curious if this issue is present outside of the scope of the deferred part, in Laravel's paginator itself.

Thanks for the intel

joecampo commented 2 years ago

Another good point. In my mind something like this could work, but there are so many more outside edge cases as you mentioned:

if (is_null($this->query->columns)) {
    $this->query->select("{$table}.*");
} else {
    $this->query->select(array_map(fn ($column) => "{$table}.{$column}", $this->query->columns));
}

I noted that the cursor method in the query builder checks if columns are null.

I tested the normal paginate() method and it does prefix the query with the table name.

select `comments`.*, `posts`.`account_id` as `laravel_through_key` from `comments` inner join `posts` on `posts`.`id` = `comments`.`post_id` where `posts`.`account_id` = ? limit 100 offset 0

If you do try to select columns specific columns with ->select() that could be ambiguous it'll just throw the SQL exception.

Could just do the null check on columns. If no columns are specifically set in the select – prefix the table. That would mirror the behavior of the current paginate method.

if (is_null($this->query->columns)) {
    $this->query->select("{$table}.*");
}

And for all the other cases you'd need to be more specific in the columns you're selecting.

aarondfrancis commented 2 years ago

Fascinating that the traditional paginate works and mine doesn't. That's definitely an issue. Let me mirror this structure locally and see how we can leverage the Laravel smarts without reinventing them. Thanks for helping me run this down

joecampo commented 2 years ago

Ah, so I think I uncovered how the regular paginate method is working.

I found that the HasManyThrough class actually has a paginate method that delegates down to the query builder's paginate method.

https://github.com/laravel/framework/blob/66c5afc3cb14f914990d8b8cbf79fa077887cb26/src/Illuminate/Database/Eloquent/Relations/HasManyThrough.php#L413

This is the bit of magic that appears to be prefixing the table to avoid this issue when using hasManyThrough:

https://github.com/laravel/framework/blob/66c5afc3cb14f914990d8b8cbf79fa077887cb26/src/Illuminate/Database/Eloquent/Relations/HasManyThrough.php#L458

aarondfrancis commented 2 years ago

Nice find! I'll check to see if any other relations have a custom paginate method and then see if we can address this in a standard way.

aarondfrancis commented 2 years ago

@joecampo give this a try... I made a few tweaks. Now instead of running the query myself to get the records I'm using Laravel's simple paginator, meaning it's delegating to those underlying Laravel methods, which is always my goal.

<?php
Builder::macro('deferredPaginate', function ($perPage = null, $columns = ['*'], $pageName = 'page', $page = null) {
    /** @var Builder $this */
    $model = $this->newModelInstance();
    $key = $model->getKeyName();

    /** @var LengthAwarePaginator $paginator */
    $paginator = $this->clone()
        // We don't need them for this query, they'll remain
        // on the query that actually gets the records.
        ->setEagerLoads([])
        // Only select the primary key, we'll get the full
        // records in a second query below.
        ->paginate($perPage, [$key], $pageName, $page);

    // Add our values in directly using "raw," instead of adding new bindings.
    // This is basically the `whereIntegerInRaw` that Laravel uses in some
    // places, but we're not guaranteed the primary keys are integers, so
    // we can't use that. We're sure that these values are safe because
    // they came directly from the database in the first place.
    $this->query->wheres[] = [
        'type' => 'InRaw',
        'column' => $key,
        // Get the key values from the records on the *current* page, without mutating them.
        'values' => $paginator->getCollection()->map->getRawOriginal($key)->toArray(),
        'boolean' => 'and'
    ];

    // simplePaginate increments by one to see if there's another page. We'll
    // decrement to counteract that since it's unnecessary in our situation.
    $page = $this->simplePaginate($paginator->perPage() - 1, $columns, null, 1);

    // Create a new paginator so that we can put our full records in,
    // not the ones that we modified to select only the primary key.
    return new LengthAwarePaginator(
        $page->items(), $paginator->total(), $paginator->perPage(), $paginator->currentPage(), $paginator->getOptions()
    );
});
aarondfrancis commented 2 years ago

~Hmm per page could be null. One sec.~

Updated!

joecampo commented 2 years ago

Hmm - getting that the id column is ambiguous. Here's the initial query when the macro function starts:

select * from `comments` inner join `posts` on `posts`.`id` = `comments`.`post_id` where `posts`.`account_id` = ?

I updated the paginator query and where query to use the table as it does in the current version to get beyond that.

But I'm still seeing the columns get overwritten:

"commentUuid": "bff87bc7-d57a-4b6a-810e-739f4ecb804f",
"postUuid": "bff87bc7-d57a-4b6a-810e-739f4ecb804f",
joecampo commented 2 years ago

So I verified when calling $this->simplePaginate() from the macro it isn't hitting the method on the HasManyThrough class, but rather just straight on the Eloquent Builder.

aarondfrancis commented 2 years ago

Ok, take 3 (or 4 or 5). I added a macro on relation itself to handle the selecting. I think this should do it.

Builder::macro('deferredPaginate', function ($perPage = null, $columns = ['*'], $pageName = 'page', $page = null) {
    /** @var Builder $this */
    $model = $this->newModelInstance();
    $key = $model->getKeyName();
    $table = $model->getTable();

    /** @var LengthAwarePaginator $paginator */
    $paginator = $this->clone()
        // We don't need them for this query, they'll remain
        // on the query that actually gets the records.
        ->setEagerLoads([])
        // Only select the primary key, we'll get the full
        // records in a second query below.
        ->paginate($perPage, [$table . '.' . $key], $pageName, $page);

    // Add our values in directly using "raw," instead of adding new bindings.
    // This is basically the `whereIntegerInRaw` that Laravel uses in some
    // places, but we're not guaranteed the primary keys are integers, so
    // we can't use that. We're sure that these values are safe because
    // they came directly from the database in the first place.
    $this->query->wheres[] = [
        'type' => 'InRaw',
        'column' => $table . '.' . $key,
        // Get the key values from the records on the *current* page, without mutating them.
        'values' => $paginator->getCollection()->map->getRawOriginal($key)->toArray(),
        'boolean' => 'and'
    ];

    // simplePaginate increments by one to see if there's another page. We'll
    // decrement to counteract that since it's unnecessary in our situation.
    $page = $this->simplePaginate($paginator->perPage() - 1, $columns, null, 1);

    // Create a new paginator so that we can put our full records in,
    // not the ones that we modified to select only the primary key.
    return new LengthAwarePaginator(
        $page->items(), $paginator->total(), $paginator->perPage(), $paginator->currentPage(), $paginator->getOptions()
    );
});

Relation::macro('deferredPaginate', function ($perPage = null, $columns = ['*'], $pageName = 'page', $page = null) {
    if ($this instanceof HasManyThrough || $this instanceof BelongsToMany) {
        $this->query->addSelect($this->shouldSelect($columns));
    }

    return tap($this->query->deferredPaginate($perPage, $columns, $pageName, $page), function ($paginator) {
        if ($this instanceof BelongsToMany) {
            $this->hydratePivotRelation($paginator->items());
        }
    });
});
joecampo commented 2 years ago

Adding the macro to Relation is genius 👍

This works perfectly 💥

aarondfrancis commented 2 years ago

We did it! Thanks for your help. Made it a clean commit here: https://github.com/hammerstonedev/deferred-pagination-macro/commit/801a8c4d912a75090c782d5a520915b09c2dfa64.

Let me know if you hit any more snags.