laravel / framework

The Laravel Framework.
https://laravel.com
MIT License

Eloquent lazy and chunk methods produce duplicate rows #52893

Open normatov07 opened 4 hours ago

normatov07 commented 4 hours ago

Laravel Version

10.10

PHP Version

8.2

Database Driver & Version

PostgreSQL 16

Description

I am trying to export data from PostgreSQL to Excel. The dataset is huge, so I need to use Laravel's lazy collections. However, I noticed some strange results: some real rows are lost, and other rows appear two or even three times instead.

Steps To Reproduce

I iterated over the query with lazy() and saw the lost and duplicated rows described above. I then switched to the chunk() method, but the result was the same. Finally, I fetched all the data with get() and saw my real data correctly: nothing was lost or duplicated.

Here is my code which I used:

// Using chunk():
$this->query->chunk(200, function (Collection $items) {
    foreach ($items as $item) {
        $item = $item->toArray();
        Log::info('row', $item);

        // .........
    }
});

// Using lazy():
foreach ($this->query->lazy(200) as $item) {
    $item = $item->toArray();
    Log::info('row', $item);

    // .......
}

Jacobs63 commented 2 hours ago

How are your results ordered?

What exactly is $this->query?

normatov07 commented 2 hours ago

$this->query is a model instance. As for the order: one duplicated row appears first at row 712, a second time at row 850, and a third time at row 1075. The positions are different for other duplicated rows.

Also, even though some rows are duplicated, the total number of rows did not change; some real rows were lost instead.

Jacobs63 commented 1 hour ago

My apologies, I meant to ask how your query is ordered.

Could you perhaps show us the query - possibly via $query->dd() before using lazy or chunk?

normatov07 commented 1 hour ago

Here is the query:

SELECT "table1".*,
       "table2"."contract_id" AS "contractId",
       "table3"."pin"         AS "pin",
       "table3"."first_name"  AS "firstName",
       "table3"."last_name"   AS "lastName",
       "table5"."id"          AS "organizationId",
       "table5"."name"        AS "organizationName",
       "table4"."id"          AS "table4Id",
       "table4"."name"        AS "table4Name"
FROM "table1"
INNER JOIN "table2" ON "table1"."col" = "table2"."col"
INNER JOIN "table3" ON "table1"."col" = "table3"."id"
INNER JOIN "table4" ON "table3"."col" = "table4"."id"
INNER JOIN "table5" ON "table4"."col" = "table5"."id"
WHERE "table1"."status" = 1
  AND "table5"."id" = '9c872fd6-631e-4a13-8270-9e57ca435727'
  AND "table1"."deleted_at" IS NULL
ORDER BY "table1"."created_at"

I only changed the table names and field names.

Jacobs63 commented 46 minutes ago

Interesting, so it is not just a model query; it is more than that.

Is created_at ambiguous or not? The inner joins might create duplicates as well.

Could, perhaps, ordering by ID help? Did you try lazyByIdDesc or chunkByIdDesc?
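
To illustrate why the ordering question matters, here is a minimal, framework-free sketch. It rests on assumptions the thread does not confirm: that chunk() and lazy() page through the results with LIMIT/OFFSET, that created_at is not unique, and that the database may return rows with equal created_at values in a different order on each query. Everything in it (makeRows, fetchPage, exportIds, the 12-row table, the alternating tie order) is hypothetical and only simulates that behaviour; it is not Laravel's implementation.

```php
<?php

// Hypothetical stand-in for the table: ids are unique, created_at is not.
function makeRows(): array
{
    $rows = [];
    for ($id = 1; $id <= 12; $id++) {
        // Every four consecutive rows share the same created_at value.
        $rows[] = ['id' => $id, 'created_at' => intdiv($id - 1, 4)];
    }
    return $rows;
}

// Simulates one "ORDER BY created_at LIMIT $size OFFSET $page * $size" query.
// $tieOrder models the database's freedom to return rows with equal sort keys
// in any order, which may differ from one query to the next.
function fetchPage(array $rows, int $page, int $size, bool $tiebreak, array $tieOrder): array
{
    usort($rows, function (array $a, array $b) use ($tiebreak, $tieOrder): int {
        $cmp = $a['created_at'] <=> $b['created_at'];
        if ($cmp !== 0) {
            return $cmp;
        }
        if ($tiebreak) {
            // Unique secondary sort key: the order is fully determined.
            return $a['id'] <=> $b['id'];
        }
        // No tie-breaker: the order of ties is whatever this query picked.
        return $tieOrder[$a['id']] <=> $tieOrder[$b['id']];
    });

    return array_slice($rows, $page * $size, $size);
}

// Pages through the whole table, three rows at a time, and collects the ids.
function exportIds(bool $tiebreak): array
{
    $rows = makeRows();
    $ids = [];
    for ($page = 0; $page < 4; $page++) {
        // Assumption for the demo: ties come back in ascending id order on
        // even pages and in descending id order on odd pages.
        $tieOrder = [];
        foreach ($rows as $row) {
            $tieOrder[$row['id']] = ($page % 2 === 0) ? $row['id'] : -$row['id'];
        }
        foreach (fetchPage($rows, $page, 3, $tiebreak, $tieOrder) as $row) {
            $ids[] = $row['id'];
        }
    }
    return $ids;
}

echo 'created_at only:     ', implode(',', exportIds(false)), PHP_EOL;
echo 'created_at, then id: ', implode(',', exportIds(true)), PHP_EOL;
```

Without the tie-breaker, the simulated export still returns 12 rows, but some ids appear more than once and others never appear, which matches the symptom reported above. With the unique tie-breaker every id appears exactly once. In Laravel, the equivalent would presumably be an additional orderBy on a unique column after created_at (assuming table1 has one), or the lazyById()/chunkById() methods; neither has been tried against the query in this thread.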