Closed JamesFX2 closed 2 years ago
Hi @JamesFX2. Thank you for your report. To speed up processing of this issue, make sure that you provided the following information:
Make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy a vanilla Magento instance on our environment, add a comment to the issue:
@magento give me 2.4-develop instance (for the upcoming 2.4.x release)
For more details, review the Magento Contributor Assistant documentation.
Add a comment to assign the issue: @magento I am working on this
To learn more about issue processing workflow, refer to the Code Contributions.
Join Magento Community Engineering Slack and ask your questions in #github channel.
:warning: According to the Magento Contribution requirements, all issues must go through the Community Contributions Triage process. Community Contributions Triage is a public meeting.
:clock10: You can find the schedule on the Magento Community Calendar page.
:telephone_receiver: The triage of issues happens in the queue order. If you want to speed up the delivery of your contribution, join the Community Contributions Triage session to discuss the appropriate ticket.
:pencil2: Feel free to post questions/proposals/feedback related to the Community Contributions Triage process to the corresponding Slack Channel
Hi @engcom-Lima. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:
[ ] 1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).
If the issue has a valid description, the label Issue: Format is valid will be added to the issue automatically. Please, edit issue description if needed, until label Issue: Format is valid appears.
[ ] 2. Verify that issue has a meaningful description and provides enough information to reproduce the issue. If the report is valid, add Issue: Clear Description label to the issue by yourself.
[ ] 3. Add Component: XXXXX label(s) to the ticket, indicating the components it may be related to.
[ ] 4. Verify that the issue is reproducible on 2.4-develop branch
- Add the comment @magento give me 2.4-develop instance to deploy test instance on Magento infrastructure.
- If the issue is reproducible on 2.4-develop branch, please, add the label Reproduced on 2.4.x.
- If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and stop verification process here!
[ ] 5. Add label Issue: Confirmed once verification is complete.
[ ] 6. Make sure that automatic system confirms that report has been added to the backlog.
Hi @JamesFX2 ,
Thanks for your contribution and collaboration.
I tried to reproduce the issue, but it is not reproducible for me.
Creating 6.7 million orders is not possible locally, so I tested with 150,000 (1.5 lakh) orders generated using performance fixtures. I ran the cron using the command bin/magento cron:run --group aggregate_sales_report_order_data with timezone = "Brasilia Standard Time (America/Recife)". The cron ran successfully without any error.
Please provide more information so that we can reproduce the issue, and let us know if anything is wrong in the above steps.
Thanks
Hi @JamesFX2 ,
We have noticed that this issue has not been updated for 14 days. We therefore assume that the issue is fixed and are closing it. Please raise a fresh ticket or reopen this one if you need more assistance.
Thanks
@engcom-Lima please just benchmark it
The full query is
INSERT INTO `sales_order_aggregated_created` (
    `period`, `store_id`, `order_status`, `orders_count`,
    `total_qty_ordered`, `total_qty_invoiced`,
    `total_income_amount`, `total_revenue_amount`, `total_profit_amount`,
    `total_invoiced_amount`, `total_canceled_amount`, `total_paid_amount`, `total_refunded_amount`,
    `total_tax_amount`, `total_tax_amount_actual`,
    `total_shipping_amount`, `total_shipping_amount_actual`,
    `total_discount_amount`, `total_discount_amount_actual`
)
SELECT
    DATE(DATE_ADD(`o`.`created_at`, INTERVAL 3600 SECOND)) AS `period`,
    `o`.`store_id`,
    `o`.`status` AS `order_status`,
    COUNT(o.entity_id) AS `orders_count`,
    SUM(oi.total_qty_ordered) AS `total_qty_ordered`,
    SUM(oi.total_qty_invoiced) AS `total_qty_invoiced`,
    SUM((IFNULL(o.base_grand_total, 0) - IFNULL(o.base_total_canceled, 0)) * IFNULL(o.base_to_global_rate, 0)) AS `total_income_amount`,
    SUM((IFNULL(o.base_total_invoiced, 0) - IFNULL(o.base_tax_invoiced, 0) - IFNULL(o.base_shipping_invoiced, 0) - (IFNULL(o.base_total_refunded, 0) - IFNULL(o.base_tax_refunded, 0) - IFNULL(o.base_shipping_refunded, 0))) * IFNULL(o.base_to_global_rate, 0)) AS `total_revenue_amount`,
    SUM((IFNULL(o.base_total_paid, 0) - IFNULL(o.base_total_refunded, 0) - IFNULL(o.base_tax_invoiced, 0) - IFNULL(o.base_shipping_invoiced, 0) - IFNULL(o.base_total_invoiced_cost, 0)) * IFNULL(o.base_to_global_rate, 0)) AS `total_profit_amount`,
    SUM(IFNULL(o.base_total_invoiced, 0) * IFNULL(o.base_to_global_rate, 0)) AS `total_invoiced_amount`,
    SUM(IFNULL(o.base_total_canceled, 0) * IFNULL(o.base_to_global_rate, 0)) AS `total_canceled_amount`,
    SUM(IFNULL(o.base_total_paid, 0) * IFNULL(o.base_to_global_rate, 0)) AS `total_paid_amount`,
    SUM(IFNULL(o.base_total_refunded, 0) * IFNULL(o.base_to_global_rate, 0)) AS `total_refunded_amount`,
    SUM((IFNULL(o.base_tax_amount, 0) - IFNULL(o.base_tax_canceled, 0)) * IFNULL(o.base_to_global_rate, 0)) AS `total_tax_amount`,
    SUM((IFNULL(o.base_tax_invoiced, 0) - IFNULL(o.base_tax_refunded, 0)) * IFNULL(o.base_to_global_rate, 0)) AS `total_tax_amount_actual`,
    SUM((IFNULL(o.base_shipping_amount, 0) - IFNULL(o.base_shipping_canceled, 0)) * IFNULL(o.base_to_global_rate, 0)) AS `total_shipping_amount`,
    SUM((IFNULL(o.base_shipping_invoiced, 0) - IFNULL(o.base_shipping_refunded, 0)) * IFNULL(o.base_to_global_rate, 0)) AS `total_shipping_amount_actual`,
    SUM((ABS(IFNULL(o.base_discount_amount, 0)) - IFNULL(o.base_discount_canceled, 0)) * IFNULL(o.base_to_global_rate, 0)) AS `total_discount_amount`,
    SUM((IFNULL(o.base_discount_invoiced, 0) - IFNULL(o.base_discount_refunded, 0)) * IFNULL(o.base_to_global_rate, 0)) AS `total_discount_amount_actual`
FROM `sales_order` AS `o`
INNER JOIN (
    SELECT `sales_order_item`.`order_id`,
        SUM(qty_ordered - IFNULL(qty_canceled, 0)) AS `total_qty_ordered`,
        SUM(qty_invoiced) AS `total_qty_invoiced`
    FROM `sales_order_item`
    WHERE (parent_item_id IS NULL)
    GROUP BY `order_id`
) AS `oi` ON oi.order_id = o.entity_id
WHERE (o.state NOT IN ('pending_payment', 'new'))
GROUP BY DATE(DATE_ADD(`o`.`created_at`, INTERVAL 3600 SECOND)), `o`.`store_id`, `o`.`status`
HAVING (period LIKE '2022-10-03' OR period LIKE '2022-10-04')
ON DUPLICATE KEY UPDATE
    `period` = VALUES(`period`),
    `store_id` = VALUES(`store_id`),
    `order_status` = VALUES(`order_status`),
    `orders_count` = VALUES(`orders_count`),
    `total_qty_ordered` = VALUES(`total_qty_ordered`),
    `total_qty_invoiced` = VALUES(`total_qty_invoiced`),
    `total_income_amount` = VALUES(`total_income_amount`),
    `total_revenue_amount` = VALUES(`total_revenue_amount`),
    `total_profit_amount` = VALUES(`total_profit_amount`),
    `total_invoiced_amount` = VALUES(`total_invoiced_amount`),
    `total_canceled_amount` = VALUES(`total_canceled_amount`),
    `total_paid_amount` = VALUES(`total_paid_amount`),
    `total_refunded_amount` = VALUES(`total_refunded_amount`),
    `total_tax_amount` = VALUES(`total_tax_amount`),
    `total_tax_amount_actual` = VALUES(`total_tax_amount_actual`),
    `total_shipping_amount` = VALUES(`total_shipping_amount`),
    `total_shipping_amount_actual` = VALUES(`total_shipping_amount_actual`),
    `total_discount_amount` = VALUES(`total_discount_amount`),
    `total_discount_amount_actual` = VALUES(`total_discount_amount_actual`)
The issue is HAVING (period LIKE '2022-10-03' OR period LIKE '2022-10-04')
HAVING is evaluated on the grouped result, so no filter is applied until after every order has been joined and aggregated. In effect, the query runs against an unfiltered table.
Where is period defined?
SELECT DATE(DATE_ADD(`o`.`created_at`, INTERVAL 3600 SECOND)) AS `period`
If we replace
HAVING (period LIKE '2022-10-03' OR period LIKE '2022-10-04')
with an additional condition in the existing WHERE clause
AND `o`.created_at >= "2022-10-02 23:00:00" AND `o`.created_at < "2022-10-04 23:00:00"
then we reduce the number of records going into the join and aggregation.
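The created_at bounds above follow from shifting the local period boundaries back by the store offset. A minimal Python sketch of that arithmetic, assuming a fixed offset in seconds (3600 here, as in the query) and a contiguous range of periods; the function name created_at_window is hypothetical and not part of Magento:

```python
from datetime import date, datetime, time, timedelta


def created_at_window(first_period: date, last_period: date, offset_seconds: int):
    """Return (start, end) so that DATE(created_at + offset) lies in
    [first_period, last_period] exactly when start <= created_at < end."""
    offset = timedelta(seconds=offset_seconds)
    # Local midnight at the start of the first period, shifted back to stored time.
    start = datetime.combine(first_period, time.min) - offset
    # Local midnight after the last period (exclusive bound), shifted back.
    end = datetime.combine(last_period + timedelta(days=1), time.min) - offset
    return start, end


start, end = created_at_window(date(2022, 10, 3), date(2022, 10, 4), 3600)
print(start, end)  # 2022-10-02 23:00:00 2022-10-04 23:00:00
```

The lower bound is inclusive and the upper bound exclusive, so an order created at exactly 23:00:00 lands in the following local day. If the periods to refresh are not contiguous, the window would cover the whole span and the HAVING condition could stay in place as a second, cheap filter.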
Please compare the performance of
select DATE(DATE_ADD(`so`.`created_at`, INTERVAL 3600 SECOND)) AS `period`, so.* from sales_order so HAVING (period LIKE '2022-10-03' OR period LIKE '2022-10-04');
to
select DATE(DATE_ADD(`so`.`created_at`, INTERVAL 3600 SECOND)) AS `period`, so.* from sales_order so WHERE `so`.created_at >= "2022-10-02 23:00:00" AND `so`.created_at < "2022-10-04 23:00:00";
Both return the same results.
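The equivalence of the two filters can be checked on toy data. The sketch below uses Python's built-in sqlite3 module rather than MySQL, so it says nothing about performance; it only shows that filtering on period after grouping and filtering on a half-open created_at window before grouping select the same rows. The table has just two columns and the sample timestamps are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_order (entity_id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany(
    "INSERT INTO sales_order VALUES (?, ?)",
    [
        (1, "2022-10-02 22:59:59"),  # local 2022-10-02, outside both periods
        (2, "2022-10-02 23:00:00"),  # local 2022-10-03 00:00:00, inside
        (3, "2022-10-03 12:00:00"),  # local 2022-10-03, inside
        (4, "2022-10-04 22:59:59"),  # local 2022-10-04 23:59:59, inside
        (5, "2022-10-04 23:00:00"),  # local 2022-10-05, outside
    ],
)

# Filter applied after grouping, as in the current query.
having_style = conn.execute(
    "SELECT DATE(created_at, '+3600 seconds') AS period, COUNT(*) "
    "FROM sales_order GROUP BY period "
    "HAVING period IN ('2022-10-03', '2022-10-04') ORDER BY period"
).fetchall()

# Filter applied before grouping, on the raw created_at column.
where_style = conn.execute(
    "SELECT DATE(created_at, '+3600 seconds') AS period, COUNT(*) "
    "FROM sales_order "
    "WHERE created_at >= '2022-10-02 23:00:00' AND created_at < '2022-10-04 23:00:00' "
    "GROUP BY period ORDER BY period"
).fetchall()

print(having_style == where_style)  # True
print(where_style)  # [('2022-10-03', 2), ('2022-10-04', 1)]
```

On MySQL, the second form also gives the optimizer a plain range condition on created_at that it can evaluate before the join; whether an index is used depends on the schema.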
Is it possible to escalate the issue, please? Even with your 150,000 orders, you should be able to see the difference.
Fixing this will improve performance significantly and reduce memory usage.
@kandy: could somebody from the performance team check this?
Preconditions and environment
Steps to reproduce
1. Allow cron aggregate_sales_report_order_data to run.
Expected result
Table sales_order_aggregated_created to be updated
Actual result
Additional information
I ran an explain on the query above.
Our sales_order_item is 7411MB in size, sales_order is 4456MB in size.
It looks like the query above is building a derived table to add a field called period which uses the store's timezone offset.
This query seems like it would be more efficient if it was pre-filtered on a created_at window derived from the store's timezone offset instead of only filtering the derived table. That way, we wouldn't need to build a derived table with millions of rows.
https://github.com/magento/magento2/blob/2.4-develop/app/code/Magento/Sales/Model/ResourceModel/Report/Order/Createdat.php#L222-L224
This is less likely to be an M2 2.4.4 issue than a "we migrated servers in 2.4.4" issue, and we could undoubtedly work around it by increasing tmp_table_size on our MySQL server. Still, this cron would be significantly faster if the query were reviewed.
Release note
No response
Triage and priority