Closed gwharton closed 6 years ago
Spot the two occurrences of this bug this week. Managed to catch both in time before any damage done. I have the MYSQL DELETE fix setup as a cron job for my live and production stores, but this was a temporary store I installed for some destructive development which didn't have the MYSQL DELETE cron job set up.
Referenced here https://github.com/magento/magento2/issues/15190 and here https://github.com/magento/magento2/issues/4978
Any ETA on the update, is it gonna be in version 2.2.6 ?
That snowball effect is crashing a lot of servers worldwide ;)
Workaround to minimize the impact : Run the cron with "flock -w 1 cron.lock php bin/magento cron:run" this cron should be run every minute (it's very important because the heartbeat is scheduled every minute in magento like this the table will not be flooded exponentially
@Tofandel These people just don't care.
@Tofandel @ajankuv I believe that most of cron related issues were fixed in 2.2.5 (except default lifetime for records). Can you confirm that you try to reproduce it on this version?
No I'm on version 2.2.3, I was preparing an update to 2.2.5 but I didn't see the cron fix in any changelog @kandy I'll see after that update
It was done by community in https://github.com/magento/magento2/pull/12497 PR Also you can apply patches for 2.2.3 from this comment https://github.com/magento/magento2/pull/12497#issuecomment-401392609
I can confirm that in 2.2.5, certainly in my situation, that cron jobs that crash during execution are now marked as "error" in cron_schedule instead of sitting in the running state. This means that the cron cleanup code now properly deals with these jobs and successive jobs run as planned. Initial testing shows the race condition that allowed the cron_schedule table to grow infinitely in size is now fixed. Closing this issue as fixed. Phew!
Will there be a patch / fix for 2.1.x?
After updating from 2.2.3 to 2.2.5 I have the same Cron issue. The table is growing and I don't know how to proceed.
Yeah, since 2.2.5 ive had it twice so it looks like it isnt fixed. Look for the entry in cron-schedule table that is in the running state and delete that row. Magento will then clean up for you.
My bad, the table have little bit more that 10 000, but I get error
[Magento\Framework\DB\Adapter\DeadlockException]
SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction, query was: DELETE FROM cron_schedule
WHERE (status = 'missed') AND (job_code in ('currency_rates_update', 'backend_clean_cache', 'visitor_clean', 'catalog_index_refresh_price', 'catalog_product_flat_indexer_store_cleanup', 'catalog_product_outdated_price_values_cleanup', 'catalog_product_frontend_actions_flush', 'catalog_product_attribute_value_synchronize', 'catalogrule_apply_all', 'system_backup', 'security_clean_admin_expired_sessions', 'security_clean_password_reset_request_event_records', 'sales_clean_quotes', 'sales_clean_orders', 'aggregate_sales_report_order_data', 'aggregate_sales_report_invoiced_data', 'aggregate_sales_report_refunded_data', 'aggregate_sales_report_bestsellers_data', 'sales_grid_order_async_insert', 'sales_grid_order_invoice_async_insert', 'sales_grid_order_shipment_async_insert', 'sales_grid_order_creditmemo_async_insert', 'sales_sendorder
emails', 'sales_send_order_invoice_emails', 'sales_send_order_shipment_emails', 'sales_send_order_creditmemo_emails', 'paypal_fetch_settlement_reports', 'outdated_authentication_failures_cleanup', 'expired_tokens_cleanup', 'newsletter_send_all', 'analytics_subscribe', 'analytics_update', 'analytics_collect_data', 'magento_newrelicreporting_cron', 'catalog_product_alert', 'aggregate_sales_report_coupons_data', 'persistent_clear_expired', 'captcha_delete_old_attempts', 'captcha_delete_expired_images', 'aggregate_sales_report_shipment_data', 'sitemap_generate', 'get_amazon_capture_updates', 'get_amazon_authorization_updates', 'amazon_payments_process_queued_refunds', 'aggregate_sales_report_tax_data', 'weltpixel_backend_moduleinfoupdate', 'yosto_arp_rule_apply_all')) AND (created_at < '2018-07-22 12:20:02')
What this means? I didn't have such issues before updating to 2.2.5
So, I had a lot of pending jobs, deleted manually all of them and everything seems to be ok for day or two, but I started to receive the error like above again. When I have checked, I had a lot of pending entries again at the end of the table, manually deleted them, for now it's ok
There was another blocking query running at the same time it was doing a delete and it locked the table. Most likely it was the query that it uses to update job statuses. That query always runs long and if there's a several running it deadlocks.
Also is there any good reason why all the job codes need to be called out in that delete?
I see the same behaviour as @AgentGod. Multiple sites start creating deadlock errors after updating to 2.2.5. When I clear the cron_schedule it seems fine for a few days, then it starts again.
I believe I have seen this issue happen with an exception being thrown by dotmailer/dotmailer-magento2-extension. We have used composer patches to prevent the known exception codepath from ever being triggered, but this seems like an issue that will likely cause issues on prod servers (as it did for us, unfortunately) so just trying to bump the priority here.
Where I can find a patch ?
@mpropl You can also try the MageMojo extension. I have used on a couple of installations since recently. It seems to fix these issues as well: https://github.com/magemojo/m2-ce-cron
MageMojo work for me for a while, but now it hangs and does not perform cron tasks :(
MageMojo use file locks, so it will not work on the multinode installation
@kandy you shouldn't be running the cron on multiple nodes. Cron should be scheduled on one node and there is not an issue with file locks.
@mpropl we're going to need more information about your issue if you would like us to help: https://github.com/magemojo/m2-ce-cron/issues/51
@ericvhileman I does not think that use cron service on only one node is high-available solution. Also, this extension requires to run on unix-like OS and have to access to /proc/ filesystem.
@kandy dude... the cron docs LITERALLY say not to do this: "In a multi-node system, crontab can run on only one node. This applies to you only if you set up more than one webnode for reasons related to performance or scalability.". I can also think of many reasons why it's a terrible idea.
Source: https://devdocs.magento.com/guides/v2.2/config-guide/cli/config-cli-subcommands-cron.html
+1 to what @ericvhileman said about running crons only in one node.
@magento-engcom-team Please reopen. This is not resolved.
@Ctucker9233, Can I ask you to create new issues with concrete steps to reproduce? This ticket has many different issues described in the comments, so it hard to understand what concrete issues do you ask to resolve.
@magento-engcom-team Please reopen. This is not resolved. reproduced in 2.2.6
SELECT count() FROM cron_schedule
WHERE status
='pending';
+----------+
| count() |
+----------+
| 11888 |
+----------+
I can confirm this. Also 11000+ records in the table. One day after I truncated the table. Also 2.2.6
@acurvers @johnny-longneck Can you verify the settings for cron history lifetime is set to 60min, not to 1 week? It configured in SYSTEM CONFIGURATION > ADVANCED > System > Cron (Scheduled Tasks)
I had overwritten previous history lifetime to 740 as recommended somewhere. I switched back to "use global settings" which is now 60. This might seem to have resolved the table overload.
Cron Table now constant at ca. 1000 entries.
@kandy I did but it was closed as a duplicate
@acurvers @johnny-longneck Can you verify the settings for cron history lifetime is set to 60min, not to 1 week? It configured in SYSTEM CONFIGURATION > ADVANCED > System > Cron (Scheduled Tasks)
@kandy Which setting is this? Failure history lifetime?
I have been stuck on this same issue for several weeks. I think I finally got it figured out with the help of this post. I made a quick blog post about it so I can refer to it in the future. https://toweringmedia.com/blog/magento-2-cron-job-pending-jobs-never-cleared-solved/
Same issue in 2.3.3. After 24 hours on a clean install, the cron table is 4080 lines with 410 jobs pending
Why is this closed when it's an on-going issue?
@vincentteyssier Sounds normal behaviour to me. Default behaviour is every 15 minutes, magento will schedule in the next 20 minutes of cron jobs into the cron_schedule table. These will be in the pending state. Take a look at the times of your pending jobs. Are they all scheduled to run in the next 15 minutes or so. Once run they will change to success/failed status and will eventually be cleared. All of the lifetime timings of cron jobs in the cron_schedule table can be configured in admin config. With a fairly standard setup, you should see the cron_schedule table hover between 1000-4000 entries with a ratio of pending/successful jobs of between 25 and 75% depending on where in the 20 minute cycle you are.
@markojak If you believe this is still happening in 2.3-develop, then please create a new issue.
@gwharton you're right, after 24 more hours it is stable between 4k-4.1k and approx 300 pending
Is this issue actually fixed. I've just had two separate development servers use up over 1 million CPU seconds in a few hours as a result of the magento cron, on a pretty much vanilla installation of Magento 2.3.3 with 30 products and zero activity apart from the cron running
One server had 10000 rows in the cron schedule table and the other had just over 1000 (after I had changed the save history lifetime setting). I'm not sure I'm seeing the same issue that other people see with hundreds of thousands of jobs stacked up in the cron_schedule table so I'm not sure if I need to raise a separate ticket.
In the error logs the high cpu usage seemed to coincide with the database server rebooting and a high memory usage when creating a temporary table.
Looking through this thread it seems to me that the cron module does a huge amount of unnecessary work and it also seems to lack necessary indexes to make its queries fast. For example the clean up of missed jobs looks at the created_at column but this is not indexed so the query has to table scan. You can see how the cron grinds to a halt once the table gets large. Doesn't look like this is my problem though.
On further investigation it appears that my problem of excessive cpu usage is due to a bug in magento queue consumer logic in 2.3 that means they never complete. see https://github.com/magento/magento2/issues/17951
Probably not fixed. M2 devs loves to close issues because they are "task oriented" and have to fill their daily quota of finished tasks (without actually fixing anything).
However this problem is still present and we are having increase CPU usage over time with cronjobs and mysql slow queries.
I've not seen this specific bug in a long time on my production install. Currently on 2.3.3.
Please raise a new issue with full steps to reproduce if problem still persists for you.
Still a problem at least in 2.3.0
Still a problem in 2.3.4:
MariaDB [db]> SELECT count(*) FROM cron_schedule WHERE status='pending';
+----------+
| count(*) |
+----------+
| 261 |
+----------+
1 row in set (0.007 sec)
@zzvara see above comment https://github.com/magento/magento2/issues/11002#issuecomment-556889494 261 is within normal range - I'd suggest monitoring this over time to see if this increases.
I'm not sure. Magento is already using 12/24 cores constantly. I'll post more data for you to tackle down the problem. We are using Kubernetes and HelmReleases, it should be fairly easy to reproduce.
Not much is running on that Kubelet node, but Magento is using 12/24 cores.
The site itself has become incredibly slow as well. These CRON jobs are running all the time. They are all on default values of 2.3.4.
We are using the following deployment: https://github.com/bitnami/charts/tree/master/bitnami/magento
The following Dockerfile that contains all plugins:
FROM bitnami/magento:2.3.4-debian-10-r12
USER root
RUN apt update
RUN apt install wget -y
RUN apt install rsync -y
RUN apt install vim -y
ENV COMPOSER_MEMORY_LIMIT -1
# This is the home directory of Magento.
WORKDIR /opt/bitnami/magento/htdocs
ADD auth.json .
RUN chown bitnami:daemon auth.json || exit 1
USER bitnami:daemon
RUN composer require mageplaza/module-core
RUN composer require mageplaza/module-smtp
RUN composer require mageplaza/module-gdpr
RUN composer require mageplaza/magento-2-social-login
RUN composer require dhl/module-rates-express
RUN composer require mageplaza/magento-2-hungarian-language-pack:dev-master
COPY extensions/geo-ip-redirect-1.3.0-CE/app/code/Amasty /opt/bitnami/magento/htdocs/app/code/Amasty
USER root
RUN chown -R bitnami:daemon app
RUN chown -R bitnami:daemon pub
RUN find /opt/bitnami/magento/htdocs -type d -print0 | xargs -0 chmod 775 && find /opt/bitnami/magento/htdocs -type f -print0 | xargs -0 chmod 664 && chown -R bitnami:daemon /opt/bitnami/magento/htdocs
Dockerfile is frequently rebuilt and the container therefore restarted.
Here is the entry point:
echo "Adding Magento core cron entries."
ln -sf /opt/bitnami/magento/conf/cron /etc/cron.d/magento
/usr/sbin/cron
echo "Running upgrade on Magento."
sudo -u bitnami php -d memory_limit=4096M /opt/bitnami/magento/htdocs/bin/magento setup:upgrade -vvv || true
echo "Running compile on Magento."
sudo -u bitnami php -d memory_limit=4096M /opt/bitnami/magento/htdocs/bin/magento setup:di:compile -vvv || true
echo "Running deployment on Magento sources."
sudo -u bitnami php -d memory_limit=4096M /opt/bitnami/magento/htdocs/bin/magento setup:static-content:deploy -f -vvv || true
echo "Install and start Cron."
sudo -u bitnami php bin/magento cron:install || true
echo "Maybe adding custom cron entries."
if [ -f "/opt/bitnami/magento/htdocs/custom-cron-entries" ]; then
echo "Found custom cron entries, so adding it to 'crontab -e'."
echo "Custom cron entries are: "
cat /opt/bitnami/magento/htdocs/custom-cron-entries
runuser -l bitnami -c '(crontab -l 2>/dev/null; cat /opt/bitnami/magento/htdocs/custom-cron-entries) | crontab -'
fi
echo "Starting Apache..."
exec httpd -f /opt/bitnami/apache/conf/httpd.conf -D FOREGROUND
root@magento-788c7d54f-kxkx4:/opt/bitnami/magento/htdocs# cat /etc/cron.d/magento
* * * * * bitnami /opt/bitnami/php/bin/php /opt/bitnami/magento/htdocs/bin/magento cron:run
* * * * * bitnami /opt/bitnami/php/bin/php /opt/bitnami/magento/htdocs/update/cron.php
* * * * * bitnami /opt/bitnami/php/bin/php /opt/bitnami/magento/htdocs/bin/magento setup:cron:run
root@magento-788c7d54f-kxkx4:/opt/bitnami/magento/htdocs# cat /opt/bitnami/magento/htdocs/custom-cron-entries
0 1 * * * redacted_copy_backups_to_remote_server.sh
@marthasimons7777 Can I suggest you upgrade to supported version of Commerce where this issue is resolved?
Preconditions
Steps to reproduce
Expected result
Actual result
The cronjob steadily increases in the time taken to complete, at the moment it is taking around 30 seconds to complete, during which time, mysql and php are taking up heavy CPU usage.
A MYSQL query log shows magento churning through all the pending requests, but they are never marked as success. Hence the ever increasing list of jobs to run.
Snippet from the Mysql Query log below