catalyst / moodle-report_coursesize

upgraded coursesize report
14 stars 19 forks source link

repost_async.php crashing mariadb 10.5.20 every time it runs. #54

Closed epsso closed 1 year ago

epsso commented 1 year ago

report_async.php is crashing mariadb 10.5.20 on my server every time it runs. I am running Moodle 3.11.14+ and coursesize 2022121900 and recently updated from mariadb 10.4 to 10.5. Mariadb started crashing once per min. I narrowed it down to the moodle cron and then to report_coursesize\task\report_async Once I moved report_async.php to a different directory outside of moodle, mariadb stopped crashing every time the moodle cron ran.

Here is the error that the cron job is reporting

Execute scheduled task: Course size (report_coursesize\task\report_async) ... started 13:36:01. Current memory use 3.8MB. Generating report_coursesize cache... ... used 6 dbqueries ... used 4.6684908866882 seconds Scheduled task failed: Course size (report_coursesize\task\report_async),Error writing to database !!! Error reading from database !!!

guillefd commented 1 year ago

Having the same error on many moodle installation with version 3.11, and 4.1 Mariadb 10.6.13 This is the mysql error log

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f05c4010750): 
              UPDATE mdl_report_coursesize rc
                      SET backupsize = (SELECT bf.filesize FROM (SELECT id AS course, SUM(filesize) AS filesize
              FROM (SELECT c.id, f.filesize
                      FROM mdl_course c
                      JOIN mdl_context cx ON cx.contextlevel = 50 AND cx.instanceid = c.id
                      JOIN mdl_files f ON f.contextid = cx.id AND f.component = 'backup') x
              GROUP BY id) bf WHERE bf.course = rc.course)
jgilevich commented 1 year ago

Same problem here, Moodle versions 3.9, 3.11 and 4.1 MariaDB 10.4.29. /var/log/messages

Jun  6 10:28:53 moodle mysqld: Server version: 10.4.29-MariaDB source revision: 2594da7a33580bf03590502a011679c878487d0c
Jun  6 10:28:53 moodle mysqld: key_buffer_size=134217728
Jun  6 10:28:53 moodle mysqld: read_buffer_size=131072
Jun  6 10:28:53 moodle mysqld: max_used_connections=9
Jun  6 10:28:53 moodle mysqld: max_threads=153
Jun  6 10:28:53 moodle mysqld: thread_count=14
Jun  6 10:28:53 moodle mysqld: It is possible that mysqld could use up to
Jun  6 10:28:53 moodle mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467778 K  bytes of memory
Jun  6 10:28:53 moodle mysqld: Hope that's ok; if not, decrease some variables in the equation.
Jun  6 10:28:53 moodle mysqld: Thread pointer: 0x7f65400009a8
Jun  6 10:28:53 moodle mysqld: Attempting backtrace. You can use the following information to find out
Jun  6 10:28:53 moodle mysqld: where mysqld died. If you see no messages after this, something went
Jun  6 10:28:53 moodle mysqld: terribly wrong...
Jun  6 10:28:53 moodle mysqld: stack_bottom = 0x7f65b0131cf0 thread_stack 0x49000
Jun  6 10:28:53 moodle mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x565338a1e56e]
Jun  6 10:28:53 moodle mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x317)[0x565338470857]
Jun  6 10:28:53 moodle mysqld: sigaction.c:0(__restore_rt)[0x7f65b7e42630]
Jun  6 10:28:53 moodle mysqld: /usr/sbin/mysqld(_ZN13st_join_table21choose_best_splittingEjyPy+0x951)[0x5653383e5991]
Jun  6 10:28:53 moodle mysqld: /usr/sbin/mysqld(_Z16best_access_pathP4JOINP13st_join_tableyPK8POSITIONjbdPS3_S6_+0x12f)[0x5653382ca78f]
Jun  6 10:28:53 moodle mysqld: /usr/sbin/mysqld(+0x733ebe)[0x5653382cdebe]
Jun  6 10:28:53 moodle mysqld: /usr/sbin/mysqld(_Z11choose_planP4JOINy+0x254)[0x5653382ceb24]
Jun  6 10:28:53 moodle mysqld: /usr/sbin/mysqld(_ZN4JOIN14optimize_innerEv+0x301e)[0x5653382d636e]
Jun  6 10:28:53 moodle mysqld: /usr/sbin/mysqld(_ZN4JOIN8optimizeEv+0x9d)[0x5653382d6c6d]
Jun  6 10:28:53 moodle mysqld: /usr/sbin/mysqld(_ZN13st_select_lex31optimize_unflattened_subqueriesEb+0xd0)[0x565338264a70]
Jun  6 10:28:53 moodle mysqld: /usr/sbin/mysqld(_Z12mysql_updateP3THDP10TABLE_LISTR4ListI4ItemES6_PS4_jP8st_orderybPySA_+0x7b1)[0x5653383282f1]
Jun  6 10:28:53 moodle mysqld: /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x3c00)[0x56533827a800]
Jun  6 10:28:53 moodle mysqld: /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x22b)[0x56533827f50b]
Jun  6 10:28:53 moodle mysqld: /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjbb+0x1956)[0x565338281e26]
Jun  6 10:28:53 moodle mysqld: /usr/sbin/mysqld(_Z10do_commandP3THD+0x112)[0x565338283332]
Jun  6 10:28:53 moodle mysqld: /usr/sbin/mysqld(_Z24do_handle_one_connectionP7CONNECT+0x202)[0x565338367a92]
Jun  6 10:28:53 moodle mysqld: /usr/sbin/mysqld(handle_one_connection+0x3d)[0x565338367b7d]
Jun  6 10:28:53 moodle mysqld: pthread_create.c:0(start_thread)[0x7f65b7e3aea5]
Jun  6 10:28:53 moodle mysqld: /lib64/libc.so.6(clone+0x6d)[0x7f65b7355b0d]
Jun  6 10:28:53 moodle mysqld: Trying to get some variables.
Jun  6 10:28:53 moodle mysqld: Some pointers may be invalid and cause the dump to abort.
Jun  6 10:28:53 moodle mysqld: Query (0x7f654000fe80): UPDATE mdl_report_coursesize rc
Jun  6 10:28:53 moodle kernel: traps: mysqld[83527] general protection ip:5653383e5991 sp:7f65b012dd20 error:0 in mysqld[565337b9a000+14f5000]
Jun  6 10:28:53 moodle mysqld: SET backupsize = (SELECT bf.filesize FROM (SELECT id AS course, SUM(filesize) AS filesize
Jun  6 10:28:53 moodle mysqld: FROM (SELECT c.id, f.filesize
Jun  6 10:28:53 moodle mysqld: FROM mdl_course c
Jun  6 10:28:53 moodle mysqld: JOIN mdl_context cx ON cx.contextlevel = 50 AND cx.instanceid = c.id
Jun  6 10:28:53 moodle mysqld: JOIN mdl_files f ON f.contextid = cx.id AND f.component = 'backup') x
Jun  6 10:28:53 moodle mysqld: GROUP BY id) bf WHERE bf.course = rc.course)
Jun  6 10:28:53 moodle mysqld: Connection ID (thread ID): 501
Jun  6 10:28:53 moodle mysqld: Status: NOT_KILLED
Jun  6 10:28:53 moodle mysqld: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engi    ne_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_mat    ch_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_increment    al=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for    _derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on
Jun  6 10:28:53 moodle mysqld: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
Jun  6 10:28:53 moodle mysqld: information that should help you find out what is causing the crash.
Jun  6 10:28:53 moodle mysqld: Writing a core file...
Jun  6 10:28:53 moodle mysqld: Working directory at /var/lib/mysql
Jun  6 10:28:53 moodle mysqld: Resource Limits:
Jun  6 10:28:53 moodle mysqld: Limit                     Soft Limit           Hard Limit           Units
Jun  6 10:28:53 moodle mysqld: Max cpu time              unlimited            unlimited            seconds
Jun  6 10:28:53 moodle mysqld: Max file size             unlimited            unlimited            bytes
Jun  6 10:28:53 moodle mysqld: Max data size             unlimited            unlimited            bytes
Jun  6 10:28:53 moodle mysqld: Max stack size            8388608              unlimited            bytes
Jun  6 10:28:53 moodle mysqld: Max core file size        0                    unlimited            bytes
Jun  6 10:28:53 moodle mysqld: Max resident set          unlimited            unlimited            bytes
Jun  6 10:28:53 moodle mysqld: Max processes             63305                63305                processes
Jun  6 10:28:53 moodle mysqld: Max open files            32768                32768                files
Jun  6 10:28:53 moodle mysqld: Max locked memory         65536                65536                bytes
Jun  6 10:28:53 moodle mysqld: Max address space         unlimited            unlimited            bytes
Jun  6 10:28:53 moodle mysqld: Max file locks            unlimited            unlimited            locks
Jun  6 10:28:53 moodle mysqld: Max pending signals       63305                63305                signals
Jun  6 10:28:53 moodle mysqld: Max msgqueue size         819200               819200               bytes
Jun  6 10:28:53 moodle mysqld: Max nice priority         0                    0
Jun  6 10:28:53 moodle mysqld: Max realtime priority     0                    0
Jun  6 10:28:53 moodle mysqld: Max realtime timeout      unlimited            unlimited            us
Jun  6 10:28:53 moodle mysqld: Core pattern: core
Jun  6 10:28:53 moodle mysqld: Kernel version: Linux version 3.10.0-1160.90.1.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC    ) ) #1 SMP Thu May 4 15:21:22 UTC 2023
Jun  6 10:28:53 moodle systemd: mariadb.service: main process exited, code=killed, status=11/SEGV
Jun  6 10:28:53 moodle systemd: Unit mariadb.service entered failed state.
Jun  6 10:28:53 moodle systemd: mariadb.service failed.
jgilevich commented 1 year ago

Seem that updating MariaDB to version 10.4.30 solved the problem, probably a MariaDB bug... MariaDB version 10.4.30 release notes:

InnoDB

  • Server crashes in st_join_table::choose_best_splitting (MDEV-31403)
  • Crash with condition pushable into derived and containing outer reference (MDEV-31240)
  • Revert "MDEV-30473 : Do not allow GET_LOCK() / RELEASE_LOCK() in cluster" Optimizer
  • Crash with condition pushable into derived and containing outer reference (MDEV-31403 MDEV-31240)
  • Crash with EXPLAIN EXTENDED for multi-table update of system table (MDEV-31224)
danmarsden commented 1 year ago

Thanks for the update! I'd suspected a bug in mariadb due to the description but hadn't been able to look into it yet. Closing as not a bug with the plugin but let us know if we need to take another look.

ndunand commented 1 year ago

Hello @danmarsden ,

Had this issue weeks ago when we upgraded MariaDB from 10.5.x to 10.6.13, didn't identify this query as the 'culprit' at the moment. Only solution we could find was to downgrade MariaDB to 10.6.12 and this miraculously resolved the issue.

However:

Now tried MariaDB 11.0 on a test server and this issue appeared again, server crashes every minute because \report_coursesize\task\report_async is set to run "ASAP".

Running the query manually in the client also crashes the server :

UPDATE mdl_report_coursesize rc SET backupsize = (SELECT bf.filesize FROM (SELECT id AS course, SUM(filesize) AS filesize FROM (SELECT c.id, f.filesize FROM mdl_course c JOIN mdl_context cx ON cx.contextlevel = 50 AND cx.instanceid = c.id JOIN mdl_files f ON f.contextid = cx.id AND f.component = 'backup') x GROUP BY id) bf WHERE bf.course = rc.course)

Running Moodle 3.11.15 (Build: 20230612) and report_coursesize version 2022090900.

danmarsden commented 1 year ago

@ndunand there are a number of recent bug reports in Mariadb's jira tracker related to certain query types crashing the server - some even reference Moodle - you should check to see if the version of 11.0 that you were using had a fix for this issue - I note that mariadb pushed out an "out-of-band" release of 10.x versions to fix the crashing issues so you will need to make sure you are using a release that includes those fixes - I do note there are still some open issues at the moment but as I do not run mysql/mariadb normally myself, checking and tracing this isn't something I can quickly do as a volunteer.

In any case - this is an issue with mariadb that affects multiple queries in various produces (including Moodle) - you can also disable the async reporting for the plugin temporarily which may also help.

ndunand commented 1 year ago

Thanks @danmarsden your response finally pointed me in the right direction, MDEV-31436 on MariaDB's jira tracker, which indeed shows that even though not clearly stated the bug probably also affects other MariaDB versions up to last week's release.