inveniosoftware / invenio

Invenio digital library framework
https://invenio.readthedocs.io
MIT License
614 stars 290 forks source link

general: no INSERT delayed usage #2268

Closed kaplun closed 9 years ago

kaplun commented 9 years ago

As @tsgit reports:

Inspire was completely unreachable for ~17minutes this afternoon US
time, shortly after midnight CERN time
[...]
there are no errors in the logs on wns, but mysql processlist showed a
large number of "INSERT DELAYED" connections, e.g.
[...]
so it looks like we create a bottleneck here with spikes in search activity?

This appears quite similar to another recent outage I observed in real
time, when I didn't get a chance to look at mysql processlist in time.
[...]
the "DELAYED" is not doing what you think it is, because:

http://dev.mysql.com/doc/refman/5.5/en/insert-delayed.html
[...]
Beginning with MySQL 5.5.7, INSERT DELAYED is always handled as a
simple INSERT (that is, without the DELAYED option) whenever the value
of binlog_format is STATEMENT or MIXED. (In the latter case, the
statement no longer triggers a switch to row-based logging, and so is
logged using the statement-based format.)

So it looks like in the common case INSERT DELAYED is producing more harm than benefits. And this is more critical the bigger the site. Given INSERT DELAYED is not even SQL standard. I propose here to no longer use it.

kaplun commented 9 years ago

At the same time we use LOW_PRIORITY in several places which looks even worse than DELAYED: (from http://dev.mysql.com/doc/refman/5.6/en/insert.html):

If you use the LOW_PRIORITY keyword, execution of the INSERT is delayed until no other clients are reading from the table. This includes other clients that began reading while existing clients are reading, and while the INSERT LOW_PRIORITY statement is waiting. It is possible, therefore, for a client that issues an INSERT LOW_PRIORITY statement to wait for a very long time (or even forever) in a read-heavy environment

$ git grep -i "\blow_priority\b"
modules/bibdocfile/lib/bibdocfile_managedocfiles.py:    run_sql("DELETE LOW_PRIORITY from bibfmt WHERE format='HB' AND id_bibrec=%s", (recid,))
modules/bibformat/lib/bibformat_dblayer.py:                             low_priority=False, compress=zlib.compress):
modules/bibformat/lib/bibformat_dblayer.py:    if low_priority:
modules/bibformat/lib/bibformat_dblayer.py:        sql_str = " LOW_PRIORITY"
modules/bibformat/lib/bibreformat.py:                                 low_priority=True)
modules/bibupload/lib/bibupload.py:    query = """INSERT LOW_PRIORITY INTO bibfmt (id_bibrec, format, last_updated, value)
modules/bibupload/lib/bibupload.py:        query = """UPDATE LOW_PRIORITY bibfmt SET last_updated=%s, value=%s WHERE id_bibrec=%s AND format=%s"""
modules/bibupload/lib/bibupload.py:        run_sql("DELETE LOW_PRIORITY FROM bibfmt WHERE id_bibrec=%s and format=%s", (id_bibrec, format_name))
modules/websession/lib/inveniogc.py:        query = "DELETE LOW_PRIORITY FROM session WHERE session_expiry < %s"
modules/websession/lib/session.py:            run_sql("""DELETE LOW_PRIORITY FROM session
modules/websession/lib/session.py:        return run_sql("""DELETE LOW_PRIORITY FROM session
modules/websubmit/lib/functions/Move_Files_to_Storage.py:    run_sql("DELETE LOW_PRIORITY from bibfmt WHERE format='HB' AND id_bibrec=%s", (sysno,))
modules/websubmit/lib/functions/Move_Photos_to_Storage.py:    run_sql("DELETE LOW_PRIORITY from bibfmt WHERE format='HB' AND id_bibrec=%s", (sysno,))
modules/websubmit/lib/functions/Move_Revised_Files_to_Storage.py:    run_sql("DELETE LOW_PRIORITY from bibfmt WHERE format='HB' AND id_bibrec=%s", (sysno,))

Since all this points could lead to undefinite delays in closing SQL session/connection (and therefore WSGI requests) is it ok to drop them as well?

jalavik commented 9 years ago

I agree the description makes it look quite bad. I remember there was some motivation for adding both LOW_PRIORITY and DELAYED in the context of INSPIRE performance. Some deadlocks/delays were avoided I believe. Specifically related to bibfmt and session tables. Finally in the INSPIRE use case, we use redis for sessions so no problem. Still unsure about bibfmt, perhaps @Osso can provide more useful background for this?

kaplun commented 9 years ago

Yes I see the a-priori benefit of using INSERT delayed in the context of storing these on-the-fly generated formats, so that the user who triggered the generation is not also impacted by having to wait for its storage. But writing hopefully writing to bibfmt is not too slow if the benefit is avoiding WSGI going crazy? @osso?

jalavik commented 9 years ago

If I recall writing to bibfmt is slow because it is (or was) so huge on INSPIRE (all recstruct, all pre-cached formats etc.). Back then we discussed splitting the table with recstruct only and other generated formats separated. I dunno if we still need to do this, but considering we've been running (more or less) smoothly with these enabled we should investigate and do some profiling before enabling/disabling this.

jirikuncar commented 9 years ago

2269 is still open

kaplun commented 9 years ago

Grr... GitHub!! I merged in INSPIRE, but I don't see why it has to be closed on Invenio... we have issues on inspirehep github as well...