Closed kaplun closed 9 years ago
At the same time we use LOW_PRIORITY
in several places which looks even worse than DELAYED:
(from http://dev.mysql.com/doc/refman/5.6/en/insert.html):
If you use the LOW_PRIORITY keyword, execution of the INSERT is delayed until no other clients are reading from the table. This includes other clients that began reading while existing clients are reading, and while the INSERT LOW_PRIORITY statement is waiting. It is possible, therefore, for a client that issues an INSERT LOW_PRIORITY statement to wait for a very long time (or even forever) in a read-heavy environment
$ git grep -i "\blow_priority\b"
modules/bibdocfile/lib/bibdocfile_managedocfiles.py: run_sql("DELETE LOW_PRIORITY from bibfmt WHERE format='HB' AND id_bibrec=%s", (recid,))
modules/bibformat/lib/bibformat_dblayer.py: low_priority=False, compress=zlib.compress):
modules/bibformat/lib/bibformat_dblayer.py: if low_priority:
modules/bibformat/lib/bibformat_dblayer.py: sql_str = " LOW_PRIORITY"
modules/bibformat/lib/bibreformat.py: low_priority=True)
modules/bibupload/lib/bibupload.py: query = """INSERT LOW_PRIORITY INTO bibfmt (id_bibrec, format, last_updated, value)
modules/bibupload/lib/bibupload.py: query = """UPDATE LOW_PRIORITY bibfmt SET last_updated=%s, value=%s WHERE id_bibrec=%s AND format=%s"""
modules/bibupload/lib/bibupload.py: run_sql("DELETE LOW_PRIORITY FROM bibfmt WHERE id_bibrec=%s and format=%s", (id_bibrec, format_name))
modules/websession/lib/inveniogc.py: query = "DELETE LOW_PRIORITY FROM session WHERE session_expiry < %s"
modules/websession/lib/session.py: run_sql("""DELETE LOW_PRIORITY FROM session
modules/websession/lib/session.py: return run_sql("""DELETE LOW_PRIORITY FROM session
modules/websubmit/lib/functions/Move_Files_to_Storage.py: run_sql("DELETE LOW_PRIORITY from bibfmt WHERE format='HB' AND id_bibrec=%s", (sysno,))
modules/websubmit/lib/functions/Move_Photos_to_Storage.py: run_sql("DELETE LOW_PRIORITY from bibfmt WHERE format='HB' AND id_bibrec=%s", (sysno,))
modules/websubmit/lib/functions/Move_Revised_Files_to_Storage.py: run_sql("DELETE LOW_PRIORITY from bibfmt WHERE format='HB' AND id_bibrec=%s", (sysno,))
Since all this points could lead to undefinite delays in closing SQL session/connection (and therefore WSGI requests) is it ok to drop them as well?
I agree the description makes it look quite bad. I remember there was some motivation for adding both LOW_PRIORITY
and DELAYED
in the context of INSPIRE performance. Some deadlocks/delays were avoided I believe. Specifically related to bibfmt
and session
tables. Finally in the INSPIRE use case, we use redis for sessions so no problem. Still unsure about bibfmt, perhaps @Osso can provide more useful background for this?
Yes I see the a-priori benefit of using INSERT delayed in the context of storing these on-the-fly generated formats, so that the user who triggered the generation is not also impacted by having to wait for its storage. But writing hopefully writing to bibfmt
is not too slow if the benefit is avoiding WSGI going crazy? @osso?
If I recall writing to bibfmt
is slow because it is (or was) so huge on INSPIRE (all recstruct, all pre-cached formats etc.). Back then we discussed splitting the table with recstruct only and other generated formats separated. I dunno if we still need to do this, but considering we've been running (more or less) smoothly with these enabled we should investigate and do some profiling before enabling/disabling this.
Grr... GitHub!! I merged in INSPIRE, but I don't see why it has to be closed on Invenio... we have issues on inspirehep github as well...
As @tsgit reports:
So it looks like in the common case
INSERT DELAYED
is producing more harm than benefits. And this is more critical the bigger the site. GivenINSERT DELAYED
is not even SQL standard. I propose here to no longer use it.