elixir-ecto / myxql

MySQL 5.5+ driver for Elixir
Apache License 2.0
273 stars 67 forks source link

Tests run slower after switching to myxql #60

Closed michaelst closed 4 years ago

michaelst commented 5 years ago

We are noticing a significant (8x) slow down on our tests with the myxql driver when compared to mariaex. Our average test time is 200s with mariaex but 1600s with myxql. My coworker also noticed a similar performance impact when trying out the postgrex driver.

Is this a problem with our configuration or is this a known problem?

josevalim commented 5 years ago

Do we have a consistent way to reproduce it? I also wonder if maybe there are some flags being set in the connection that enforces some guarantees which slow things down.

1player commented 5 years ago

Unfortunately I don't, not at this time. Might be a few days before I can try and write a minimal test case.

No special flags are used at all, MySQL 5.6.30 on TCP connection, ported from mariaex to myxql just by changing the adapter module.

Another thing that's also interesting is that mariaex executed the whole 3 statements in about 1ms, while it's taken 90ms for myxql to do the same.

From a quick look at the log, all queries using the IN expression take a huge amount of time.

This wasn't running in a transaction:

14:25:20.347 [debug] QUERY OK source="football_events" db=0.4ms queue=0.3ms
SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (?,?)) AND (NOT (f0.`cluster` IS NULL)) [839, 838]
14:25:20.389 [debug] QUERY OK source="football_events" db=41.8ms
SELECT f0.`id`, f0.`event_group_id`, f0.`source_id`, f0.`topic_id`, f0.`cluster`, f0.`source_ref`, f0.`inserted_at`, f0.`updated_at`, f0.`time`, f0.`home`, f0.`away`, f0.`slug` FR
OM `football_events` AS f0 WHERE (f0.`cluster` = ?) ["4d6d98fd-ccbc-4c75-b2f0-d29762f73bd7"]

So at commit time (or the next statement if outside a transaction like the example above), an IN expression is about 400x slower than mariaex.

This is the optimization plan for that query:

mysql> explain SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (839, 838)) AND (NOT (f0.`cluster` IS NULL));
+----+-------------+-------+-------+---------------------------------------+---------+---------+------+------+------------------------------+
| id | select_type | table | type  | possible_keys                         | key     | key_len | ref  | rows | Extra                        |
+----+-------------+-------+-------+---------------------------------------+---------+---------+------+------+------------------------------+
|  1 | SIMPLE      | f0    | range | PRIMARY,football_events_cluster_index | PRIMARY | 8       | NULL |    2 | Using where; Using temporary |
+----+-------------+-------+-------+---------------------------------------+---------+---------+------+------+------------------------------+

So should be pretty fast, on a 30K rows table (though the logs are from an empty test database sandbox)

josevalim commented 5 years ago

@1player and even executing this IN query by hand (instead of Ecto) reproduces said slow downs? (I want to make sure this is no an Ecto issue but I am starting to have some suspicions). If they do, can you try doing something like this:

Repo.query!("... QUERY WITH IN...")
Repo.query!("... QUERY WITH IN...", cache_statement: :omg)
Repo.query!("... QUERY WITH IN...", cache_statement: :omg)

If the first one is slow, is the third one fast enough?

1player commented 5 years ago

I could only reproduce the issue when the query was in a transaction:

Outside a transaction:

iex(2)> Repo.query!("SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL))")
[debug] QUERY OK db=0.8ms decode=0.8ms queue=2.2ms
SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL)) []
%MyXQL.Result{
  columns: ["cluster"],
  connection_id: 362,
  last_insert_id: nil,
  num_rows: 1,
  num_warnings: 0,
  rows: [["a09fa14f-e68a-471f-9c0b-f38ffa01ba3c"]]
}
iex(3)> Repo.query!("SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL))")
[debug] QUERY OK db=0.6ms queue=0.7ms
SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL)) []
%MyXQL.Result{
  columns: ["cluster"],
  connection_id: 356,
  last_insert_id: nil,
  num_rows: 1,
  num_warnings: 0,
  rows: [["a09fa14f-e68a-471f-9c0b-f38ffa01ba3c"]]
}

In a transaction:

iex(5)> Repo.transaction(fn ->
...(5)>   Repo.query!("SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL))")
...(5)>   Repo.query!("SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL))")
...(5)> end)
[debug] QUERY OK db=0.5ms
begin []
[debug] QUERY OK db=0.5ms
SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL)) []
[debug] QUERY OK db=0.5ms
SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL)) []
[debug] QUERY OK db=46.0ms
commit []
{:ok,
 %MyXQL.Result{
   columns: ["cluster"],
   connection_id: 356,
   last_insert_id: nil,
   num_rows: 1,
   num_warnings: 0,
   rows: [["a09fa14f-e68a-471f-9c0b-f38ffa01ba3c"]]
 }}

I couldn't get cache_statement to work, I haven't dived in the code to understand what's the issue here.

iex(6)> Repo.query!("SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL))", cache_statement: :omg)
[debug] QUERY ERROR db=8.4ms queue=0.8ms
SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL)) [cache_statement: :omg]
** (ArgumentError) parameters must be of length 0 for query %MyXQL.Query{cache: :reference, name: "", num_params: 0, ref: #Reference<0.2402499918.4217372677.208699>, statement: "SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL))", statement_id: 1}
    (myxql) lib/myxql/query.ex:53: DBConnection.Query.MyXQL.Query.encode/3
    (db_connection) lib/db_connection.ex:1148: DBConnection.encode/5
    (db_connection) lib/db_connection.ex:1246: DBConnection.run_prepare_execute/5
    (db_connection) lib/db_connection.ex:1342: DBConnection.run/6
    (db_connection) lib/db_connection.ex:540: DBConnection.parsed_prepare_execute/5
    (db_connection) lib/db_connection.ex:533: DBConnection.prepare_execute/4
    (myxql) lib/myxql.ex:213: MyXQL.do_query/4
    (ecto_sql) lib/ecto/adapters/sql.ex:292: Ecto.Adapters.SQL.query!/4
1player commented 5 years ago

Same transaction running on the mysql client:

mysql> set profiling=1;
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL));
+--------------------------------------+
| cluster                              |
+--------------------------------------+
| a09fa14f-e68a-471f-9c0b-f38ffa01ba3c |
+--------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL));
+--------------------------------------+
| cluster                              |
+--------------------------------------+
| a09fa14f-e68a-471f-9c0b-f38ffa01ba3c |
+--------------------------------------+
1 row in set (0.00 sec)

mysql> commit;
Query OK, 0 rows affected (0.00 sec)

mysql> show profiles;
+----------+------------+----------------------------------------------------------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                                                                                |
+----------+------------+----------------------------------------------------------------------------------------------------------------------+
|        1 | 0.00012475 | begin                                                                                                                |
|        2 | 0.00047775 | SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL)) |
|        3 | 0.00041850 | SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL)) |
|        4 | 0.00014750 | commit                                                                                                               |
+----------+------------+----------------------------------------------------------------------------------------------------------------------+
4 rows in set, 1 warning (0.00 sec)
wojtekmach commented 5 years ago

Repo.query!("SELECT DISTINCT f0.clusterFROMfootball_eventsAS f0 WHERE (f0.idIN (1, 2)) AND (NOT (f0.clusterIS NULL))", cache_statement: :omg) ** (ArgumentError) parameters must be of length 0 for query %MyXQL.Query{cache: :reference, name: "", num_params: 0, ref: #Reference<0.2402499918.4217372677.208699>, statement: "SELECT DISTINCT f0.cluster FROM football_events AS f0 WHERE (f0.id IN (1, 2)) AND (NOT (f0.cluster IS NULL))", statement_id: 1}

the problem is you're doing:

Repo.query!(statement, cache_statement: :omg)

and options are treated as params, it should be:

Repo.query!(statement, [], cache_statement: :omg)
josevalim commented 5 years ago

Thanks @wojtekmach, it was my bad.

1player commented 5 years ago

Good catch, thanks. Yes, the cache_statement makes it tons better:

same cache key:

iex(9)> Repo.transaction(fn ->
...(9)>   Repo.query!("SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL))", [], cache_statement: :foo)
...(9)>   Repo.query!("SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL))", [], cache_statement: :foo)
...(9)> end)
[debug] QUERY OK db=0.5ms
begin []
[debug] QUERY OK db=0.6ms
SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL)) []
[debug] QUERY OK db=0.5ms
SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL)) []
[debug] QUERY OK db=0.3ms
commit []

different cache key:

iex(12)> Repo.transaction(fn ->
...(12)>   Repo.query!("SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL))", [], cache_statement: :one)
...(12)>   Repo.query!("SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL))", [], cache_statement: :two)
...(12)>   Repo.query!("SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL))", [], cache_statement: :three)
...(12)> end)
[debug] QUERY OK db=0.6ms
begin []
[debug] QUERY OK db=0.6ms
SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL)) []
[debug] QUERY OK db=0.3ms
SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL)) []
[debug] QUERY OK db=0.5ms
SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` IN (1, 2)) AND (NOT (f0.`cluster` IS NULL)) []
[debug] QUERY OK db=0.2ms
commit []

I would expect the last example to take as long as the uncached version, as they're all using a different caching key. And why does it happen only when inside a transaction?

josevalim commented 5 years ago

I am honestly as surprised as you are. It is also the second query that is slow but not the first one? Also, why simply passing a cache statement but not even using it makes things fast?

1player commented 5 years ago

Not sure if the MySQL query cache plays any role in this, but no, as long as cache_statement is present, it's always as fast as I'd expect, even the first query. Again, it's hard to tell because I can reproduce the issue only when inside a transaction/at commit time. I've restarted the iex session a few times, and it was always < 1ms if cache_statement was present.

I've just tried replacing the IN with an OR statement, and it's exhibiting the same problems. It must be something due to the transaction.

OR version, no cache statement:

iex(14)> Repo.transaction(fn ->
...(14)>   Repo.query!("SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` = 1 OR f0.`id` = 2) AND (NOT (f0.`cluster` IS NULL))")
...(14)>   Repo.query!("SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` = 1 OR f0.`id` = 2) AND (NOT (f0.`cluster` IS NULL))")
...(14)> end)
[debug] QUERY OK db=0.5ms
begin []
[debug] QUERY OK db=0.6ms
SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` = 1 OR f0.`id` = 2) AND (NOT (f0.`cluster` IS NULL)) []
[debug] QUERY OK db=0.6ms
SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` = 1 OR f0.`id` = 2) AND (NOT (f0.`cluster` IS NULL)) []
[debug] QUERY OK db=40.9ms
commit []

OR version, cache_statement w/ different keys:

iex(15)> Repo.transaction(fn ->
...(15)>   Repo.query!("SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` = 1 OR f0.`id` = 2) AND (NOT (f0.`cluster` IS NULL))", [], cache_statement: :or_one)
...(15)>   Repo.query!("SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` = 1 OR f0.`id` = 2) AND (NOT (f0.`cluster` IS NULL))", [], cache_statement: :or_two)
...(15)> end)
[debug] QUERY OK db=0.4ms
begin []
[debug] QUERY OK db=0.5ms
SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` = 1 OR f0.`id` = 2) AND (NOT (f0.`cluster` IS NULL)) []
[debug] QUERY OK db=0.5ms
SELECT DISTINCT f0.`cluster` FROM `football_events` AS f0 WHERE (f0.`id` = 1 OR f0.`id` = 2) AND (NOT (f0.`cluster` IS NULL)) []
[debug] QUERY OK db=0.3ms
commit []
wojtekmach commented 5 years ago

Thanks @1player. Could you check this branch and see if you'd have similar test times to mariaex?

{:myxql, github: "elixir-ecto/myxql", branch: "wm-cache", override: true}
1player commented 5 years ago

Tried the new branch, cleared build and _deps altogether.

Unfortunately, no difference at all. no cache_statement = slow transaction.

josevalim commented 5 years ago

@1player can you please call mix deps.update myxql on that same branch and try again? Make sure you are on commit: e72c13f9a1b787eb31d56df8f02d0a91a300d082

1player commented 5 years ago

Great stuff! 👍

Everything seems as fast as mariaex now, can't reproduce the transaction issue anymore, and all tests take as short as before.

michaelst commented 5 years ago

@josevalim that commit improved our test times down to 330s so now just 80s slower than mariaex (was 250s with mariaex)

k-cross commented 5 years ago

adding the config option ownership_timeout: :infinity has removed the crash. I thought that because I was using stream data instead of propcheck that the generators might have been slower. I'm pushing an update to the repo that I've been using to test so that you might be able to get some more insight out of it. thanks for referencing the issue.

wojtekmach commented 5 years ago

@k-cross in case you haven't tried it yet, I think what @josevalim alluded to was trying out your project on:

{:myxql, github: "elixir-ecto/myxql", branch: "wm-cache", override: true}

(make sure you're on commit e72c13f)

which should speed things up and so perhaps you wouldn't run into ownership error.

k-cross commented 5 years ago

Thanks, I will give that a shot!

k-cross commented 5 years ago

put up a new commit and it still bombed after 143 runs.

wojtekmach commented 5 years ago

Gotcha. Thanks for checking.

dbhobbs commented 5 years ago

The wm-cache branch shows a dramatic improvement in our test suite performance compared to release 0.2.9.

k-cross commented 5 years ago

I added a mariaex version to that toy project in the branch mariaex in order to compare, seems like I get the same issue with both branches in this case so that's interesting. I'll try to isolate differences that I see in the other project and open a separate issue. Thanks for all your work!

wojtekmach commented 5 years ago

@k-cross thanks for looking into it!

michaelst commented 5 years ago

When can we expect those changes (wm-cache branch) to be released?

wojtekmach commented 5 years ago

The code on wm-cache makes it so that basically every prepared statement is cached (and we don't clean up the cache) so that's not suitable for production. But we have ideas to improve caching in that direction, I'm hoping to get something by the end of the week.

k-cross commented 4 years ago

Oh cool, will we see a release soon?

josevalim commented 4 years ago

Relatively soon but please run against master and let us know how it goes.

michaelst commented 4 years ago

I ran our tests against that commit and everything is looking good! Thanks @josevalim!