vitessio / vitess

Vitess is a database clustering system for horizontal scaling of MySQL.
http://vitess.io
Apache License 2.0

Inserting to a reference table causes (non-fatal) vtgate panic #5729

Open aquarapid opened 4 years ago

aquarapid commented 4 years ago

Scenario:

The panic from the vtgate log file:

E0116 15:59:08.786825 1163644 server.go:278] mysql_server caught panic:
runtime error: index out of range [0] with length 0
/usr/lib/golang/src/runtime/panic.go:75 (0x42f482)
        goPanicIndex: panic(boundsError{x: int64(x), signed: true, y: y, code: boundsIndex})
/home/jacques/go/src/vitess.io/vitess/go/vt/vtgate/engine/insert.go:397 (0xaca2df)
        io/vitess/go/vt/vtgate/engine.(*Insert).getInsertShardedRoute: keyspaceIDs, err := ins.processPrimary(vcursor, vindexRowsValues[0], ins.Table.ColumnVindexes[0])
/home/jacques/go/src/vitess.io/vitess/go/vt/vtgate/engine/insert.go:275 (0xac7878)
        io/vitess/go/vt/vtgate/engine.(*Insert).execInsertSharded: rss, queries, err := ins.getInsertShardedRoute(vcursor, bindVars)
/home/jacques/go/src/vitess.io/vitess/go/vt/vtgate/engine/insert.go:225 (0xac6de2)
        io/vitess/go/vt/vtgate/engine.(*Insert).Execute: return ins.execInsertSharded(vcursor, bindVars)
/home/jacques/go/src/vitess.io/vitess/go/vt/vtgate/executor.go:321 (0xb39257)
        io/vitess/go/vt/vtgate.(*Executor).handleExec: qr, err := plan.Instructions.Execute(vcursor, bindVars, true)
/home/jacques/go/src/vitess.io/vitess/go/vt/vtgate/executor.go:215 (0xb37adc)
        io/vitess/go/vt/vtgate.(*Executor).execute: qr, err := e.handleExec(ctx, safeSession, sql, bindVars, destKeyspace, destTabletType, dest, logStats, stmtType)
/home/jacques/go/src/vitess.io/vitess/go/vt/vtgate/executor.go:139 (0xb3766b)
        io/vitess/go/vt/vtgate.(*Executor).Execute: result, err = e.execute(ctx, safeSession, sql, bindVars, logStats)
/home/jacques/go/src/vitess.io/vitess/go/vt/vtgate/vtgate.go:280 (0xb5d0c1)
        io/vitess/go/vt/vtgate.(*VTGate).Execute: qr, err = vtg.executor.Execute(ctx, "Execute", NewSafeSession(session), sql, bindVariables)
/home/jacques/go/src/vitess.io/vitess/go/vt/vtgate/plugin_mysql_server.go:211 (0xb4ac41)
        io/vitess/go/vt/vtgate.(*vtgateHandler).ComQuery: session, result, err := vh.vtg.Execute(ctx, session, query, make(map[string]*querypb.BindVariable))
/home/jacques/go/src/vitess.io/vitess/go/mysql/conn.go:1070 (0x86c0b7)
        io/vitess/go/mysql.(*Conn).execQuery: err := handler.ComQuery(c, query, func(qr *sqltypes.Result) error {
/home/jacques/go/src/vitess.io/vitess/go/mysql/conn.go:771 (0x86b92d)
        io/vitess/go/mysql.(*Conn).handleNextCommand: if err := c.execQuery(sql, handler, more); err != nil {
/home/jacques/go/src/vitess.io/vitess/go/mysql/server.go:463 (0x8876ae)
        io/vitess/go/mysql.(*Listener).handle: err := c.handleNextCommand(l.handler)
/usr/lib/golang/src/runtime/asm_amd64.s:1357 (0x460930)
        goexit: BYTE    $0x90   // NOP

vtgate -version:

$ vtgate -version
Version: 2152b55aa (Git branch 'master') built on Wed Jan 15 18:09:04 PST 2020 .........

Obviously, reference tables are not meant to be populated this way: you should insert at the per-shard level, or use VReplication to populate the tables. However, inserting into one should not cause a vtgate panic and kill the connection.

usmanm commented 4 years ago

Hi @aquarapid! Stumbled across this issue while trying to understand reference tables a bit better. I have a question around:

you should insert on a per-shard level, or use vreplication to populate the tables.

How would we insert on a per-shard level to a reference table? My use-case is such that I want "sparsely" populated reference tables, so would ideally like to write on-demand to a shard if the reference data is missing there. But from the docs, I can't really figure out if there is a way to tell Vitess to write to a particular shard?

aquarapid commented 4 years ago

@usmanm in the above example, inserting into the specific shard would mean USE-ing the specific shard, and then inserting. Assuming the keyspace name in the example above is keyspace1, it would look like this:

mysql> use keyspace1/0;
Database changed
mysql> insert into testing (`version`,`inserted_at`) VALUES (20200108224539, "2020-01-16 23:31:51");
Query OK, 1 row affected (0.01 sec)

Note that populating reference tables manually like this isn't recommended. The original idea is to use a separate keyspace where the "source" of your reference tables lives. This keyspace would probably be unsharded, since reference tables are assumed to be small. You set up VReplication to materialize them continuously from that keyspace to the shards of your sharded "target" keyspace. That way, consistency would not be a concern.

sougou commented 4 years ago

Generally, reference tables shouldn't be directly updated. What you should instead do is create the source in an unsharded keyspace and vreplicate from that into the sharded one. Then you just update the source table.

sougou commented 4 years ago

I just remembered this demo shows how reference tables should be configured. It's a little outdated, but the general idea is still the same. https://youtu.be/E6H4bgJ3Z6c?t=1451

usmanm commented 4 years ago

What I'm not sure about with VReplication is whether it requires us to give up referential integrity. Suppose I want something like:

# This is a reference table (so I want it to be replicated to all shards)
CREATE TABLE users (
  id   INT NOT NULL PRIMARY KEY,
  name VARCHAR(256) NOT NULL
);

# This is in a sharded keyspace
CREATE TABLE events (
  id   INT NOT NULL PRIMARY KEY, # Primary Vindex on this column
  data VARCHAR(256),
  user INT REFERENCES users(id)
);

Isn't there some propagation delay between writing to the unsharded table and replicating it to all shards of a keyspace? In that case, couldn't a write to events fail because the referenced user data has not yet been propagated to the shard?

sougou commented 4 years ago

VReplication typically applies changes to the target within milliseconds. Essentially, it should be no worse than the application trying to write the change to all shards. But it guarantees eventual consistency even if there are failures, which is a complex problem to solve at the app level.

The way I would recommend implementing the feature is to have a function write to the source, and wait for the row to appear in the target before returning success.

Some users have requested that vitess itself do this part. It's something we've been considering.

usmanm commented 4 years ago

That makes sense! And would "waiting" for the row to appear just mean running a SELECT query, or does Vitess provide something out of the box here?

Another (tangentially related) question is about availability. When using reference tables, does availability depend on the unsharded keyspace's MySQL instance being up? I ask because for the events table, I can achieve better availability by just retrying and writing to a different shard (id is generated application side, so we have a bit of control here).

sougou commented 4 years ago

There are a couple of convoluted ways:

We can also build a custom construct to avoid the above convolution. Or, we can prioritize the wait functionality in vitess itself.

About availability: The master-replica setup with an automated failover mechanism has been serving multiple organizations with five nines of availability. So, it's fine to rely on its uptime.

usmanm commented 4 years ago

Find out the target shard and send a shard-targeted query.

How would one go about doing this? I tried looking at the docs, but couldn't find a way to find the target shard given a keyspace ID or a value for a column which the Primary Vindex is on.

sougou commented 4 years ago

You can issue this query, which will return the list of shards:

mysql> show vitess_shards like 'customer/%';
+--------------+
| Shards       |
+--------------+
| customer/-80 |
| customer/80- |
+--------------+
2 rows in set (0.00 sec)

Cache this info in the app. And then you can target a shard with:

mysql> use customer/-80;
mysql> select * from customer;

To find a shard from a keyspace id, you can mimic this function: https://github.com/vitessio/vitess/blob/1310711d44c1c48b1cc17c088659ad9eaf872aa9/go/vt/key/destination.go#L327-L338

This is obviously clumsy. For now, I hope you can hack this to make sure things work. We can find a more elegant way out of this later.
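As a rough sketch of the shard lookup, the Go below matches a keyspace ID against shard names of the form returned by `show vitess_shards`. It is not a copy of the linked function: `shardForKeyspaceID` and `rangeContains` are hypothetical names. It assumes shard names encode a hex key range as `start-end`, with an inclusive start, an exclusive end, and an empty bound meaning unbounded. Deriving the keyspace ID from a column value depends on the vindex in use and is not covered here.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
	"strings"
)

// rangeContains reports whether keyspaceID falls in the range encoded by a
// shard name such as "-80", "80-" or "40-80". The start is inclusive, the end
// exclusive, and an empty bound means unbounded. A name without a dash (for
// example "0") is treated as covering the whole keyspace.
func rangeContains(shard string, keyspaceID []byte) (bool, error) {
	parts := strings.SplitN(shard, "-", 2)
	if len(parts) != 2 {
		return true, nil
	}
	start, err := hex.DecodeString(parts[0])
	if err != nil {
		return false, fmt.Errorf("bad shard name %q: %w", shard, err)
	}
	end, err := hex.DecodeString(parts[1])
	if err != nil {
		return false, fmt.Errorf("bad shard name %q: %w", shard, err)
	}
	if bytes.Compare(keyspaceID, start) < 0 {
		return false, nil
	}
	if len(end) > 0 && bytes.Compare(keyspaceID, end) >= 0 {
		return false, nil
	}
	return true, nil
}

// shardForKeyspaceID returns the first "keyspace/shard" name whose range
// contains keyspaceID. shards is the cached output of show vitess_shards.
func shardForKeyspaceID(shards []string, keyspaceID []byte) (string, error) {
	for _, name := range shards {
		// Strip the "keyspace/" prefix, if any, to get the bare shard name.
		shard := name[strings.LastIndex(name, "/")+1:]
		ok, err := rangeContains(shard, keyspaceID)
		if err != nil {
			return "", err
		}
		if ok {
			return name, nil
		}
	}
	return "", fmt.Errorf("no shard contains keyspace id %x", keyspaceID)
}

func main() {
	shards := []string{"customer/-80", "customer/80-"}
	target, err := shardForKeyspaceID(shards, []byte{0x16, 0x6b})
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	// The result can be used directly as the argument of a USE statement.
	fmt.Printf("use %s;\n", target)
}
```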

usmanm commented 4 years ago

Thanks so much @sougou, this has been extremely helpful 🙏

aquarapid commented 3 years ago

Just to confirm, this is still an issue in 9.0 (daa6085). Updated stack trace:

runtime error: index out of range [0] with length 0
/opt/hostedtoolcache/go/1.15.7/x64/src/runtime/panic.go:88 (0x436d64)
/home/runner/work/vitess/vitess/go/vt/vtgate/engine/insert.go:397 (0xbface4)
/home/runner/work/vitess/vitess/go/vt/vtgate/engine/insert.go:255 (0xbf829c)
/home/runner/work/vitess/vitess/go/vt/vtgate/engine/insert.go:201 (0xbf77e6)
/home/runner/work/vitess/vitess/go/vt/vtgate/plan_execute.go:175 (0xcd88ba)
/home/runner/work/vitess/vitess/go/vt/vtgate/plan_execute.go:155 (0xcb3d0e)
/home/runner/work/vitess/vitess/go/vt/vtgate/plan_execute.go:116 (0xcb349e)
/home/runner/work/vitess/vitess/go/vt/vtgate/executor.go:187 (0xca129a)
/home/runner/work/vitess/vitess/go/vt/vtgate/executor.go:155 (0xca0ec4)
/home/runner/work/vitess/vitess/go/vt/vtgate/vtgate.go:266 (0xcd14d7)
/home/runner/work/vitess/vitess/go/vt/vtgate/plugin_mysql_server.go:219 (0xcb53db)
/home/runner/work/vitess/vitess/go/mysql/conn.go:1234 (0x902b8b)
/home/runner/work/vitess/vitess/go/mysql/conn.go:1219 (0x9026b5)
/home/runner/work/vitess/vitess/go/mysql/conn.go:867 (0x8ff747)
/home/runner/work/vitess/vitess/go/mysql/server.go:474 (0x921d77)
/opt/hostedtoolcache/go/1.15.7/x64/src/runtime/asm_amd64.s:1374 (0x472180)

mattlord commented 2 years ago

@aquarapid should we close this? It seems to have gone stale and I'm unsure if it's still relevant or not. Thanks!