mit-pdos / noria

Fast web applications through dynamic, partially-stateful dataflow
Apache License 2.0
4.99k stars 244 forks source link

AUTO_INCREMENT support #66

Closed ekmartin closed 1 year ago

ekmartin commented 6 years ago

This implements support for auto incrementing IDs in base nodes.

To make this work with shards, all auto increment columns get their values held in a DataType::ID(shard, value) type. value here is always an i64. An alternative would be to encode the shard info into the numeric value itself, i.e. through use of the first bits.

The shard index is passed through the base with on_input. Some things to consider:

jonhoo commented 6 years ago

I think DataType::None is the value you'd want here. And then if DataType::None is given to the sharding function, it would choose a random shard? Though we'd probably want to ensure that that only happens for AUTO_INCREMENT columns, not everywhere in the graph.

Base nodes shouldn't be able to move at runtime (at least not at this point in time), so I wouldn't worry too much about that. The bigger concern (I think) is how we generate unique IDs in MySQL form (that is, a single number) from the (shard, value) tuple?

ekmartin commented 6 years ago

DataType::None sounds reasonable too - changed to that. MySQL allows both, but the 0 behavior can be disabled by enabling NO_AUTO_VALUE_ON_ZERO. I also pushed a commit to shard DataType::None keys randomly. Not sure if this is really bad default behavior outside of auto incrementing columns though - right now it would crash instead, is that better?

Also, not completely sure how the Sharding enum's Random option work, but it only applies to sharding further down the graph right (with base level sharding being handled in BatchSendHandle.enqueue)?

Regarding generating MySQL friendly IDs from the tuple: I guess there's a few options here if you want to keep incrementing IDs per shard. When I mentioned prefixing the ID with the shard index I was thinking about something like:

match id {
    DataType::ID(shard, value) => {
        let shard_size = (shard_count as f64).log2().ceil();
        // Create a 64-bit value where the shard takes up the `shard_size` first bits:
        let shard_prefixed = shard << (64 - shard_size);
        // Then OR that with the actual ID (should panic if `value` 
        // takes up more than 64 - `shard_size` bits):
        DataType::BigInt(shard_prefixed | value)
    }
}

I'm not sure where this should happen though. One option would be to implement it into the shim, but it could also replace DataType::ID completely.

ms705 commented 6 years ago

Let's revive this! The above plan sounds good to me. Auto-increment IDs that include a shard ID in the high-order bits are very reasonable. This guarantees uniqueness, but not sequentiality -- which is fine for most applications, I imagine.

My understanding is that MySQL actually doesn't make any guarantees with AUTO_INCREMENT columns when operating sharded across a cluster (not even guaranteeing uniqueness).

ekmartin commented 6 years ago

Got a fair share of thesis writing to do, but I'll take a look at in a few days! Still need to implement support for sending back the newly assigned ID through write-ACKs as well, so that it'll work with the shim.

@ms705: Do you think the higher-order bits thing should happen here or in the shim? The former would probably return a BigInt from here, whereas the latter would return a DataType::ID(shard, seq) which would then get translated in the shim. The ID approach might be problematic for overflow though, if it would need to potentially turn (u32, u64) into u64?

jonhoo commented 6 years ago

@ekmartin take your time! the deadline has now passed, so we are no longer in a rush. anything we can do to help?

ekmartin commented 6 years ago

Thanks @jonhoo! I'll let you know if there's anything in particular 👍

ms705 commented 6 years ago

@ekmartin Apologies, I missed your question! Having thought about it, this seems like something the shim should do -- you could imagine interfaces other than MySQL on top of Soup that implement different semantics (e.g., expose shard IDs to clients).

I suspect this PR has broken again due to many changes on master (for sure e.g,, sharding for None values is implemented there now). Would you have a chance to rebase it so that we can get it merged? Otherwise, @jonhoo or myself can do it (and also make the required shim changes).

ekmartin commented 6 years ago

Sorry about the delay! I've been traveling for the last few weeks.

I've merged in master now. This changes the current None sharding behavior from always going to 0 to round-robin, so if there's actually a use case for sharding on None other than auto increment that'll break (since two separate calls might go to different shards).

I also started looking at returning the auto generated ID from the insert methods. I'll let you know when I have a working prototype. Might take a few more days though as I'm a little short on time at the moment. If it's not an immediate rush I'd definitely like to finish it though!

ms705 commented 6 years ago

No worries, and no huge rush with this -- I don't think it conflicts (much) with the changes we've got incoming in the tokio branch.

I think we need to shard None consistently to the same shard. The reason is that, sometimes, regions of the graph are sharded by a column whose value can be NULL. We want all these NULLs (represented as DataType::None) to be processed at the same shard.

I guess the problem here is that we're overloading DataType::None to mean both NULL and "please put an auto-increment value"; maybe the solution is to have a separate DataType::AutoIncrement value?

ekmartin commented 6 years ago

Better late than never! Update:

ms705 commented 6 years ago

I've had a look over this, and I think it largely looks good!

Why do we need to pass previous_shard around in so many places, including to shard_by? I'm sure there's a good reason, but it wasn't evident to me from reading the code.

Also: this presumably still increases the memory footprint of every DataType due to the two added enum variants?

ekmartin commented 6 years ago

@ms705: The previous_shard is there to be able to do round-robin of the shard given to an auto incremented ID when an AutoIncrementRequest is given, here. The alternative (at least as I see it) would be to have shard_by be stateful and keep track of the last shard within it.

Regarding the memory footprint: I think the size should be the same, given that the tests in data.rs still pass?

I've also started working on implementing support for AutoIncrementID in the shim — I'll make a PR to distributary-mysql as soon as it's ready.