apigee-labs / transicator

Distribute Postgres logical replication data to many clients
Apache License 2.0

Build command line tool that can create a pre-built leveldb database to enable fast startup #6

Open kriswehner opened 8 years ago

kriswehner commented 8 years ago

To support clients with long offline durations, we need to maintain a very large local leveldb (for us, 7-10 days of WAL), but we cannot afford the time it takes to load that WAL at startup.

What we need to do is:

gbrail commented 8 years ago

What if we added the ability for a changeserver to dump out its own database so that another changeserver can import it? That might be easier to control than relying on PG -- one changeserver can bootstrap another. (Might be tricky to implement in some deployment environments, however.)

kriswehner commented 8 years ago

The reason I phrased it the way I did is just the operational aspect: I'd like to "cook" a leveldb database and have it warm but on something like s3, rather than doing an on-demand snapshot and then loading it from a peer changeserver.

That said, I was also imagining this as a command line wrapper around the changeserver internals to avoid re-inventing the wheel.

All that said, I don't per se care that much about the implementation as long as there's a clean set of operational steps to get a warm db.

kriswehner commented 8 years ago

@gbrail I've done an initial port to rocksdb using gorocksdb to enable this. I'd love for you to give it a once-over and to hear your opinion. The storage & changeserver tests are all green, but I'm not sure how much confidence you have in them.

https://github.com/apigee-labs/transicator/compare/master...seatme:feature/rocksdb

gbrail commented 8 years ago

Whoa -- big change, but probably a good one.

I've worked with RocksDB before and found that it's reliable and fast. The only problem I had in the past was that it's a pain to depend on, since it's not available on as many platforms as LevelDB. The CockroachDB thing makes that a lot easier, though.

I'm curious, though -- the current "storage" interface just calls LevelDB using the C API. The RocksDB C API is just the LevelDB C API with the functions renamed, so it should be a very easy port. Did you consider just doing that?

(Too many years of Node.js programming have made me a stickler for understanding exactly what code we depend on, and I try not to build a giant dependency tree when I can avoid it ...)


kriswehner commented 8 years ago

My instinct is very much to let someone else smart who's focused on it maintain the C code. Since there's a strong, community-driven database interface that we don't have to maintain, we can focus on the app code. In particular, my engineers don't tend to be C developers, and with gorocksdb everything (including the comparator functions) is in Go, so it's much more accessible.

With all due respect to the Node.js community, I don't think that instinct applies here :). Obviously we need to keep an eye on library quality, etc., but I'm a firm believer in using community implementations.

gbrail commented 8 years ago

gorocksdb looks very nice. Let me take a look at your changes. I want to be a little careful since we have the LevelDB stuff already working pretty well. I also have a busy week. But the changes look like they're for the best, so I will work on it.

BTW RocksDB has column families -- I may want to extend this work to use a separate column family for the "metadata." That also simplifies the comparator which is nice.
