rethinkdb / rethinkdb

The open-source database for the realtime web.
https://rethinkdb.com
Other
26.76k stars 1.86k forks source link

JS driver: isConnection should be duck typed #4316

Open stuartpb opened 9 years ago

stuartpb commented 9 years ago

In a similar vein to #4315, it would be nice if .run validated connections via duck typing rather than instanceof, as that would allow:

stuartpb commented 9 years ago

Real-life example on that first point: I use https://github.com/stuartpb/endex to construct my database's db, index, and table creation queries (which are not executed if the db/index/table already exists). endex's package.json includes the dependency "rethinkdb": "^1.16.0", so it will not use any rethinkdb module greater than or equal to 2.0.0.

I include endex in an app that has the dependency "rethinkdb": "^2.0.0-1". I get a connection using r.connect(cfg.rethinkdb).then(conn=>{endex.db('myapp').run(conn)}. The prototype of conn is the Connection object from rethinkdb@2.0.0-1, but the query used by endex is comparing it against the Connection object from rethinkdb@1.16.2. Since these are not the same object, .run decides conn is not a valid RethinkDB connection, and throws up an error.

(If the notion of trying to use a value from 1.16 against 2.0 bothers you, imagine the dependencies are "rethinkdb": "2.0.0" and "rethinkdb": "2.0.0-1", which would still yield the same problem.)

marshall007 commented 9 years ago

I think this is a duplicate of #3263. You should be able to specify rethinkdb in the peerDependencies of your package.json. That way the caller's app and your package share the same instance.

stuartpb commented 9 years ago

You should be able to specify rethinkdb in the peerDependencies of your package.json. That way the caller's app and your package share the same instance.

That's a workaround for the first point for some use cases (it's not suitable for modules that are self-contained), but it does nothing to address the second point.

marshall007 commented 9 years ago

You're right, it's not a solution and I think it deserves a fix too. What do you mean by self-contained modules?

As for your second point, I think it would be difficult to manage a connection pool as an abstraction over the current Connection interface without significant changes. Connection pooling is being discussed in #281 so you may also want to mention over there that you're interested in the ability to pass in your own pool implementation and/or low level APIs for managing connections in the pool.

stuartpb commented 9 years ago

What do you mean by self-contained modules?

A number of different scenarios where a package is designed to provide rethinkdb without a peer installing it, like rethinkdbdash or thinky.

As for your second point, I think it would be difficult to manage a connection pool as an abstraction over the current Connection interface without significant changes.

Well, yeah, that's the idea: this would (in its ideal form) define a simpler abstraction for Connection that wouldn't be reliant on digging far into the internals of the AST for its interface. (That said, would it be that hard to implement a connection pool on top of the current connection interface? It seems to me you could just create a number of internal Connection objects in an array and then proxy the _start method to call out to each of them in rotation.)

Connection pooling is being discussed in #281 so you may also want to mention over there that you're interested in the ability to pass in your own pool implementation and/or low level APIs for managing connections in the pool.

Yeah, I'm familiar with #281, which is why I used connection pooling as an example - there are other useful extensions that would be easier to implement on top of an abstract connection interface, like #4315.

neumino commented 9 years ago

Just dropping a few thoughts. I used to be think that this was a bad idea. I think it's fine and actually has valid situations now (especially since there are more and more packages using rethinkdb). It's also something that would let me create fake connections for reqlite :-)

That being said, I just want to clarify something. The connections in rethinkdb and rethinkdbdash are not compatible and won't be (except if one of the driver changes a lot) because the way these two drivers build the query are different. rethinkdb call build and build the query at run time, while rethinkdbdash build it when each term is called (the query is stored in _query.

stuartpb commented 9 years ago

Maybe this should be split into two issues:

danielmewes commented 9 years ago

I think we already have https://github.com/rethinkdb/rethinkdb/issues/3263 for

Duck typing to make rethinkdb play nicer with connections created by similar versions of rethinkdb (easy)

So I suggest that we keep this about

Designing an abstract connection interface, to allow rethinkdb's query builder to play well with alternative connection implementations, and vice versa (hard)

(Edit: I missed that you had already opened https://github.com/rethinkdb/rethinkdb/issues/4317 as a separate issue. So let's use this to discuss a work around.)

Having an official abstract connection interface would be nice. I think it will take a little until we find the time to design one though. For now I suggest that we make the current run function work with other classes without explicitly specifying the expected interface, so that users who know what they are doing can mimic the existing connection interface in their own classes and pass them to run.

To provide a sufficient amount of type safety, I suggest that we add a special method or magic constant to the connection type that we can call / look at to make sure that the passed object is actually a RethinkDB connection. If someone wants to use their own connection implementation, they'd just need to duplicate that magic constant. At the same time it would protect against accidentally passing in something other than a connection and getting a bad error message.

danielmewes commented 9 years ago

(Also pinging @deontologician)

deontologician commented 9 years ago

I think duck typing should be sufficient. The only place we test the instanceof for a connection is in .run, and it's only a belt and suspenders check. We'd otherwise raise an exception if the thing passed to .run didn't have a _start method, which I think is exactly what this issue is suggesting we do. It's a private method (starting with an underscore) so it should be clear that the interface isn't guaranteed to be stable.

tl;dr: I think we should just drop the isConnection check in run and not worry about the magic number stuff. isConnection should remain as it is since it may be used in client code somewhere.

danielmewes commented 9 years ago

tl;dr: I think we should just drop the isConnection check in run and not worry about the magic number stuff. isConnection should remain as it is since it may be used in client code somewhere.

Sounds fine. We should just make sure that we give a decent error message if the _start method doesn't exist.

stuartpb commented 9 years ago

I agree with what @deontologician said about just checking for the existence of _start, but doesn't using isConnection in client code require some equal amount of unstable implementation guts-rooting like r.connect.__proto__.isConnection or require('./node_modules/rethinkdb/net.js').isConnection?

stuartpb commented 9 years ago

Also, I kind of agree with @danielmewes on using a sentinel value, similar to protodef.VersionDummy, instead of a test for the presence of _start to test for compatibility (so that if two different versions expect _start to behave differently, run won't go ahead and charge into an unreliable codepath), but that's with me having no familiarity with how much the internal implementation changes from version to version. It could be that strong integrity around edge cases would end up making the value virtually equivalent to instanceof when it comes to spanning versions.

stuartpb commented 9 years ago

Also, this is a terrible suggestion that is liable to wreck performance, but one possible way to work around this from userland could be to do something like conn.__proto__.__proto__ = Connection (although this still requires getting a connection from the original RethinkDB driver, which seems like a bit of a chicken-and-egg problem).

stuartpb commented 9 years ago

Basically, for potential duck-typing solutions, we have (in order of increasing difficulty/reward):

I'm actually kind of :-1: on the first one, with my layman's understanding of _start, due to its theoretical potential to cause small, unintended problems in a case where the _start implementation is not quite close enough between versions (if I pass in a 1.12 connection to a 2.0 query, will it try to use protobuf? Will the AST be compatible?).

I think rethinkdb should go with the version-magic approach for now (maybe falling back to a _start check with a warning and/or some more sniffing to detect potentially-incompatible connections), maybe cutting new patch releases to earlier compatible minor versions that implement the same version-magic cookie.

Then, later, after #3771 settles, there should be a stable interface for connections to target that would deprecate and supersede the version magic cookie (maybe using a table of translators to convert from the magic cookie's expected interface to the standard interface, if possible).

danielmewes commented 9 years ago

@stuartpb thanks for the great writeup. I agree with you and think option 2 is a good solution for now.