Finalizing Some PubSub Details

In a new issue because, why the hell not.

To recap, we have:

Soft delete (on the _deleted field of every model)
MongoID's (on the _id) field of every model)
Subs on model types (listings) and on individual models (listings:123)
Initial sub on a key results in the entire set for that key being pushed
Object creation via Pub results in an automatic subscription to the newly generated object

Yes, some of these will require some minor changes to the client that I'll take care of.

Right now, relations work via a basic mechanism:

The "relating" object stores the ID of the thing it relates to (for example, an offer has a listing field that points to a listing ID).
The "related" object stores a list of IDs of the relating objects (for example, a listing has a field offers that's a list of offer IDs)
Problems

Nested Objects

Nested objects make diffing a pain and validating an annoyance. Diffing is self-explanatory (you start having to deal with presence, recursing, etc), and validating is fairly clear as well (if (foo.bar && foo.bar.baz)).

Solution? Don't use nesting. Practically speaking this shouldn't be a huge deal as we just probably won't nest much anyway, and there probably won't be a huge benefit to it. Leastwise, not more of a benefit than there is an annoyance/pain.

Relations

Relations suffer from race conditions. If another offer is made in the time it takes for the database to write the first new offer, the clients will miss one of them. Server-side we can do this fine via atomic operations, but the clients will miss out.

Solution: Not sure? The other approach is to remove the field from the model proper (client-side this can be implemented right back as a list, provided it's not sent back to the server) and use more complicated listening keys (listing:123:offers). You listen on that, and it's just like listening on a model type. Should allow code-reuse and be free from race conditions?

Large Data

So right now we're just chucking several megabytes of data back and forth in models, which is obviously not a good idea.

Solution: I have no idea. The issue we face is that the data still has to be there in the initial pub from the client; we can of course strip it out and such in the server but that seems asymmetrical. Maybe that's a necessary evil?

That's all for now; I may have more before too long.

Thoughts?

My thoughts

Can we use id for ids? Backbone expects it, and uses it in several places. I can probably hack the requirement out, but it seems not using more backbone.

I've been using singular keys, eg. listing:20 and offer:333.

The system does not currently auto-subscribe on create. If we're doing that we should chat more about the details.

please consult with me if you digging into the client code. The abstractions are still a little loose at this point, and I just want to make sure you don't do anything with unintended consequences. I'm fine with making whatever changes are required.

Nesting No nesting objects, sounds good to me.

Relationships Currently relationship changes are one-sided. Take listings and offers. listing.offers is never modified directly. Rather new offers are created with a listing field, and old offers are deleted. This results in a change to the listing, that the server performs and publishes. Unless I'm missing something, continuing along these lines should mean the data consistency is in the hands of the server/db.

However we should discuss race conditions in general. For example: A user makes an offer, then decided to withdraw (delete) that offer. At the same time the seller decides to accept the offer. How do we deal with this?

My thought is that the first change should always be accepted, and the second change, made without knowledge if the first, should be rejected. In order to do this modified timestamps would have to be passed with every change, and only assigned by the server. If we do operate this way we need to work out standard messaging to deliver these errors to the client.

Large Data Afaik, the only large data is the initial publish of a listing, that contains an image. After that the image should be converted to a url, which is small. Of course, right now that's not the case, since image handling isn't working on the server, but once it is it shoudn't be a problem. Unless I'm misunderstanding the problem.

Interesting, email comments don't use markdown...

ID's

My thought was that every model has a MongoID in the _id field, and that any model with a "pretty" ID would throw that into the id field. In the data layer in the client, we could just program it so that any model that's pub'd without an id just automatically sets the id field from _id. Would that work?

Singular model types are good; not sure why I was pluralizing them here.

I definitely won't make any client-side changes without consulting to make sure everything's fine and dandy.

Relations

The one-sided thing works well, but the main issue is that we don't have a good way to diff on the arrays. Whenever we add/remove a related offer, for example, we have to push the entire foo.offers array in the change. If two changes are made in a close timespan, it'll result in two different arrays being pushed, each missing the other change. If we had a way to push/pop those arrays then we'd be golden, but we don't; and building one seems like it'd be more trouble than it's worth? It's not an issue of data consistency on the server side, fortunately, but more that clients may miss the occasional update or delete.

I'm down with optimistic locking on all objects. Should work nicely.

Large Data

Yeah, it's not a problem if we do it that way; the only concern I had was that it sort of breaks the pub/sub metaphor, since the object that's being pushed is not the resulting object. It's less like a data push and more like a function call.

Misc

The more I've been thinking about this, the more I feel pubsub isn't quite the ideal strategy for what we're doing (granted, it might still be the best one out there). As we're working out various edge cases it feels like we're starting to move further and further away from simple model syncing. Is there a better way? I dunno.

Thinking out loud: What if we went with a combination of RPC and server side push? Essentially, ditch pub from the client side, and make the client's interactions all work by calling remote methods that always return something. For example, we have the same sub and unsub functions that return true if the object exists and false if it doesn't. We also have a create function that lets us create objects, and returns the id of the thing we just created. We then have a modify function that modifies objects; this returns true if it worked or false if the object was deleted or something. If the modify works, and the client is subbed to that object, the server pushes the resulting change down the line via a pub method. It's a bit more complicated I guess, but seems like it's closer to what we want to accomplish?

Shopcaster / hs-client