neumino / thinky

JavaScript ORM for RethinkDB
http://justonepixel.com/thinky/
Other
1.12k stars 128 forks source link

`hasMany` relations and deleting documents. #356

Closed AshHeskes closed 9 years ago

AshHeskes commented 9 years ago

I'm not sure how deleting documents on a hasMany relationship should work.

Here's an example.

var userModel = thinky.createModel( 'user', {

    name : String   
} );

var postsModel = thinky.createModel('post', {

    user_id : thinky.types.string().required(),
    body : String   
});

userModel.hasMany(postsModel, 'posts', 'id', 'user_id' );
postModel.belongsTo (userModel, 'user', 'user_id', 'id');

var user = new userModel({
    name : 'Ash',
    posts : [
        { body : 'a post' },
        { body : 'another post' }
        { body : 'yet another post' },, 
    ]
});

user.saveAll({ posts: true }).then(function(user){

    // I want to delete one of the posts.

    // Lets just try chopping the number of posts.
    user.posts.length = 2;

    user.saveAll({ posts : true }) // -> returns an error `user_id` is required. and I think only deletes the relation.

    // remove and delete the post first?

    var removed_post = user.posts [ 1 ];

    user.posts.length = 1;

    removed_post.delete().then(function(){

        user.saveAll({ posts : true }) // -> returns an error `user_id` is required.    
    });
});

What's going on here? Clearly thinky is trying to cast something to and instance of postModel and validate it. But why? Are the relations for the userModel not update when calling postModel.delete() thus it's trying to re-save the deleted but related postModel by removing it's required user_id field?

The above case maybe seem a tad simplistic but if you put that in the context of an put controller. There maybe many updates to the users posts including ones being added, updated and removed in a single transaction.

What is the correct way to handle this?

neumino commented 9 years ago

Ok, so there are a few issues with your code and the code can throw in different places

user.saveAll({ posts: true }).then(function(user){
    user.posts.length = 2;
    return user.saveAll({ posts : true })
});

This is will throw because your remove a post from the relation, so we are going to remove the relation between user and post[2]. Since the foreign key is saved in post[2] as user_id, we are going to set it as undefined, and since you say that this field is required, thinky will throw. So this is working as expected.

2.

user.saveAll({ posts: true }).then(function(user){
    var removed_post = user.posts [ 1 ];
    return removed_post.delete().
}).then(function(){
  user.saveAll({ posts : true }) // -> returns an error `user_id` is required.    
});

Here what happens is that we delete removed_post from the database, and we remove user but only on the surface, meaning that user.posts.length is 2. The thing is we keep references in joined document somewhere else to track updates/deletions, and I think doesn't properly clean it. It should probably be done here: https://github.com/neumino/thinky/blob/master/lib/document.js#L1544

The way to work around is once you call removed_post.delete(), get your user again before saving it again. One reason I probably didn't think about this case is that it's probably not common.

What's your real use case behind?

neumino commented 9 years ago

Still tagging it as a bug, since it's one I think.

AshHeskes commented 9 years ago

I agree in that I think it is a bug too.

The use case is pretty simple in that if you're creating an API with a PUT method where the sent resource contains the joined documents. It makes sense that the controller will create any newly added documents update any existing ones and delete any missing ones.

Here's maybe a clearer example than a user and posts.

var postModel = thinky.createModel({

    title : String,
    body : String   
});

var tagsModel = thinky.createModel({

    post_id : thinky.types.string().required,
    name : String   
});

postModel.hasMany(tagModel, 'tags', 'id', 'post_id'  );
tagModel.belongsTo(tagModel, 'post', 'post_id', 'post' );

app.post('/posts/', function(request, response){

    var post_data = _. pick (request.body, 'title', 'body'),
        tags_data = request.body.tags,
        post,
        tags;

    tags = _.map(tag_data, function(tag){

        return new tagModel( _. pick(tag, 'name' ) ); 
    });

    post_data. tags = tags;

    post = new postModel( post_data );

    post.saveAll({ tags : true }). then(function( post ){

        response.status(201).send(post);
    });
});

app.put('/posts/:post_id', function(request, response){

    var post_id = request.params.post_id,
        post_updates = _.pick(request.body, 'title', 'body'),
        tag_updates = request.body.tags;

    postModel.get(post_id).getJoin({ tags : true }).then(function(post){

        var tags;

        // At this point tags could contain updates to existing tags, have added new tags
        // or removed them.
            tags = _.map(tags, tag_data, function(tag_data){

                var tag = _.findWhere(post.tags, { id : tag_data.id });

                if ( tag_exists ) return _.extend( tag, tag_data );
                else return new tagModel(tag_data);
        });

        post.tags = tags;

        post.saveAll({ tags : true }).then(function(){ /*...*/ });
    });
});

To do this currently would require something like this code in the update controller.

app.put('/posts/:post_id', function(request, response){

    var post_id = request.params.post_id,
        post_updates = _.pick(request.body, 'title', 'body'),
        tag_updates = request.body.tags;

    postModel.get(post_id).getJoin({ tags : true }).then(function(post){

        var tags = [],
            removed_tags = [];

        // Update existing tags and collect removed tags...
        _.each(post.tags, function(tag){

            var tag_update = _.findWhere(tag_updates, { id : tag.id });

            if ( tag_update ) tags.push(_.extend( tag, tag_update );
            else removed_tags.push(tag);
        });

        // Add new tags.
        _.each(tag_updates, function(tag_update){

            var tag_exists = _.findWhere(tags, { id : tag_update.id });

            if ( tag_exists ) return;
            else tags.push( new tagModel( tag_update ) );
        });

        tagModel.getAll(_.pluck(removed_tags, 'id'), { index : 'id' }).delete().run().then(function(tags){

            postModel.get(post_id).getJoin({tags: true}).then(function(post){

                response.status(200).send(post);    
            });
        });
    });
});

It just seems to me that the using joins in this way is the common use case. Obviously having to make 3 db hits to do that update is a bit expensive when it can be avoided (although i'm sure you could solve this problem with ReQL directly. At the cost of loosing all thinky features).

Also I find it very rare that there would be many circumstances where one would want a hasMany/belongsTo relationship to exist but allow documents to exist where that relationship doesn't. e.g. the post_id field in tags is likely always required() in most cases that I can imagine.

AshHeskes commented 9 years ago

Actually thinking about it. Maybe a nice solution exists in the above.

If a joined documents joined field is required. Calling saveAll will delete that document when the relation is broken. Similarly if the joined field is not required it will save the document removing the joined field.

neumino commented 9 years ago

Fixed in 2.1.9

AshHeskes commented 9 years ago

Great. It would be great if you could clarify what the fix includes?

  1. Does it fix broken backlinks so deleting a joined document will no longer cause an error when calling saveAll on the parent document?
  2. Did you implement my suggestion of deleting a joined document when the parent is saved and the relation is broken because its joined field is required()?
  3. Anything else to look out for?

Thanks for your hard work :+1:

neumino commented 9 years ago
  1. Yep
  2. Nop
  3. Nop :)