Closed wojcikstefan closed 7 years ago
@thomasst pls review
I was initially thinking we would make the validate_bulk_update
default implementation to do the check, but either way is fine with me.
Yea, I hear ya, though I'd prefer to expose the validation method before we git the database. basically, it's a similar concept to what validate_request
does for single PUTs and POSTs. I think of the extra bulk_update_limit
check as something separate.
Some more thoughts on the validation logic:
validate_request
was called for the PUT request (with obj == None
initially), and you could implement your validation based on the view_method
?validate_bulk_update
could take a list of objs
as a parameter.@thomasst I looked into your suggestion and I don't like the outcome. I find it more confusing than the original implementation (i.e. the one from this PR). Few reasons why:
validate_request
w/o an obj
and then you call it again once per obj - this seems like a poor design, where a single method has many different roles.validate_request
would be inconsistent, because we'd call it before get_objects
for bulk update, and after get_object
for a single PUT.validate_request
's code differentiates properly between single PUT and bulk PUT requests.Given all of the above, I propose we move forward with this implementation.
Okay, I agree that the purpose of the current version of validate_request
should be to validate the input data for the given obj
that is being modified (or None
if we're creating an object). However, the naming of validate_request
is confusing since it implies that it's called once per request, and not once per object, so maybe the naming should be improved. Note that we're already doing a "hack" in the task view so we can read out the update dict and figure out if certain fields were supplied with the request and are being modified.
Secondly, after thinking about it, I'm a bit hesitant of adding another method-specific method (validate_bulk_update
) because it seems very specific to the bulk update (e.g. why not also have validate_update
/ validate_create
?) and can probably be implemented more generically. Also, in this PR you're implementing validation logic in get_objects
and not even using the method to validate the bulk update.
My proposals are:
def put
), or by e.g. overriding get_objects()
and checking for view_method == methods.BulkUpdate
.Yea, I agree adding that extra validation method only makes things more inconsistent. Given time constrains, I'll go with not introducing any new methods.
After giving this another look I'm seeing another important issue: get_objects
now no longer returns a queryset, and the bulk limit validation doesn't subclasses into account.
One of the nice things about mongorest since its beginning has been that you could override get_objects
on the resource and override the parent queryset, like this:
class MyResource(Resource):
def get_objects(self):
return super(MyResource, self).get_objects().filter(status='X')
This now breaks on multiple levels: 1) we can't call .filter()
anymore since the queryset is pre-evaluated, 2) even if we rewrite the method to not rely on a queryset, the bulk validation doesn't take the new filter into account.
I haven't fully thought it through yet, but would keeping get_objects
as a queryset, and implementing the length check in process_objects
work?
I don't think that's true. Only get_queryset
has that guarantee. get_objects
afaik returned a list of objects in most cases.
Primarily to minimize the effects of a poorly constructed request.
After this change, flask-mongorest will by default limit bulk updates to 1k documents. If more than that would be affected, a 400 response is returned.
This PR also introduces a method which you can use to validate the request before processing of a bulk update starts.