balmbees / dynamo-types

Typescript AWS DynamoDB ORM
231 stars 18 forks source link

unable to scan FilterExpression with HashPrimaryKey #24

Closed singhdeepme closed 6 years ago

singhdeepme commented 6 years ago

I am unable to scan record with filterExpresssion. eg : await test.primaryKey.scan({ limit:20 , FilterExpression: ' is_deleted = 0'});

breath103 commented 6 years ago

Yes, this is not supported for now for several reasons. But I'll eventually implement this for sure.

chillenious commented 6 years ago

This would be a good feature to have. Would you mind elaborate any reasons are you're not supporting this right now (other than, I assume, lack of time)?

breath103 commented 6 years ago

Sure. I mean, lack of time IS one of it but not the one i should list here i guess :)

Fundamentally, I'm convinced that avoiding filter-scan expression as much as possible is the best practice of using DynamoDB, and furthermore, you "SHOULD NOT" use filter-scan on a heavy-loaded production system.

As you guys all know, Scan-Filter can be really expensive. if you scan 1 million records to find 1 record that matches the filter, you're paying for "1 million records read", not the "1 record read".

Beside cost-effectiveness, this could break your system entirely by consuming too much-read capacity and DynamoDB auto scalling (assuming you're using it) won't catchup, So all the read operations would fail, or even worse, kill your web server by holding requests too long. (since dynamodb automatically retries those operations that failed by insufficient capacity. So even if it success eventually, it would take much longer)

So while I'm using this library and DynamoDB on massive scale production (meaning > 1k read / write capacity and wide range of fluctuation depends on traffic) I've successfully avoided using Scan-Filter by utilizing Secondary-Index, and honestly I'm sure that was the only way. And, that's the main reason I didn't made this feature. since a) Seemed dangerous, b) I didn't needed.

Also, you can implement this Scan-Filter on your own application layer by filtering records after "scan", and that won't be that much different from just DynamoDB scan filter with some parallelization, and network bandwidth.

But, Obviously, I don't represent every use cases of DynamoDB and i do think there are some cases that Scan-Filter can be handy. That's why i've called it "Eventually supported".

So, 1) If any of you guys want to add this, i'm more than welcome to receive PRs and review it :) 2) Please share your use-case of Scan-Filter on your projects (especially on "production")! that would be helpful for me to understand the use cases and build API that situation (or not) @chillenious @singhdeepme

chillenious commented 6 years ago

Thanks for your thoughtful response! I actually think it makes a lot of sense to protect users from accidental use. It wouldn't really be a problem in the case I intended it for, but at the same time, even for that it's better to stay safe and follow an alternative route.

Probably good to leave this unimplemented (if people want to, they can use the DynamoDB api directly), but I would suggest to take the FilterExpression out as it currently is confusing to have this in the function signature not backed by an actual implementation.

breath103 commented 6 years ago

yeah having that parameter doesn't work is clearly a flaw. Will push that change soon