aws-amplify / amplify-category-api

The AWS Amplify CLI is a toolchain for simplifying serverless web and mobile development. This plugin provides functionality for the API category, allowing for the creation and management of GraphQL and REST based backends for your amplify project.
https://docs.amplify.aws/
Apache License 2.0
89 stars 79 forks

Total counts in GraphQL queries #405

Open houmark opened 5 years ago

houmark commented 5 years ago

Is your feature request related to a problem? Please describe.

I think it's way past due that Amplify supports total counts in GraphQL queries. Other GraphQL-based platforms have this built in: you simply add totalCount (or similar) to the query and, no matter the limit, you get back the total count in addition to the (filtered) data.

Describe the solution you'd like

This should at least work for DynamoDB-backed models and of course also for search-based queries that go through Elasticsearch.

Describe alternatives you've considered

Making a Lambda function that is a field in each model using the @function directive. But since we are using both listItems and searchItems with filters added, the implementation is not simple, as we have to reapply those filters in the Lambda function to get the correct count.

Making custom resolvers seems like another not very fun route and not very scalable or maintainable, and once again, this should be an "out of the box one-liner" to have available as a developer. With either a Lambda or some other custom resolver I'm looking at hours or days of development.

Additional context

This is a must-have feature and there's not really any workaround for displaying total counts for systems with many items, at least that I know of. I read several bug reports, but none of them seems to have a simple solution. That it has not yet been developed by AWS is beyond my understanding, as pulling out counts is one of the most common things to do when developing web apps.

sacrampton commented 3 years ago

Hi @duwerq - I was trying to say the same thing - must have miscommunicated. I mean you can't rely on auth in your GraphQL schema - you need to make sure the request resolver (VTL) filters based on the same parameters as your auth. I use a lot of dynamic auth, for example, so I make sure the Elasticsearch query is exactly the same so the response resolver totals are the same (and also that search is not returning anything it shouldn't).

By the way, I find auth pretty unreliable across the board - particularly dynamic auth - so I implement security in the response resolvers for DynamoDB too, to accurately return what I need. For example, subscriptions don't (or didn't) work with dynamic auth.

This brings up another point: security in the DynamoDB resolvers is implemented in the response resolvers, and they use a for loop for that. There is an inherent limitation in AppSync of 1,000 iterations in a for loop. So any count operation you would do with AppSync on DynamoDB is going to have to paginate 1,000 records at a time.
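To make the pagination cost concrete, here is a minimal client-side sketch of counting by following nextToken in pages of at most 1,000 items. The Page shape mirrors an AppSync connection, but fetchPage and makeFakeFetcher are hypothetical stand-ins, not part of Amplify; in a real app fetchPage would wrap the generated list query.

```typescript
// One page of results, shaped like an AppSync list/search connection.
interface Page<T> {
  items: T[];
  nextToken: string | null;
}

// Hypothetical page fetcher (assumption): a real one would call the list query.
type FetchPage<T> = (limit: number, nextToken: string | null) => Promise<Page<T>>;

// Count every item by walking nextToken until it is exhausted.
async function countAll<T>(fetchPage: FetchPage<T>, pageSize = 1000): Promise<number> {
  let total = 0;
  let nextToken: string | null = null;
  do {
    const page: Page<T> = await fetchPage(pageSize, nextToken);
    total += page.items.length;
    nextToken = page.nextToken;
  } while (nextToken !== null);
  return total;
}

// In-memory stand-in for the API, used only to exercise the loop.
function makeFakeFetcher(size: number): FetchPage<{ id: string }> {
  return async (limit, nextToken) => {
    const start = nextToken === null ? 0 : Number(nextToken);
    const end = Math.min(start + limit, size);
    const items: { id: string }[] = [];
    for (let i = start; i < end; i++) {
      items.push({ id: String(i) });
    }
    return { items, nextToken: end < size ? String(end) : null };
  };
}
```

Every page is a separate round trip and a separate read, so the time and cost of a count obtained this way grow with the number of matching items.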

PeteDuncanson commented 3 years ago

This thread has now reached the length where posters are repeating points covered way way way back in time. Can we either kill this one off or at least get a glimmer of hope that it might actually get some response from the Amplify Core team? Leaving it just hanging like this for so many years is just painful.

h4rkl commented 3 years ago

Associating the word "impossible" for a query count in 2019 makes me cringe a bit. And more than that, it makes me wonder if selecting Amplify and all its (current) dependents was a very wrong choice.

The fact that DynamoDB does not do counts for its queries (besides a full table estimated count every ~6 hours) is simply a limitation the team working on DynamoDB should solve. Every single competitor to DynamoDB handles this without issues, so I'm sure those smart people can also come up with a solution that does not just benefit AppSync and Amplify users, but also people using DynamoDB directly. Maybe it will be near correct counts if millions and more precise when thousands like MySQL / InnoDB, and that would be way way better than having no clue whatsoever.

I am aware that using the nextToken I can do pagination, but that paginator is somewhat less cool to look at from a UX perspective, as I won't be able to show 1, 2, 3, 4....12 because I don't know how many pages I have. When someone wants to know how many items we have fitting this filter, it cannot be that I have to pull them all out (only the id field) in the leanest way, and then count the array client-side?

I'm sure AWS compares themselves in some ways with other GraphQL services like Prisma etc. and they don't seem to have a problem supporting this.

This is a DynamoDB limitation. Attacking a solution on top of that for AppSync is the wrong angle, this needs to end on the table of AWS DynamoDB developers so they can come up with a sensible solution nearer to the root of the problem — everything else is a hack. Asking me to keep counts in a model/table myself when things update is even worse and not what you'd expect of a platform with an otherwise impressive feature set.

And if it's not possible for DynamoDB to solve this, then the Amplify / AppSync team should start considering built-in support for other major database players such as MongoDB, MySQL, Postgres, etc. so they are not being held down by a half-baked database that is backing the entire thing, but when that is considered, I am sure it looks way more interesting to just figure out a solution to counts and other minor limitations DynamoDB currently has.

I'm stopping dev on Amplify about 5 days in for my project because of this. It's a basic function of any db, and the lock-in with Dynamo isn't worth these issues. Very valuable info, thanks :pray:

renebrandel commented 3 years ago

Hi folks - I wanted to follow up on this thread to give you all more insight into where we're at with this issue and many other GraphQL transformer issues. Over the past several months, we've been laying the foundation to drastically accelerate our GraphQL feature & enhancement velocity. The new architecture is detailed in our GraphQL Transformer vNext RFC, aws-amplify/amplify-cli#6217. More updates will come in the next two months on a preview version for broader public testing.

Once we've delivered on that architecture revamp, we'll be looking into this issue as the top-most feature request priority. As highlighted in prior comments this problem is hard, especially considering scale, by-default best-practice security, and cost-efficiency but we are determined to solve it.

I also want to thank the entire community here for your passion and continuous feedback on this issue. Your energy and feedback are what motivated us to do the revamp of our GraphQL transformer in order to deliver features faster and provide more customizability to help make you more successful.

warlord987 commented 3 years ago

If someone is having an issue with pagination and filters and is looking for a backend as a service, then it would be a good idea to look at Strapi. I was able to get my backend running in 1 day with GraphQL, with almost every filter you can think of and with pagination, including go-to-page.

simpson commented 3 years ago

Also check out Nhost.io. We are using that as our backend with some of our new frontend clients in Amplify and love it.

renebrandel commented 3 years ago

Hey folks! We've included the @searchable enhancements to provide aggregation values (count, sum, avg, min, max) in the new GraphQL Transformer.
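As a rough illustration of what such a query might look like from a client, here is a sketch that asks for a total count plus one aggregate. The Product model and its category and price fields are hypothetical, and the argument and result names (aggregates, aggregateItems, total, SearchableAggregateScalarResult) are written as I understand them from the Amplify search-and-result-aggregations documentation, so verify them against your generated schema before relying on this.

```typescript
// Assumption: a hypothetical "Product" model annotated with @model @searchable,
// with "category" and "price" fields. Adjust names to your own schema.
const searchWithAggregates = /* GraphQL */ `
  query CountAndAveragePrice {
    searchProducts(
      filter: { category: { eq: "books" } }
      aggregates: [{ type: avg, field: price, name: "avgPrice" }]
    ) {
      total
      aggregateItems {
        name
        result {
          ... on SearchableAggregateScalarResult {
            value
          }
        }
      }
    }
  }
`;
```

Here total would carry the number of matching documents regardless of the page size, and the aggregateItems entry named avgPrice would carry the computed average.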

In the new GraphQL transformer, we add a user authorization filter in the OpenSearch query request. (In the past, filtering on the OpenSearch result set happened after the fact.)
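The reason this matters for counts: if authorization is part of the search request, the hit total already excludes documents the caller cannot see. The sketch below builds such a request body by hand. It is not the query the transformer generates; the owner.keyword field and the buildAuthorizedCountQuery helper are assumptions made for illustration.

```typescript
// Minimal types for the slice of the OpenSearch query DSL used here.
type Clause = Record<string, unknown>;

interface SearchBody {
  size: number;
  track_total_hits: boolean;
  query: { bool: { must: Clause[]; filter: Clause[] } };
}

// Build a search body whose hit total only counts documents the caller may see.
// "owner.keyword" is an assumed field name; real schemas may use another auth field.
function buildAuthorizedCountQuery(userFilter: Clause, username: string): SearchBody {
  return {
    size: 0, // return no documents, only the total
    track_total_hits: true, // request an exact total instead of a capped estimate
    query: {
      bool: {
        must: [userFilter], // the filter the client asked for
        filter: [{ term: { "owner.keyword": username } }], // the authorization constraint
      },
    },
  };
}

const body = buildAuthorizedCountQuery({ match: { title: "hello" } }, "alice");
console.log(JSON.stringify(body, null, 2));
```

Filtering after the fact, by contrast, would leave the reported total counting documents that are later stripped from the result set.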

Try the PREVIEW (do not use this in your production environment) with the following instructions: https://github.com/aws-amplify/amplify-cli/issues/6217#issuecomment-929677992

majirosstefan commented 3 years ago

But what about mentioning these issues with @searchable? I guess they are almost all related to costs and they did not go away, simply by closing them:

https://github.com/aws-amplify/amplify-cli/issues/3860 (closed for some reason) • https://github.com/aws-amplify/amplify-cli/issues/3561 (points to https://github.com/aws-amplify/amplify-cli/issues/1816) • aws-amplify/amplify-category-api#165 - Data Loss

Based on that it would seem that @searchable costs 70 dollars per month for each environment in the cloud, which is not quite friendly for developers who are working in small agencies, or devs who are creating just MVPs. (Even serious solo developers prefer to use at least 2 envs, dev & prod, which would cost $140 per month in total - is that correct, please?)

Well, I am not sure how to phrase my question, but really: in 2021, should we pay AWS $140 per month to run ES/OpenSearch in 2 environments just to get counts from the database? Just for testing an MVP in the real world?

Or should we fork Amplify like this: https://github.com/starpebble/amplify-cli to target free Elastic instances, or modify it to ignore @searchable, or even route calls to potentially target different, more dev-and-cost-friendly services, like Algolia?

E.g. in Firestore (I guess it's one of the many Amplify competitors), you can use transactions & distributed counters: whenever you create a 'Post', a property called 'totalPosts' is incremented somewhere else (e.g. in the owner entity and/or in the table that hosts data for all Posts). The same counter can be decremented on a delete operation.

Since 2018 (https://aws.amazon.com/blogs/aws/new-amazon-dynamodb-transactions/) we can use transactions with DynamoDB too, and since 2019 also in AppSync. But for some obviously secret reason, if you use the

amplify mock api

command to run the API emulator locally, you are not allowed to use either transactWrite items, or batchWrites - features that were announced 3 years ago by AWS itself. To my surprise, the pull-request that solves that was created 1 year ago - and it's still not merged.

https://github.com/aws-amplify/amplify-cli/pull/5574 • aws-amplify/amplify-category-api#259

Also, @searchable is not compatible (yet) with mocking API locally, on the local machine:

https://github.com/aws-amplify/amplify-cli/issues/5981 (also, closed) • aws-amplify/amplify-category-api#309

Well, there is this RFC: https://github.com/aws-amplify/amplify-cli/issues/7546 (with the last comment on Jul 31) - but the problem is that it's still an RFC. I am convinced that local mocking that works somehow with @searchable should have been a part of @searchable from the start.

In preview - https://github.com/aws-amplify/amplify-cli/issues/6217 - that you are also mentioning above, there is no mention of transactWrite, nor batchWrite that could be used for such counters. Or am I missing something?
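For reference, the counter pattern described above could be sketched as a single DynamoDB TransactWriteItems request that writes the new item and bumps the counter atomically. This only builds the request object; the table names (Post, Counter), the totalPosts attribute, and the buildCreatePostTransaction helper are hypothetical, and nothing here is something Amplify generates.

```typescript
// Low-level DynamoDB attribute values, limited to the two kinds used below.
type AttributeValue = { S: string } | { N: string };

interface PutItem {
  Put: {
    TableName: string;
    Item: Record<string, AttributeValue>;
    ConditionExpression?: string;
  };
}

interface UpdateItem {
  Update: {
    TableName: string;
    Key: Record<string, AttributeValue>;
    UpdateExpression: string;
    ExpressionAttributeNames: Record<string, string>;
    ExpressionAttributeValues: Record<string, AttributeValue>;
  };
}

interface TransactWriteInput {
  TransactItems: Array<PutItem | UpdateItem>;
}

// Create a Post and increment the shared counter in one all-or-nothing write.
// "Post", "Counter" and "totalPosts" are assumed names for illustration only.
function buildCreatePostTransaction(postId: string, ownerId: string): TransactWriteInput {
  return {
    TransactItems: [
      {
        Put: {
          TableName: "Post",
          Item: { id: { S: postId }, owner: { S: ownerId } },
          // Fail the whole transaction if the post already exists,
          // so the counter is never incremented twice for one id.
          ConditionExpression: "attribute_not_exists(id)",
        },
      },
      {
        Update: {
          TableName: "Counter",
          Key: { id: { S: "Post" } },
          UpdateExpression: "ADD #total :one",
          ExpressionAttributeNames: { "#total": "totalPosts" },
          ExpressionAttributeValues: { ":one": { N: "1" } },
        },
      },
    ],
  };
}
```

A delete would mirror this with a Delete item and an ADD of -1. A single counter item can become a hot key under heavy write load, which is why the Firestore approach mentioned above shards its counters.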

renebrandel commented 3 years ago

@majirosstefan - All the things you mentioned are still on our radar. The new GraphQL Transformer is our first step. Many of the @searchable RFC items are actually included as part of the preview.

A more efficient VTL-based approach is still on our radar and we'll be working on it after the launch of the new GraphQL Transformer. While the Amplify team can't directly change the cost structure of the OpenSearch service, we plan on allowing local mocking of OpenSearch to help customers test locally first before deciding to deploy. Regarding aws-amplify/amplify-category-api#165, we've now added warnings to the customer to follow the OpenSearch-provided best practices guidelines for production configurations.

GeorgeBellTMH commented 3 years ago

@renebrandel it might be useful to make an option such that we don't spin up new opensearch instances automatically when creating new environments...basically just redirect all the queries to "not-implemented in this environment" or something...this would save costs when many environments are being spun up for testing or development purposes...and again, the ideal way to do it would be to allow for putting all environments opensearch on one instance...I would rather have one huge open search instance than many small ones from a cost perspective...

multimeric commented 2 years ago

To those still interested in this issue, I've made a package that provides a @count directive that solves this issue in an idiomatic Amplify way: https://github.com/multimeric/AmplifyCountDirective.

biller-aivy commented 2 years ago

To those still interested in this issue, I've made a package that provides a @count directive that solves this issue in an idiomatic Amplify way: https://github.com/multimeric/AmplifyCountDirective.

So it's not really clear: is it based on scans?

multimeric commented 2 years ago

Currently it uses scans, which I believe is also how listFoo queries work. It should be fairly easy to integrate with the @index directive to use Query over a GSI rather than Scan, but I'd like some help with that feature, in particular because I'm not using GSIs myself: https://github.com/multimeric/AmplifyCountDirective/issues/1
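To illustrate the difference between the two approaches, here is a sketch of the DynamoDB request parameters for a count via Scan versus a count via Query on a GSI. It is not taken from AmplifyCountDirective; the buildCountRequest helper and the index and key names in the usage are hypothetical.

```typescript
interface CountRequest {
  operation: "Scan" | "Query";
  params: Record<string, unknown>;
}

interface IndexKey {
  name: string; // GSI name
  keyName: string; // partition key attribute of the GSI
  keyValue: string; // value to count items for
}

// Build request parameters that ask DynamoDB for a count instead of items.
// Without an index this is a full-table Scan; with one it is a Query on the GSI.
function buildCountRequest(tableName: string, index?: IndexKey): CountRequest {
  if (!index) {
    return { operation: "Scan", params: { TableName: tableName, Select: "COUNT" } };
  }
  return {
    operation: "Query",
    params: {
      TableName: tableName,
      IndexName: index.name,
      Select: "COUNT",
      KeyConditionExpression: "#pk = :pk",
      ExpressionAttributeNames: { "#pk": index.keyName },
      ExpressionAttributeValues: { ":pk": { S: index.keyValue } },
    },
  };
}

// Hypothetical usage: count posts owned by "alice" through a "byOwner" index.
const request = buildCountRequest("Post", { name: "byOwner", keyName: "owner", keyValue: "alice" });
console.log(request.operation, JSON.stringify(request.params));
```

Select COUNT avoids returning item data, but DynamoDB still reads every matching item and pages the work at 1 MB, so a caller has to sum Count across pages by following LastEvaluatedKey. The Query form only touches items under one key, which is where the savings over Scan come from.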

AhmadMraish commented 7 months ago

As of transformer v2, things have changed: a new directive has been implemented to add this functionality. Please refer to the official docs https://docs.amplify.aws/vue/build-a-backend/graphqlapi/search-and-result-aggregations/

biller-aivy commented 7 months ago

As of transformer v2, things have changed: a new directive has been implemented to add this functionality. Please refer to the official docs https://docs.amplify.aws/vue/build-a-backend/graphqlapi/search-and-result-aggregations/

What do you mean? Which one? @searchable? This is not really a solution. I understand that this is a DynamoDB problem rather than an Amplify problem.

sacrampton commented 6 months ago

Hi there, I need to unsubscribe from these emails, as Stephen has passed away. I don't know how to log in to GitHub, so could you please help me unsubscribe him?

Any help would be appreciated.

Kind Regards Megan Hagar (wife of Stephen)

renebrandel commented 6 months ago

Hi Megan,

I'm truly sorry for your loss. Our hearts go out to you and your family during this difficult time.

I'll do my best to turn off your email notifications from this thread, but it likely won't affect notifications from other threads. However, GitHub does provide a policy in case a user passes away. This might offer a solution for unsubscribing from GitHub entirely.

Please know that I'm here to support you in any way I can.

Warm regards, René Brandel