malcomm opened this issue 5 years ago
@malcomm You can attach a Lambda function to a DynamoDB stream and perform your event-based business logic in that Lambda function. The api category exports the stream ARN of every table created by a @model directive, which you can use to subscribe to changes on those tables.
You could use pipeline functions as well to execute the publish logic from within AppSync, but until pipeline functions are fully supported via the api category, this would require custom resources (resolvers and stacks).
@kaustavghosh06 - thanks for the suggestion. I've added a simple DynamoDB trigger that hooks up to a Lambda function I added. This looks great; however, there's one key thing missing: the user. I was thinking that the user identity for the mutation would be on the context, but I'm only seeing this:
2019-07-26T00:28:46.244Z a4847658-23f8-48d9-9aea-2160636490d8 Context: { callbackWaitsForEmptyEventLoop: [Getter/Setter],
done: [Function: done],
succeed: [Function: succeed],
fail: [Function: fail],
logGroupName: '/aws/lambda/LogTableChange',
logStreamName: '2019/07/26/[$LATEST]279a5cd878594842ba6a4b7b70d1e13b',
functionName: 'LogTableChange',
memoryLimitInMB: '128',
functionVersion: '$LATEST',
getRemainingTimeInMillis: [Function: getRemainingTimeInMillis],
invokeid: 'a4847658-23f8-48d9-9aea-2160636490d8',
awsRequestId: 'a4847658-23f8-48d9-9aea-2160636490d8',
invokedFunctionArn: 'arn:aws:lambda:us-west-2:626912780862:function:LogTableChange' }
And I'm not seeing the user identity on the event object either.
I see Lambda documentation indicating that this method can be used to create a permanent audit trail of write activity in your table:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.Lambda.html
How can this be a full audit trail if I don't log the username and the source (the identity)?
Am I doing something wrong? Do I need to enable something to get the identity information over to Lambda?
Any updates on this? This is kind of a game changer for me, because I need the ability to audit a user's actions.
At this point, I think I'm just gonna have to roll my own.
@malcomm off the top of my head, context is the wrong place to look. That's your Node environment context, not the context of your event. You want to look at the first positional argument, the event object.
When you say identity do you need the sub or the cognito:username attribute?
@jkeys-ecg-nmsu - I've looked in both the context and event and I'm not finding anything. The sub would be great, but cognito:username is the bare minimum. At this point I just need something to identify the user (IP address, cognito:username, MAC address, etc.).
Thanks.
Any help on this? I'm really blocked at this point and it is going to impact a production date.
It looks like there's no easy way to get the user identity information using a DynamoDB trigger. Can I add a hook somewhere that will basically wrap the normal mutation call but send the logs somewhere they can be viewed?
@malcomm Are you using the GraphQL transformer to spin up your AppSync API? If so, the transformer automatically adds an owner field to your model which can help you track the user.
@kaustavghosh06 - Yes I am using the GraphQL transformer to manage the AppSync API. When you say automatically, that sounds very nice, but I don't see an owner field anywhere in the data (logs or table).
Amplify version:
> amplify --version
1.7.6
I can't use the latest because of aws-amplify/amplify-cli#922. Not sure if this is even related to the version of amplify-cli.
Do I have to do anything special to get this owner field to show up?
@malcomm I take that back. You can have a mutation and add an owner field from your client app and have auth rules for that field/model. You can define something like the following.
mutation CreateDraft {
createDraft(input: { title: "A new draft" }) {
id
title
owner
}
}
And get back a response like:
{
"data": {
"createDraft": {
"id": "...",
"title": "A new draft",
"owner": "someuser@my-domain.com"
}
}
}
@kaustavghosh06 - Well ... I could add a whole identity field and put all kinds of info in there, but ... doesn't that rely on the client to set the correct thing? From what I can tell, that would be a client-only solution, which is very prone to error and can be modified by a malicious attacker.
I guess what I'm saying is that I think I need a server-side solution that is secure and happens automatically. If we could add an @identity directive to our models and it would automatically put the identity information (username, IP address, maybe session info, etc.) into a field, that would be great. That way, when I get the event in my Lambda trigger, all of that would be there.
Another way, would be to have this information automatically forwarded and placed on the event to be consumed downstream.
I'm open to whatever works.
Thank you.
@malcomm What does your schema look like? And the function you've mentioned out here - aws-amplify/amplify-category-api#404 - that's the trigger function and not a @function resolver, correct?
@kaustavghosh06 - Correct, that information is coming from the DynamoDB Lambda (via the trigger).
Here's a section of my schema:
type StudyEncounter
@model
@auth(rules: [
{ allow: groups, groups: ["admin"] },
{ allow: groups, groupsField: "groupsCanAccess" }
])
@key(name: "StudyEncounterSubjectId", fields: ["studyEncounterSubjectId"], queryField: "encounterSubjectId")
{
id: ID!,
studyEncounterSubjectId: ID!,
...
groupsCanAccess: [String]
}
I currently do not have an owner or a field for identity.
@malcomm Did you check out the context object available in the resolver? You can modify your auto-generated resolver to add the user information to DDB. You can find a reference for the context object available in a resolver here - https://docs.aws.amazon.com/appsync/latest/devguide/resolver-context-reference.html
The context object in the resolver has the following information:
{
"sub" : "uuid",
"issuer" : "string",
"username" : "string",
"claims" : { ... },
"sourceIp" : ["x.x.x.x"],
"defaultAuthStrategy" : "string"
}
@kaustavghosh06 - If we're talking about the Lambda function that's configured on the DynamoDB table, I looked at the event and context object and neither had user identity.
But something you wrote got me thinking. In my schema, could I define an identity field and use an @function resolver to grab the user's identity info?
No, I’m talking about the actual VTL resolver and the context object available to it.
I don't have a custom VTL resolver for this. Honestly, it's unclear to me how to add one that works with the framework. Any help on that?
The GraphQL transformer auto-generates the resolvers for you based on your schema, but if you want something custom - like your use case - you can override the auto-generated resolvers located in your amplify/backend/api/<api-name>/build/resolvers directory.
Refer to this documentation - https://aws-amplify.github.io/docs/cli-toolchain/graphql#custom-resolvers for learning more about using/implementing custom resolvers.
Also, overwriting your auto-generated resolver according to this doc - https://aws-amplify.github.io/docs/cli-toolchain/graphql#overwriting-resolvers should help in your case.
@kaustavghosh06 - OK so just to be sure I'm doing this right. For my resource StudyEncounter, I would need to add a new field for the user or identity, and then I would need to put the following two files into amplify/backend/api/<api-name>/resolvers:
I am assuming that I copy both of those files from amplify/backend/api/<api-name>/build/resolvers and modify accordingly? Basically set the identity field to what I want?
Correct.
@kaustavghosh06 - OK so I'm no expert at VTL, I admit that (first time with it, actually). But I'm rather confused by the results I'm getting. I copied over Mutation.updateStudyEncounter.req.vtl and I'm trying to figure out where to set my new field, called identity. I looked at the section where updatedAt and __typename are being set and I'm thinking that's a good place to start. So I do something like this:
...
## Automatically set the updatedAt timestamp. **
$util.qr($context.args.input.put("updatedAt", $util.defaultIfNull($ctx.args.input.updatedAt, $util.time.nowISO8601())))
$util.qr($context.args.input.put("__typename", "StudyEncounter"))
$util.qr($context.args.input.put("identity", $ctx.identity))
...
No matter what I do, identity ends up being NULL. I tried $context.identity and even the entire $ctx ... all NULL.
Am I just doing this wrong? Why is $ctx NULL?
And just to be sure, my custom resolver is being utilized, because I'm getting this error:
ERROR Error: Uncaught (in promise): Object: {"data":{"updateStudyEncounter":null},"errors":[{"path":["updateStudyEncounter"],"data":null,"errorType":"MappingTemplate","errorInfo":null,"locations":[{"line":2,"column":3,"sourceName":null}],"message":"Expected JSON object for attribute value '$[update][expressionValues][:identity]' but got 'NULL' instead."}]}
at resolvePromise (zone.js:852)
at zone.js:762
at rejected (tslib.es6.js:69)
at ZoneDelegate.push../node_modules/zone.js/dist/zone.js.ZoneDelegate.invoke (zone.js:391)
at Object.onInvoke (core.js:26769)
at ZoneDelegate.push../node_modules/zone.js/dist/zone.js.ZoneDelegate.invoke (zone.js:390)
at Zone.push../node_modules/zone.js/dist/zone.js.Zone.run (zone.js:150)
at zone.js:910
at ZoneDelegate.push../node_modules/zone.js/dist/zone.js.ZoneDelegate.invokeTask (zone.js:423)
at Object.onInvokeTask (core.js:26760)
OK I think I've got it ... my editor was clobbering end parentheses ... basically I was missing a ")" and the error message made it look like I was just getting NULLs.
@malcomm Glad you got that figured out. Did you get all the identity info that you needed?
Also of note ... I had to do this:
$util.qr($context.args.input.put("identity", $util.toJson($ctx.identity)))
Without the $util.toJson call ... nothing works. I guess the put is only able to handle a String.
Also, I was trying to store this data as an AWSJSON. I was probably doing something wrong, but the documentation is not great and I could not get that to work very well at all. I tried $util.toJson and things got strange when pulled out of the DB. Also, I tried this:
$util.qr($context.args.input.put("identity", $util.dynamodb.toDynamoDBJson($ctx.identity)))
That came back null ... anyway, I think after many, many hours I finally have the identity being stored on a single table ... so very painful. Honestly, this is just something that should be handled, but yeah ...
@malcomm If you protect your model with an auth rule with owner authorization, we auto-populate the table with user info, but since you didn't have it, that's why you had to use custom resolvers and deal with VTL. We're soon releasing local testing for your AppSync APIs and resolvers as part of the Amplify CLI to make it easy to debug your APIs - including your VTL resolver code.
@kaustavghosh06 - I got the data ... this is far from ideal, but it might work for my solution.
My 2 cents: something needs to change here to help people out. Trying to audit a user's actions should be very, very easy. What I have now is a hacked-up band-aid that might work ... I mean, taking a step back from all this, I very much doubt I am the only one who needs to be able to audit a user's actions. I would really like to see something first-class to support this.
@malcomm You are always able to use AWS CloudTrail to audit API calls made against your AWS account.
Otherwise, we are working towards making it easier to add pipeline functionality to API projects that would enable use cases like this. In the future the goal is to be able to create a function named "Audit" and then make it easy to compose that function into any mutation that you want to audit. Do you agree that generalized support for pipeline functions would help in this situation?
Also, to clear up any confusion about the VTL mentioned above, here is some more explanation. This is a simplified version of the default createX mutation resolver.
## We can add identity information by setting the key in the input **
## This works because we call $util.dynamodb.toMapValuesJson($ctx.args.input) below. **
## This will store all information contained in the JWT as a Map **
## in a single DynamoDB attribute named identity **
$util.qr($ctx.args.input.put("identity", $ctx.identity.claims))
{
"version": "2017-02-28",
"operation": "PutItem",
"key": {
"id": $util.dynamodb.toDynamoDBJson($util.autoId()),
},
"attributeValues": $util.dynamodb.toMapValuesJson($ctx.args.input),
"condition": {
"expression": "attribute_not_exists(#id)",
"expressionNames": {
"#id": "id",
},
},
}
Once you have added the new identity field, you can add support in the schema pretty easily as well.
type Identity {
sub: String
iss: String
# etc.
}
type Post {
id: ID!
title: String!
identity: Identity
}
You can then list Posts:
query listPosts {
listPosts {
items {
id
title
identity {
sub
}
}
}
}
And get back results that look like:
{
"id": "7a29036a-e3df-458b-b966-fa9e4d6d5ae4",
"title": "Hello, world!",
"identity": {
"sub": "c24611ed-91ba-4a63-a591-74576a346be3"
}
}
@mikeparisstuff - I was looking at CloudTrail and at first it looked great, but it seems to be geared towards just auditing the administration ... not the actual use of the API. I could not find a way to make that work. Am I just missing a setting?
Just to be sure, this would need to log events (mutations in this case) to CloudTrail for users logged in via Cognito. I would be really happy if that was the case.
But honestly, CloudTrail is exactly what I am looking for - it gives me all the tools that I need. If this can be used for the general use of the API ... that would be so great. +1000 for this.
@malcomm CloudTrail would be able to tell you when calls are made to your data sources but the identity from cloudtrail will be that of the role that AppSync assumes when calling your DynamoDB table. In other words they are not specific to your logged in Cognito users as you correctly called out.
I have updated my answer above to give a working example for what you are trying to do. Hopefully this helps clarify things.
@mikeparisstuff - technically I understand your answer ... but stepping back again ... how does this make sense? I have to say, having an audit trail (CloudTrail) lose the context of what user performed the action ... I can't understand how that makes sense.
My 2 cents: appsync should "stash" the user identity information to be used for all logging and for CloudTrail. Make this the "source identity" or something, because both are important.
@malcomm I don't disagree with you that this is useful but this lies a bit outside of the traditional flow when using CloudTrail. CloudTrail keeps an audit log of all activities performed against your AWS resources and, in general, requests are signed with a SigV4 signature which CloudTrail uses to pull identity information out of.
I have to say, having an audit trail (CloudTrail) lose the context of what user performed the action ... I can't understand how that makes sense.
The question here is: what is the context of which user performed the action? From AppSync's perspective, it is aware of the Cognito User Pool, OIDC endpoint, etc. From CloudTrail's perspective, everything is a SigV4-signed call. When making a call to a data source, AppSync assumes a role in your account and is able to use that role to sign a request to send to DynamoDB on your behalf. CloudTrail is able to pick up on this and understands the "user" to be the IAM role that was used to actually sign and issue the request to AWS.
I will need to investigate if it is possible to add custom identifying information to CloudTrail as you are requesting but this would be a longer term enhancement. In the meantime, you have the ability save custom identification information such as attributes in your JWTs using resolvers in AppSync.
@mikeparisstuff - thank you for looking into this. Not sure if this helps with the priority or not, but I'm looking at this:
https://docs.aws.amazon.com/appsync/latest/devguide/cloudtrail-logging.html
AWS AppSync is integrated with AWS CloudTrail, a service that provides a record of actions taken by a user, role, or an AWS service in AWS AppSync.
Per that, I would say that this violates the contract of that documentation. I would also say that it violates the KISS principle ... that is, as a customer of these services, I would expect CloudTrail to log the user that actually initiated the call.
@kaustavghosh06 or @mikeparisstuff - I see that this got moved to a feature-request. I'm trying to plan for a production date and I'm trying to see if I need to put in place an interim fix for this or if these changes could be done before my date. Any idea on the time horizon for this feature?
Also, I'm looking at the information that is in CloudTrail and I don't seem to be seeing the events that pertain to AppSync/DynamoDB mutations. From my Lambda that's trapping DynamoDB, I'm seeing this:
2019-08-01T21:22:01.172Z 99ead3e7-4f57-421e-8d18-77fa13312b6b Received event:
{
"Records": [
{
"eventID": "29dd1b8e3ef95319f7eedd4d8a54d3ba",
"eventName": "MODIFY",
"eventVersion": "1.1",
"eventSource": "aws:dynamodb",
"awsRegion": "us-west-2",
"dynamodb": {
When I go over to CloudTrail, I'm not seeing this event ID (29dd1b8e3ef95319f7eedd4d8a54d3ba) anywhere. I also do not see any event related to AppSync/DynamoDB mutations at all.
I'm logged in as Administrator and using CloudTrail. I'm assuming that the admin account has access to all user information here?
@kaustavghosh06 or @mikeparisstuff - any updates on this? Thank you
@kaustavghosh06 / @mikeparisstuff - I know this is marked a feature-request ... but isn't this more of a bug? (because the system is not acting as it should?)
@malcomm we're interested in doing something very similar. I'm curious to know where you ended up. My gut is telling me to wait until pipeline resolvers (aws-amplify/amplify-category-api#430) are available.
@kaustavghosh06 / @mikeparisstuff - any updates on this?
any further updates on this @mikeparisstuff ?
+1 .. in general it seems like audit is a popular feature ask but not yet supported.
@kaustavghosh06 / @mikeparisstuff - it's been over a year since I first submitted this, and I was hoping for some indication of whether or not this is going to get any support?
@kaustavghosh06 / @mikeparisstuff - just putting another ping on this. Would love to know if this is going to get support or not?
+1
+1
Hi guys, we needed this feature and couldn't wait for it, so we implemented our own transformer. Firehose can be very useful in this case, and much more. It's open source, so feel free to use and contribute. https://github.com/LaugnaHealth/graphql-firehose-transformer
@dror-laguna , that is great , do you know if it will work if you are already using lambda resolvers for some fields?
hi @tgjorgoski, yes, it should work with a @function resolver. We tested it, but we don't use it much ... so I can't be 100% sure.
@malcomm did you end up implementing this on your own? I've had several attempts during the last year but never managed to work around it with a proper and scalable solution.
Did amplify team communicate in any way regarding this in other threads maybe?
@konradkukier2 we implemented and open-sourced it: aws-amplify/amplify-category-api#404
@dror-laguna Thanks a lot! I've seen it just today and we're already planning time to give it a try in the next sprints. Fingers crossed :crossed_fingers:
Which Category is your question related to? Amplify-cli / Appsync
What AWS Services are you utilizing? api / graphql, hosting, DynamoDB, cognito
Provide additional details e.g. code snippets
I asked a similar question here:
https://forums.aws.amazon.com/thread.jspa?threadID=305061&tstart=0
But I feel like I'm not really getting a great answer specific to my application's setup. I'm using amplify/appsync and trying to figure out a good way to handle this generically. I have a handful of resources that I need to audit.
I am trying to find an effective way to track changes performed using AWS AppSync. By audit or track, what I mean is:
I have a need to track what a user does in the system. So at a minimum:
who (username)
what (mutation)
when
where (source IP or other info)
why (this is more than likely going to be entered by the user)
I'm sure there is more, but if I can get that, that would be huge.
The suggestion is to add pipeline resolvers ... but I'm having trouble figuring out how to manage that with my schema.graphql. Some questions:
Do I just add files to my resolvers & stacks? How do I just add my pipeline and still utilize the standard resolvers? Is there a way to write a generic pipeline resolver that will handle all mutations in the system?
I have a vague idea of how to implement this (the documentation is not clear), but I'm also thinking that this is the wrong tool. It feels like this kind of thing should be audited at a different level.
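For what it's worth, the minimum audit record described above (who/what/when/where/why) can be framed as a plain object; here is a sketch, with all field sources hypothetical:

```javascript
// Sketch: a minimal audit entry covering the who/what/when/where/why
// fields described above. Inputs are hypothetical placeholders that
// could come from $ctx.identity and the mutation name.
function buildAuditEntry({ username, mutation, sourceIp, reason }) {
  return {
    who: username,                  // e.g. Cognito username
    what: mutation,                 // e.g. "updateStudyEncounter"
    when: new Date().toISOString(), // ISO-8601 timestamp
    where: sourceIp,                // caller source IP
    why: reason || null,            // user-supplied justification, if any
  };
}
```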
Any help is greatly appreciated.