reapit / foundations

Foundations platform mono repo

We are receiving multiple webhooks when a record is updated. #5229

Closed jennyCognio closed 2 years ago

jennyCognio commented 2 years ago

There are two issues that may be related.

  1. We are receiving multiple webhooks when a record (contact / applicant / property) is updated. Up to 15 within the space of a minute or two.

This doesn't happen every time, but when it does they can be within seconds of each other. Far faster than a user would be clicking save. Each one triggers an API call (sometimes multiple calls if we need to get embedded data). We're not sure if the webhooks are being received multiple times, or if they're firing each time an individual piece of data is updated on the record rather than when the save button is clicked. My concern is that if the webhooks are firing separately to the save button being clicked, the negotiators may not have wanted that data to be published / it may not be complete. Additionally, there may be cost implications that need to be discussed separately.

  2. API calls are being rate limited, even though their rate is determined by the volume of webhooks we receive.

When a contact / applicant / property is added to or updated in Reapit, the webhook triggers an API call so we can update the website/database. We've had several instances (the latest being around 12:40 on 11/10/21) where our API calls returned the error "Request failed with status code 429". We got hundreds of errors within the space of 5 mins or so.

Ollie has said this may have been caused by a hosting issue, but it's happened several times and the rate limit should be set high enough to accept the API calls that are triggered by your own webhooks. Hopefully if point 1 is a bug, this will become a moot point. If point 1 isn't a bug, then this needs to be looked at.

Thanks Jenny

Specification

cbryanreapit commented 2 years ago

Hi @jennyCognio

Multiple webhook events

It is by design that some update actions taken by a user will result in multiple webhooks being emitted.

For example, if a contact's name is updated, the platform will emit a contact.modified webhook event, as you would expect. At this point - subject to your app's webhook event subscriptions - the platform will also send modified webhooks for any applicant, vendor or landlord that the contact is associated to.

This happens because we include contact information as part of the payload/representation for associated applicant/vendor/landlord entities and therefore the platform also considers them to be modified, requiring a webhook to be emitted.

If you aren't already, I would also suggest you consider using the "Ignore notifications where only the eTag has been modified" webhook configuration setting, available in the webhook management area in the developer portal.

Rate limits

Your application must adhere to our documented rate limits regardless of how the request is triggered.

If your application is hitting 429s because you're synchronously responding to webhook events, you might consider adding a queueing mechanism into your architecture, allowing your app to receive webhook events at a different pace to processing them.

This approach has been adopted successfully by other developers on the platform. It allows your app to capture all incoming events but respond to them in a controlled fashion that adheres to the platform's rate limits.
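The queueing approach described above could look roughly like the following. This is a minimal in-memory sketch in Python, not a Reapit-provided implementation: the class name WebhookBuffer, the max_calls_per_second parameter and the injectable sleep function are all hypothetical, and the real permitted request rate should be taken from the documented rate limits. A production setup would more likely use a durable queue so that events survive a restart.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict


class WebhookBuffer:
    """Accept webhook events immediately and process them later at a capped rate.

    Hypothetical helper for illustration only; not part of any Reapit SDK.
    """

    def __init__(self, max_calls_per_second: float) -> None:
        if max_calls_per_second <= 0:
            raise ValueError("max_calls_per_second must be positive")
        # Minimum pause between two outgoing API calls.
        self._min_interval = 1.0 / max_calls_per_second
        self._pending: Deque[Dict] = deque()

    def receive(self, event: Dict) -> int:
        """Called from the HTTP handler: store the event and acknowledge at once."""
        self._pending.append(event)
        return 200  # status code to send back to the webhook sender

    def drain(
        self,
        process: Callable[[Dict], None],
        sleep: Callable[[float], None] = time.sleep,
    ) -> int:
        """Process queued events in arrival order, pausing between API calls."""
        handled = 0
        while self._pending:
            event = self._pending.popleft()
            process(event)  # e.g. fetch the updated record from the API
            handled += 1
            if self._pending:
                sleep(self._min_interval)
        return handled
```

The point of the split is that receive does no slow work, so a burst of webhooks is acknowledged quickly, while drain (run by a background worker) decides how fast the follow-up API calls go out.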

jennyCognio commented 2 years ago


In reply to cbryanreapit's comment of Tue, Oct 12, 2021, on multiple webhook events:

You misunderstand the issue - if we take your example, we're getting multiple contact.modified webhook events each time a record is updated. The associated record webhooks were already expected/managed.

On rate limits:

The limits are being hit due to the above issue. I'd have expected the throttling to have been at your end if the system can't handle the same volume in.


cbryanreapit commented 2 years ago

You misunderstand the issue - if we take your example, we're getting multiple contact.modified webhook events each time a record is updated. The associated record webhooks were already expected/managed

Could you please provide the eventIds of a cluster of webhook events that are seemingly duplicated? With that information, we will be able to investigate those specific events further.

JonCognioDigital commented 2 years ago

Hi,

I thought I'd add some info here, just to clear up any confusion.

We're not sure if anything is duplicated, it's just that sometimes the same property or contact seems to be updated many times in quick succession so we're querying what the trigger for a webhook is. Where we're seeing 3 webhooks called in the space of a few seconds for one property (see log below) it seems that this can't be a negotiator in branch saving the property 3 times that quickly. We're wondering if the webhook triggers each time the user changes a field or moves from one textbox/control to another rather than waiting until they click save? We've had instances where a property is updated 20 times in the space of 20 minutes and it doesn't feel right.

Unfortunately we don't log event_ids on our end at the moment but we do have a log of dates/times/propertyIDs which I can send you separately. Here's an example of a property being updated 8 times in a row this morning. It's possible that the user is going in and out of the property and saving it multiple times but the timestamps are fairly close together.

image

A better example may be this one...

image

You can see that there were 3 webhook calls this morning at 10:54 and 30/34/41 seconds. Maybe the user didn't think it had saved, so hit the save button multiple times within seconds? It would be good to know if there's another explanation though, as this does seem to happen a lot and it's not uncommon for us to see a stream of logs for the same property or contact. I think the biggest we've seen is over 50 updates to a property in an hour.

Many thanks.
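For anyone wanting to check their own logs for this pattern, here is a rough Python sketch that scans a log of (entity ID, received time) pairs and flags entities that received several webhooks within a short window. The function name find_bursts, its parameters and the log shape are assumptions for illustration; they are not taken from the Reapit platform or from the logs shown in the screenshots.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple


def find_bursts(
    log: Iterable[Tuple[str, datetime]],
    window: timedelta,
    min_events: int = 3,
) -> Dict[str, List[datetime]]:
    """Return entities that received at least `min_events` webhooks within `window`.

    `log` is assumed to hold (entity_id, received_at) pairs from your own logging.
    """
    by_entity: Dict[str, List[datetime]] = {}
    for entity_id, received_at in log:
        by_entity.setdefault(entity_id, []).append(received_at)

    bursts: Dict[str, List[datetime]] = {}
    for entity_id, times in by_entity.items():
        times.sort()
        # Compare each timestamp with the one (min_events - 1) positions later.
        for i in range(len(times) - min_events + 1):
            if times[i + min_events - 1] - times[i] <= window:
                bursts[entity_id] = times
                break
    return bursts
```

With a 15-second window, three events for one property at 10:54:30, 10:54:34 and 10:54:41 would be flagged, since the first and third are 11 seconds apart. Logging the event ID alongside each entry would also make it possible to tell a retried delivery from a genuinely new event.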

plittlewood-rpt commented 2 years ago

Hi @jon64digital thanks for this information. Can I just confirm which customer this is for, and the id/type of entity that your example came from?

JonCognioDigital commented 2 years ago

Hi there.

It's when somebody updates a property. It seems that either they're updating the property many times, or the webhook is being called multiple times for the same update. It's possible they're just pressing the save button many times, or that the webhook is called before they actually submit it?

plittlewood-rpt commented 2 years ago

Hi @jon64digital if you can give me the webhook ids from your screenshot (we submit them all with a guid) I can track them down and try and unpick the order of events that resulted in them being fired.

JonCognioDigital commented 2 years ago

Ah, sorry, we don't log those.

All we'd like to know is what the trigger is for calling the property updated webhook. That's probably easier than trying to hunt down what's happened with specific calls.

plittlewood-rpt commented 2 years ago

The properties.modified webhook only gets triggered by someone actually updating the property record itself. Some other topics have events which cascade (i.e. if a contact surname gets updated, that would cascade events for applicants.modified/landlords.modified/vendors.modified where appropriate, as we surface the contact name on those other models), but for properties it's a pretty standard 1-1 relationship, so the behaviour you describe sounds slightly odd.

We do have a retry mechanism in place which will retry transmission up to 6 times with an exponential backoff if we don't get a 200 response from your endpoint fast enough. A few developers have fallen foul of this in the past where they haven't built asynchronous processing, so even though they are processing the message, if it took a certain amount of time the event would end up being retried.

We normally pick up on this on our monitoring platform though, and I haven't seen anything that looks like one of yours, so I'd like to understand what's going on here. I'll look at logs this end to try and tie it up with your example and go from there.
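One way a receiving endpoint could guard against the retry behaviour described above is to acknowledge every delivery straight away and skip any event ID it has already accepted. The Python sketch below assumes the payload carries its identifier in a field called "eventId"; that field name, the class RetryAwareReceiver and the enqueue callback are assumptions for illustration, so check the actual payload shape against the webhook documentation.

```python
from typing import Callable, Dict, Set


class RetryAwareReceiver:
    """Acknowledge webhooks immediately and drop re-deliveries of the same event.

    Hypothetical helper; the "eventId" field name is an assumption.
    """

    def __init__(self, enqueue: Callable[[Dict], None]) -> None:
        self._enqueue = enqueue  # hands the event to background processing
        # In-memory and unbounded here; a real service would use a store
        # with expiry so the set does not grow forever.
        self._seen_ids: Set[str] = set()

    def handle(self, payload: Dict) -> int:
        """Return the HTTP status to send back; never does slow work inline."""
        event_id = payload.get("eventId")
        if event_id is None:
            # Cannot deduplicate without an ID, so accept the event as-is.
            self._enqueue(payload)
            return 200
        if event_id in self._seen_ids:
            # Already accepted: acknowledge again without re-processing.
            return 200
        self._seen_ids.add(event_id)
        self._enqueue(payload)
        return 200
```

Because the slow work happens after the 200 is returned, the sender has no reason to retry, and if a retry does arrive anyway it does not trigger a second round of API calls.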

JonCognioDigital commented 2 years ago

Thanks very much. I have a feeling that the answer is that they are simply going back in and editing the record a lot of times, or saving often because they don't want to lose their progress filling it in. The fact that this happens over the course of 20 minutes rather than 20 seconds would point to that.

plittlewood-rpt commented 2 years ago

I'll let you know if I find anything. Incidentally, if it's specific changes on a record you're interested in, it might be worth logging a feature request for a more granular webhook topic (you'll see there are various topics that deal with very specific changes to a record). This would allow you to cut down on any unnecessary noise. We also have a long-term ticket, #4289, not yet scheduled or scoped out, for allowing developers to set their own change detection schema so you can fully control which changes get transmitted. Feel free to comment on that ticket too, as the more people ask for it, the more likely it is to get attention when we refine our backlog.

github-actions[bot] commented 2 years ago

We need to research or gather more information relating to this request. We have moved this issue into our ‘To review’ column whilst we obtain the information required. For more information on our processes, please click here

plittlewood-rpt commented 2 years ago

Moved back into To Review as we need to spend some time running tests on the webhooks pipeline to try and replicate this behaviour. From there we may be able to get a better idea of what's gone on. PL has some log output which confirms the behaviour that the reporting user has experienced, which may help in diagnosing the problem.

plittlewood-rpt commented 2 years ago

@jennyCognio @jon64digital is it fair to say that this issue is the same as the one described in #6651? If so, I will close this ticket.

plittlewood-rpt commented 2 years ago

Closing this ticket as it's essentially the same as #6651
