Closed jofftiquez closed 4 years ago
Another info that might be helpful. There are 2 of us using the cli on the same stripe test account. Both running on same app url http://localhost:7500
but running on different computers and network.
We found out that somehow the events from machine 1 is being accommodated by machine 2. Weird.
Hey @jofftiquez! This isn't a technically a bug and the behavior isn't specific to the CLI. We do not provide explicit ordering for webhook events so it's possible for an updated
event to arrive before the corresponding created
one (in production too). Check out https://stripe.com/docs/webhooks/best-practices#event-ordering for more information.
We found out that somehow the events from machine 1 is being accommodated by machine 2.
This is how it works in production too so the CLI is mimicking that behavior. We made the CLI mimicking production behavior as much as possible so that there aren't surprises when you go live :)
Hey @tomer-stripe thanks! Got it.
@tomer-stripe How is this not considered a bug? I am seeing the same issue, and it is wreaking havoc with my integration. If I can't rely on the subscription.created > subscription.updated > session.completed events coming in that order, how do I know what the current status is of my subscription? Note that those events also need to be synchronous - subscription.updated shouldn't be sent until Stripe receives a 200 response from the subscription.created webhook (same thing with session.completed waiting for subscription.updated)... otherwise, this could lead to a race condition on my server.
I am relying on the subscription status field, and if the last webhook event that I receive happens to be subscription.created, it will have a status of "incomplete" which will overwrite the status of "active" sent by the subscription.updated event that arrived first.
So the solution I came up with is to completely ignore the subscription data that is posted when my webhook handler is called. Rather than relying on that data, I use the subscription ID to call the subscriptions.retrieve API to get the current subscription data. THAT data is up-to-date, so I don't have to worry about the order of the subscription and session events arriving.
I think the documentation on handling session completion is misleading. Under "webhooks" on this page https://stripe.com/docs/payments/checkout#payment-success it reads:
Stripe sends the checkout.session.completed event for a successful Checkout payment. The webhook payload includes the Checkout Session object, which contains information about the Customer, PaymentIntent, or Subscription,
While this may be technically correct, it implies that the whole process has completed, when in actuality the corresponding events may not have yet been sent. I think that should be clarified to avoid confusion.
Is there a more robust solution than refetching? Receiving multiple webhooks will trigger multiple refetches, and then I'm just hoping the resource has settled by the time I trigger the refetches. If it hasn't, then since the refetches will complete in non-deterministic order, I still have a race condition.
I can find some way to ensure my (distributed) app only does the refetch for one webhook at a time, but engineering my own solution here seems like a lot just to know if a subscription is ready for provisioning.
I think this was a little miss leading as well but each event object contains a created
timestamp. Checking the events it seems like this created
filed is accurate. @tomer-stripe can you confirm this?
I'm guessing the best solution is to store the created
timestamp and only update if the new event happened after the last event. This can be a simple upsert
style update which a check on this column if you're using traditional db like postgres.
calling retrieve is probably the simplest solution though.
I'm guessing the best solution is to store the created timestamp and only update if the new event happened after the last event.
In some of my tests, multiple webhooks for a resource have come through with the same created
value. That seems not unlikely, since the field is relatively low resolution at whole seconds.
For now I've settled on treating webhooks as a signal to retrieve, using Redlock to ensure no single resource has parallel retrieve operations happening.
Running into this as well. First thinking that my webhooks may be too slow, and I need transactions on my database. Then I saw the events coming in the "wrong" order. Unfortunately, the created
field resolution is too low to use it as a workaround. I'm afraid of running into usage limits some day when fetching data from stripe on every event to receive the newest data.
Also running in this issue. Any other workarounds that the community found?
Since I'm using NATS Streaming Server with the node-nats-streaming
NPM package, I was thinking of emitting events to my data streaming system and have any listeners interested in those events reject or acknowledge the event data based on the business logic. That way I would have a fast way of rejecting the events and accepting them based on my logic, instead of relying on Stripe to resend the events or have the events in order. I also deeply believe that this is the mother of all workarounds
Stripe event -> Webhook case handler -> Emit NATS event -> Listeners reject / accept message
Stripe has a /events
endpoint you can poll, instead of relying on push-based webhooks. I'm considering adopting it, but I haven't verified what order it lists events in. I'm hoping that they're in a sorted order since they're delivered as an array without any worries about transmission order.
I don't know how a message bus could help determining the current version of an object. However, I came to the conclusion that Stripe needs to implement a timestamp precise enough to determine which version of an object is the current.
I came up with a two phase solution:
0
, I fetch object from REST API and just update object in database.To ensure no conflicting overwrite, I implemented a global lock on given object id. Luckily, stripe ids are globally unique.
The /events
endpoint does appear to return items in reverse-chronological order (at least for a given resource - I haven't verified overall order, but I would presume it'd show the same behavior). This behavior is not mentioned in the docs.
Here's what I get after passing test data through jq
:
I understand this is by 'design' as per the event-ordering docs I spent a whole day building webhooks that rely on the lifecycle of the subscription, but now I realise the order is not gaurenteed, I don't know what to do.
And polling or calling the API everytime the user logs seems like the wrong thing to do, whats the recommended solution here? Are consumers of the stripe API supposed to implement some form of events bus system and ordering the events by created
time?
I know the stripe CLI has nothing to do with this, but if someone in the team can re-direct us to the right place for this, that would be great!
whats the recommended solution here?
When I asked Stripe support they said to to just disregard each webhook's payload and refetch the resource the webhook is for. Looking around online, this seems to be the popular solution.
Are consumers of the stripe API supposed to implement some form of events bus system and ordering the events by
created
time?
Event timestamps are too low-resolution for recipients to order them by creation time: timestamps have whole-second resolution, and multiple related events can be generated within the same second.
Further, if timestamps were millisecond-denominated, there's a possibility we'd discover Stripe creates events within a single millisecond. There's still some risk of a data race (though you may be comfortable with it).
If you want guaranteed event ordering, you have two choices:
/events
, since Stripe seems like they have an internal oracle for event ordering that they don't expose to the public.@tomer-stripe This is super problematic for your clients. It would be possible to send them in order from your side, by sending requests to webhooks, one at a time, and by waiting for the 200 status code response (or a failure after a number of retries) and only then sending the next event. Instead you move the complexity to your clients.
The fact that it is recommended to ignore webhook request payload by your support and query API is a complete BS. There can still be a race condition, unless client ensures that only 1 request per given customer is processed globally. Did you try to implement a reliable solution using your webhooks yourself? It is a nightmare.
Finally, could you at least provide more fine-grained event samples (instead of seconds) such that your clients can reorder the events on their own?
Really sloppy design on your side, heavily disappointed and we are now strongly considering another payment processor.
The workaround that has worked for me without problems so far is to use a create
op for customer.subscription.created
, and upsert
for customer.subscription.updated
.
I have a unique key constraint on the Stripe subscription ID in my database. If creating a new record in response to customer.subscription.created
fails with a unique key constraint error, I just ignore it. I know the initial record could only have been created in response to a customer.subscription.updated
event.
It's not an ideal solution, but it's the best I could come up with. I'm watching this space for a better solution!
Here's the high-level TypeScript code:
const stripeWebhookEventHandlers: Records<
(e: Stripe.Event) => void | Promise<void>
> = {
'customer.subscription.created': handleSubscriptionEvent(async event => {
// Stripe doesn't guarantee the order in which webhooks are called, which
// means we may receive the `.create` event AFTTER receiving the `.update`
// event once the subscription is activated. We don't want to accidentally
// overwrite the subscription with an outdated event. that's why we'll use
// a `create` operation here, while using `upsert` elsewhere.
try {
await syncSubscriptionViaWebhook(event.data.object, 'create')
} catch (error) {
// if it's a unique key constraint error, it means we've already created
// the subscription via a more up-to-date (but out-of-order) event
if (isPrismaError(error, PrismaError.UniqueKeyConstraint)) {
return
}
throw error
}
}),
'customer.subscription.updated': handleSubscriptionEvent(async event => {
await syncSubscriptionViaWebhook(event.data.object, 'upsert')
}),
'customer.subscription.deleted': handleSubscriptionEvent(async event => {
await syncSubscriptionViaWebhook(event.data.object, 'upsert')
}),
}
I know this is an old issue but let me introduce what kind of "workaround" I've implemented on my end.
Note: Even if this "solution" provides a way of consistency it comes with multiplied cost of performance and resources overhead.
First of all, IMHO the events should be fully sequential. When state of an entity (customer, subscription, invoice) is involved each event of the sequence should be co-dependent on the previous one and the processor (Stripe in this case) must guarantee on their end that you receive them in the proper order. I still have doubts that they went with this asynchronous approach because of huge technical architecture issues. It's unacceptable to point in their documentation that ensuring event ordering is job of the integrator. See
Allowing multiple events on you server to be processed simultaneously even with refreshing the #StripeObject by ID from their SDK's is potentially is error prone and dangerous because you'll end up in huge percentage of cases with Unique Constraint Violation. This would happen because of the low time gap between the event triggering from Stripe's processor side.
Imagine following event sequence:
You have database table for storing the subscriptions. On your server web hook implementation you optimistically try to insert new subscription if does not exist each time an event is received. Due to low time gap in huge percentage of the cases there will be serious race conditions due to uniqueness. Moreover... if your application is designed to respond with errors for such cases, Stripe automatically will retry calling your server which I find as an overkill.
What we want to achieve:
Implementation:
create type stripe_event_status_enum as enum ('pending', 'processing', 'processed', 'failed');
create table stripe_event (
id serial primary key,
event_id varchar(255) not null constraint uq_stripe_event_id unique,
event_name varchar(255) not null,
payload json not null,
status stripe_webhook_event_queue_status_enum not null default 'pending',
created_at timestamp default not null,
executed_at timestamp
);
webhook.php
// verify stripe signature...
try {
$event = Stripe\Webhook\Webhook::constructEvent(
file_get_contents('php://input'),
$_SERVER['STRIPE_SIGNATURE'],
'your-stripe-server-secret-here'
);
$db->insert([
'event_id' => $event->id,
'event_name' => $event->type,
'payload' => $event->toJSON(),
'created_at' => (new DateTime)->setTimestamp($event->created)
]);
// respond with 200 (OK)
} catch (Stripe\Exception\SignatureVerificationException $verificationException) {
// respond with 400 (bad request)
}
worker.php
$stripeClient = new Stripe\StripeClient(/* your server secret key here */);
while (true) {
$retriesCount = 0;
$done = false;
while (!$done) {
$db->beginTransaction();
try {
$stmt = $db->prepare("
SELECT *
FROM stripe_event
WHERE executed_at IS NULL
AND status = 'pending'
ORDER BY id
LIMIT 1
FOR UPDATE
");
$stmt->execute();
$event = $stmt->fetch(\PDO::FETCH_ASSOC);
if (!$event) {
$db->commit();
$done = true;
continue;
}
if ($retriesCount > 2) {
$db->prepare("UPDATE stripe_event SET status = 'failed' WHERE id = :id")
->execute(['id' => $event['id']]);
$db->commit();
$done = true;
continue;
}
$db->prepare("UPDATE stripe_event SET status = 'processing', executed_at = now() WHERE id = :id")
->execute(['id' => $event['id']]);
$stripeEventObject = Stripe\Event::constructFrom(json_decode($event->getPayload(), true));
switch ($stripeEventObject->type) {
case Stripe\Event::INVOICE_CREATED:
case Stripe\Event::INVOICE_UPDATED:
case Stripe\Event::INVOICE_PAID:
case Stripe\Event::INVOICE_PAYMENT_FAILED:
handleInvoiceEvent($stripeEventObject);
break;
// list your events here for which you've subscribed for
default:
throw new \Exception('Unsupported event type.');
}
$db->prepare("UPDATE stripe_event SET status = 'processed' WHERE id = :id")
->execute(['id' => $event['id']]);
$db->commit();
$done = true;
} catch (\Throwable $exception) {
$db->rollback();
$retriesCount++;
}
}
}
function handleInvoiceEvent(Stripe\Event $stripeEvent) use ($stripeClient) {
$invoice = $stripeEvent->data->offsetGet('object');
// always fetch from Stripe API fresh entity
try {
$invoice = $stripeClient->invoices->retrieve($invoice->id);
} catch(Stripe\Exception\ApiErrorException $stripeException) {
// not found
if ($stripeException->getHttpStatus() === 404) {
$db->prepare('DELETE FROM invoice WHERE stripe_invoice_id = :stripe_invoice_id')
->execute(['stripe_invoice_id' => $invoice->id]);
return;
}
throw $stripeException;
}
$invoiceFromDbStmt = $db->prepare('SELECT * FROM invoice WHERE stripe_invoice_id = :stripe_invoice_id');
$invoiceFromDbStmt->execute(['stripe_invoice_id' => $invoice->id]);
$invoiceFromDb = $invoiceFromDbStmt->fetch(\PDO::FETCH_ASSOC);
if (!$invoiceFromDb) {
$db->prepare('INSERT INTO invoice (stripe_invoice_id, status) VALUES (:stripe_invoice_id, :status)')
->execute(['stripe_invoice_id' => $invoice->id, 'status' => $invoice->status]);
} else {
$db->prepare('UPDATE invoice SET status = :status')->execute(['status' => $invoice->status]);
}
}
Hi, also came into this topic while implementing a new subscription model and found no way to determine the order of events, making webhook data quite useless :( and only to be used as a trigger to fetch data from the Stripe side. Is that correct? any new solution around?
:: EDITED :: 2 cents
objective I want to SAVE in my DB all the events in the correct order. I don't want to use a pre-defined order of events ("customer.created goes before customer.updated" etc)
what we know
first conclusion
solution
damm
Since days I'm facing the same problem which gives me headache how to solve it in a proper way. Its really a nightmare, and kind-of sad, because I generally really like the Stripe system (Very reliable SDK, amazing Documentations and examples compared to many other companies/providers, great interface,....). But also I understand, that's kind of impossible to send multiple HTTP requests in a small time-frame and guarantee that they will arrive in the right order. The protocol is not made for it.
Currently I'm trying to implement a different approach by using a state machine for subscriptions where each event executes a defined transition. My hope is, that the event order will not matter since the transition will just fail if the subscription is in a wrong state. Because events are retried on Stripe side after a while, sooner or later the subscription moves into the right state. Not sure if it will work out, can't wrap my mind fully around it to plan it ahead in every detail.
Okay I may be overthinking this, but I wanted to see what some of you guys think about the solution below, still learning the ropes here.
I will be using PaymentIntent Service for my examples.
In my Database I will be saving every event that happens for a payment intent as well as having a up to date paymentintentrecord with the latest event id as foreign key.
When a Webhook is triggered in my API, I will fetch all events logged for the specific payment intent from my database, check to see if the incoming event from stripe has already been logged (In case of duplicates).
After checking for duplicate, I evaluate whether the created time on the incoming event, is newer than the latest event logged on the payment intent record, if so log the event, and update the PaymentIntentRecord to reflect that the new incoming event is the latest.
If the incoming event and the latest event have the same creation time, I then call the stripe /events API to get a list of all payment intent events with the same creation time, filter to show only the ones for that specific PaymentIntent then compare both the incoming event and the latest event attached to my databases PaymentIntent record with the index order of the results from the api call.
This Resolves same timed creation dates, and if the new incoming event or the one already attached to my PaymentIntentRecord is the latest.
Because of the /Events api request limit, and just traffic from the stripe api in general I didnt want to poll the /Events api every time, and instead only call it if a event with the same creation time matched the latest event logged. This should help reduce traffic. I dont have real data yet, but what im reading, 90% of the time the creation times shouldnt be the same?
I am still trying to adhere to the sending an OK Response instantly to Stripe, IS this really needed? Shouldnt I be able to do all of this logic and then send an OK response to stripe, just in case something on my END fails? I saw somewhere that the Stripe API would wait like 5 Minutes for a response before retrying, or a bad request. Shouldnt this be enough time to do any logic you need?
Any comments on this approach would be much appreciated.
What a mess for something that should be trivial, I'm 🤯
The perils of eventual consistency.
The more information we have the easier it is for us to help. Feel free to remove any sections that might not apply
Issue
Sometimes webhooks are being sent incorrectly. Specific events:
customer.subscription.created
customer.subscription.updated
This causes our system to record the subscription status incorrectly. In the stripe dashboard, the actual status of the subscription is already
active
, however our system recordedincomplete
because we received thecreated
event later that it'supdated
counterpart.I wonder if this will also happen in production?
Expected Behavior
Expected behavior would be:
customer.subscription.created
firstcustomer.subscription.updated
Steps to reproduce
Note network throttling might help. I have a slow internet.
Traceback
Correct (created then updated)
Wrong (updated then created)
Environment
ArchLinux