Closed matt-adigital closed 5 months ago
@matt-adigital Could you please let us know if you are using the paymentFormHtml()
for the payment form?
I can't seem to reproduce this issue.
Also are you using any special mutex driver like redis?
Hi @lukeholder thanks for responding.
We upgraded our stripe api version within the stripe account and yes we were using the following code for the payment gateway:
{% namespace cart.gateway.handle|commercePaymentFormNamespace %}
{{ cart.gateway.getPaymentFormHtml(params)|raw }}
{% endnamespace %}
This site is hosted on servd so it is using redis but we've not defined redis as a mutex driver within the app.php config file.
@matt-adigital
It turns out that default Redis doesn’t make a reliable globally distributed locking mechanism like we thought for ages without jumping through some hoops. https://redis.io/docs/manual/patterns/distributed-locks/
So much so that in Craft 4.6.0, we swapped the default mutex in Craft back to a new and improved database mutex.
It can manifest itself with race conditions and duplicated data like what you’re seeing.
Servd is currently using Redis, but we’ve been talking to Matt (the owner), and he’s got a plan for a fix he’s working on. You’ll need to reach out to him for timelines and specifics, but it looks like you’re already on 4.6.0, so at least you’ve got that part covered already.
Hi @angrybrad thank you for that explanation, all makes sense! We run monthly updates on this site and try to keep everything up to date where possible. We will hold off on the stripe upgrade for now then and start a dialogue with Matt at Servd for an indication of timeline etc. Cheers!
Hi @angrybrad ,
Servd applied the fix yesterday regarding the database mutex. We've tested the stripe upgrade on our staging site (Servd) and unfortunately are still experiencing the duplicate adjustments.
I spoke to Joe at Servd this morning and he confirmed it appears to be using the database mutex. He suggested the issue may not be down to the mutex.
@damonadigital an you reliably reproduce it with steps you can share?
@lukeholder yes
-- OR --
I am no longer seeing the php error when completing a payment after following the second set of steps.
@lukeholder I'm also experiencing this issue quite severely.
It was happening in Craft + Commerce + Stripe 3 just occasionally so @angrybrad suggested removing Redis and upgrading to C4 to get the improved database mutex. Now it's happening to roughly one in 4 orders.
I can reproduce it like @damonadigital can but not every time.
Here's an example I just made, with the same cart, front and back end, both pages refreshed before taking the screenshot. Once the cart is updated, the additional adjustments will be removed. In this case I just have one duplication but usually there are two.
HI @pixelmachine & @damonadigital
I have just tried to reproduce this (using the steps stated) with the latest Craft, Commerce and Stripe using the example templates and no matter how much I refresh the payment page I am never seeing duplicated adjustments.
I go back and forth to the payment page, change what is in my cart etc and still only see the expected number of adjustments.
Would either of you be willing to create a fresh install with one product in and set up shipping/taxes in the same way you have it in your project. Then if you can replicate with the example templates, send us the database we might be able to figure out what is happening.
We are obviously keen to get to the bottom of this issue so we can get your projects running smoothly without any problems.
Thank you for all the information, time and feedback you have already put into this.
Hey @nfourtythree seems like the issue is somehow related to the hosting environment, did you try it on Servd?
My live setup is using Linode instances provisioned by Ploi, with a load balancer. Bit of a hassle to setup so testing on Servd seems simpler.
The issue does not occur in my local dev and test environments (DDEV and Fortrabbit).
If you can't replicate it in a similar environment then I'll see about getting you an install. Also I've already discussed this a bit more with Brad via support if you want to check that.
Hi All
I have managed to create a test case where I can reliably reproduce this issue. As expected, it is a case of a race condition, we will look to get a fix for this in place as soon as possible.
This is just a note to keep everyone updated with the situation.
Thanks!
Hi @pixelmachine, @damonadigital & @matt-adigital
We have pushed a branch with fixes in for this issue and there is now a PR for it.
Before we merge the branch it would be great if you could give it a test on your local/staging environments to see if this fixes things for your project.
To try this fix, change your craftcms/commerce
requirement in your project's composer.json
to:
"require": {
"craftcms/commerce": "dev-bugfix/race-conditions-duplicate-adjustments#1b2df66d0cc988b25d08211e2db1bad33a0a46ce as 4.5.2",
"...": "..."
}
Then run composer update
.
And then let us know how you get on.
Thanks!
Thanks @nfourtythree testing now.
Hi @nfourtythree ,
Thanks for the fix. Unfortunately it's not worked for us, we are still getting duplicate adjustments.
Hi @damonadigital
Thank you for your response, sorry to hear this isn't working.
This feels like an environmental mutex related issue. In the branch I linked to we are now locking the saving of the order on requests to the payment controller pay action. This means (if the mutex lock is working correctly) that if a race condition of requests to that action that the saving (and therefore recalculation) of the order can only happen one at time.
Are you able to reproduce the issue with the following:
Thanks
Hey @nfourtythree I can also confirm that it hasn't fixed the issue for me.
I can only answer your first question at the moment: No I can't replicate the problem in my local or staging environment and never have been able to.
I'll try to test the others.
Hi @nfourtythree so I've done a lot of testing over the last couple of days and I think I've found the culprit at least for my site – Cloudflare.
I'm using Blitz for the non-shop parts of my site and the instructions for Blitz suggest using a page rule in Cloudflare set to 'cache everything'. With a clone of my site on the exact same system I was only able to reproduce the issue if the page rule was enabled.
Other relevant things
I also setup a version of the Spoke & Chain site, again on the same system, but wasn't able to reproduce the problem. On my original site the checkout is a single page and each step is updated using a Sprig component, which in turn may refresh the Stripe payment form.
I'm not sure why this should make any difference and if I get the chance I'll modify the Spoke & Chain site to see if using Sprig to update the cart changes anything. Looking at the requests I can't see why it should but that's really the only difference I can see.
One thing that is quite interesting is that I can see the error happening in real time in the Control Panel, and it occurs almost every single time but then often self-corrects:
Hope that's all clear!
It's worth mentioning that our site also uses Cloudflare and Blitz. Although we have only applied the "Cache everything" page rule to 'domain.co.uk/assets/*'
@damonadigital yeah so I think it's probably still related to Cloudflare but I spoke too soon with the page rules. I could no longer replicate it yesterday but it's still happened at least once today. I may set Cloudflare to full bypass and see if that helps.
Hi All
Thank you for your extra information.
In an attempt to keep things progression we have pushed up another fix which could solve the issue. Although it could solve it I don't believe it is the root cause.
If you have previously changed the composer requirement to point to the branch then you should be able to do a composer update
.
With all the extra information provided it seems likely that the the issue here is two requests being made at the same time due to some kind of combination of Sprig, Cloudflare, Blitz etc.
But with this, it is still a little confusing as to why the mutex lock is not preventing the behaviour you are seeing from happening.
I understand everyone is super busy but it would be great if you do get a chance to try out a test controller I have put together.
<?php
namespace modules\controllers;
use craft\web\Controller;
use yii\web\Response;
class MutexController extends Controller
{
protected array|bool|int $allowAnonymous = true;
public function actionTest(): Response
{
$response = new Response();
$content = 'Start: ' . floor(microtime(true)) . PHP_EOL;
if (!\Craft::$app->getMutex()->acquire('test', 10)) {
$content .= 'Failed: ' . floor(microtime(true)) . PHP_EOL;
} else {
$content .= 'Success: ' . floor(microtime(true)) . PHP_EOL;
sleep(2);
}
$response->content = $content;
\Craft::$app->getMutex()->release('test');
return $response;
}
}
With this you can then run the following command line command:
curl -s https://example.domain/actions/my-module/mutex/test & curl -s https://example.domain/actions/my-module/mutex/test; wait
You will need to replace the example.domain
and my-module
(depending on what you bootstrapped your custom module)
This test will output some micro times to show when the request started and succeeded. If everything is working correctly with the mutex we should see that the start times are the same (within tolerance) and the success times are 2 seconds apart. Here is an example output from my tests:
Start: 1711551857
Success: 1711551857
Start: 1711551857
Success: 1711551859
If there is no difference in success times this would suggest the mutex is not working correctly.
Please let me know how you get on it.
Hi @nfourtythree ,
Just a quick update to let you know we've tried the latest fix and still seeing the issue. I will try the test controller when I have a chance and let you know.
Thanks
Interestingly we are no longer seeing the issue with the fix in place, or at least I haven't been able to reproduce it and no live orders have encountered it.
I've just done the mutex controller test and these are the results:
Start: 1711637547 Success: 1711637547 Start: 1711637547 Success: 1711637549
Hi @damonadigital
Thank you for getting back to us with that. If the mutex is working and you have pulled the latest changes I am really stumped as to how this issue is still apparent.
Sorry to ask some simple questions:
Thanks
Hi @nfourtythree ,
Thanks
Hey @nfourtythree
Staging:
Start: 1712100092
Success: 1712100092
Start: 1712100095
Success: 1712100095
Live:
Start: 1712100349
Success: 1712100349
Start: 1712100349
Success: 1712100351
Hi
Just wanted to check in with how things are going?
@pixelmachine everything still working as expected?
@damonadigital still seeing issues?
Thanks
Hi @nfourtythree ,
Yes unfortunately still seeing issues. I did another test after confirming those lines were in the vendor folder.
@pixelmachine Did you end up setting Cloudflare to full bypass in order to fix the issue?
Thanks
@nfourtythree as far as I know we haven't had a single recurrence since your not-a-fix fix last week. I have my composer.json requiring "craftcms/commerce": "dev-bugfix/race-conditions-duplicate-adjustments as 4.5.2"
.
@damonadigital no we didn't, Nathaniel's fix last week essentially solved the problem as soon as it was applied.
We have Cloudflare proxy enabled and a pages rules set like:
*.domain.co.uk/* Cache Level: Cache Everything
*.domain.co.uk/shop/* Cache Level: Bypass
For Blitz we have the same sort of thing:
'includedUriPatterns' => [
[
'siteId' => 1,
'uriPattern' => '.*',
],
],
'excludedUriPatterns' => [
[
'siteId' => 1,
'uriPattern' => '^shop',
],
],
I haven't changed any Sprig components in the checkout.
@nfourtythree will the changes you made work their way into the main Commerce branch or should I expect to stick with the bugfix one for now?
Hi @pixelmachine
These changes were merged into Commerce and released as part of version 4.5.3
. Apologies if we hadn't made that clear, but you are now safe to update to that version and things continue to work as they have been.
@damonadigital did any of the changes that @pixelmachine mentioned help with your project?
Thanks!
Hi @nfourtythree ,
We still need to try applying similar settings to Cloudflare as pixelmachine has done. Once we've tried this I will update you.
Thanks
Hi @nfourtythree ,
The settings @pixelmachine specified in Cloudflare have worked and we are no longer encountering this issue.
Thanks both for your help and assistance.
Great news and thank you for updating us!
Description
Line item adjustments are all being duplicated, this only happens in the final step when the payment part is loaded, this was never a problem until upgrading from v3 to v4, payment is taken at the correct amount for the original value but the order is then recorded as a partial payment with these extra adjustments remaining in the order details when completed. If you load any other page on the front end then the duplicates are removed, so we added some js to do this for us on the payment page, this then meant we hit a php error and the order was not completed. Payment taken successfully but completion of order failed, screenshot attached. We've had to revert back to v3 due to this issue as we can't have all orders showing as partial payments.
Steps to reproduce
\\\\-- OR --
Additional info