openfoodfoundation / openfoodnetwork

Connect suppliers, distributors and consumers to trade local produce.
https://www.openfoodnetwork.org
GNU Affero General Public License v3.0
1.11k stars 718 forks source link

Products Sold Exceed Max Stock Level #5300

Closed lbwright22 closed 3 years ago

lbwright22 commented 4 years ago

Description

Large UK Hub reported that for their OC 20/04/20 : 1) at least two producers found orders for products in their order cycle summaries which exceeded the max. stock level they placed. These producers were Hester's Harvest and Hoe Made Acre 2) customer's received invoices containing products listed as purchased with zero quantity: R781345776 Asparagus Green (250g) Hester's Harvest R877041463 Sausages- usually frozen- Pork Honey Roast and Mustard 500g Trehill Farm 3) producer found products were left with negative stock levels after the order cycle (Asparagus for Hester's Harvest had stock of -3: initial stock level was 50, OC supplier report says 53 were sold)

There may be other examples not yet found.

Expected Behavior

Product with finite stock level should not be available for sale once maximum capacity has been reached.

Actual Behaviour

Finite stock level ('in stock') exceeded during an order cycle.

Steps to Reproduce

1. 2. 3. 4.

Animated Gif/Screenshot

HHarvest R877 R781

Workaround

Severity

bug S2

Your Environment

Possible Fix

lbwright22 commented 4 years ago

Tamar negative stock

luisramos0 commented 4 years ago

I will not have time to look into this any further until Monday. Maybe another dev will?

I just looked at the stock movement in the DB, it looks like there was a rush to the asparguss, yes. But movements were a few seconds apart... maybe enough to make our concurrency solution to break...

image

The stock is at -3 now, looks like R736665060 got the last 3 items at 2020-04-20 12:10:35.713315 and for some reason R520367424 managed to get another (out of stock by then) at 2020-04-20 12:10:53.362846

Very similar situation for Mint where the same R520367424 got items that were made out of stock 20 seconds before...

emilyjeanrogers commented 4 years ago

We received an email from Autarky Farms on April 21st about an instance of overselling that they had experienced. Form the customer "I uploaded only 10 tomato bags but managed to sell 20 which seemed to maybe happen when it was in someone's cart but appeared to be available?"

Still waiting for specific order details but the Order cycle was coordinated by North Arm Farms, OC = "Orders for 22/4/20"and the product was wither Tomatoes Large or Tomatoes Tomatoes 400g

luisramos0 commented 4 years ago

ok, thanks for the info :+1:

In the orders of that OC now they dont have any "Tomatoes - 400g". They have 13 "Tomatoes - Large" in completed orders in that OC and 3 of them are in 3 orders in the cart state. So, only 13 were sold in this OC.

The checkouts were not very close, only the first two ones with a difference of 30 secs, the last checkouts of Tomatoes were several hours apart.

They sold a few more items in OC "Orders for 6/5/20" and now the stock is zero.

We dont have a way to see when did the stock reach zero and what orders should not have been placed by OFN... Maybe we need more logging here so we can see what was the stock available for the product at the time of checkout.

mbudm commented 4 years ago

Maybe we need more logging here so we can see what was the stock available for the product at the time of checkout.

I can highly recommend structured logging to solve these sorts of problems (providing context to the logged event). I've used honeycomb.io a lot for this.

luisramos0 commented 4 years ago

Interesting, honeycomb is an alternative to ELK, right? We only have bugsnag (it's serving us incredibly well). Bugsnag supports metada (end of this page) but it's not the same as having a full distributed logging infra like ELK. I think @Matt-Yorkley is familiar with ELK. I wonder if there's a case for us to switch our logs to one of these systems in OFN. We have this nice 3 years old post about the topic in discourse.

Matt-Yorkley commented 4 years ago

I think we can use ELK when we get to Rails 4...

RachL commented 4 years ago

@luisramos0 do you know what we can do with this issue? Are we closing it or are there bits we can keep as an s3?

luisramos0 commented 4 years ago

Hello @RachL this issue is on my todo list. I want to run some quick tests and check #5784 I'll get back here soon and share my findings. I think this will be an S3 to convert the snail to an error message. I believe the user is now seeing a snail when buying at the same item as another user at the same time.

luisramos0 commented 4 years ago

We cant really replicate this problem here....

5406 is a step forward for sure BUT

actually, in #5406 I found that the auto test we have for concurrent checkout is not really working :face_with_head_bandage: https://github.com/openfoodfoundation/openfoodnetwork/pull/5406#issuecomment-648883511

and @sauloperez comment is a very solid theory that says that the solution in #5406 will not be successful in stopping this issue here #5300 from happening :-( https://github.com/openfoodfoundation/openfoodnetwork/pull/5406#issuecomment-659480718 But maybe, I say maybe, #5406 did make this #5300 less likely to occur.... Based on this we could wait for the next occurrence of this issue... Or we can continue to work on this, following @sauloperez's suggestion above (change the form to include lock_version).

I am not sure. @sauloperez what do you think? Do you know how to do the lock_version trick on the web form? If it's an easy thing I'd go for it now.

sauloperez commented 4 years ago

sure, it's just about adding a hidden form field so we provide lock_version to the backend. Re the test, I'm not sure it's worth all the effort that would require turning our single-process tests into something that mimics a production scenario.

RachL commented 4 years ago

Then let's do this? :-)

In regular times I would have suggested to not handle it now. The problem is, the next time it will occur it will be again because we have a massive flow of users hitting our shops at the same time. It will produce even more stress to have to handle this under a flow of other bugs. Also we know how to reproduce it now, so I would take care of it now as we know how to do it. Thoughts?

sauloperez commented 4 years ago

Agree. The way I see it is that so far we have only invested resources and until we don't finish that last step we won't get any returns out of it. So far, it's only cost.

Matt-Yorkley commented 4 years ago

The lock_version attribute is on StockItem though, right? Wouldn't we need to add a hidden field for every stock item of every variant/variant_override in the cart?

Matt-Yorkley commented 4 years ago

The hidden field trick seems to be aimed at updating a single record, but here we're updating multiple records...

Matt-Yorkley commented 4 years ago

All we need is a little bit of Optimistic Concurrency Control... how hard could it be? :trollface::trollface:

luisramos0 commented 4 years ago

Do you understand if the object needs to be in the form? Maybe adding this will work even if the form is for some other object? hidden_field(:stock_item, :count_on_hand)

Matt-Yorkley commented 4 years ago

So I think the "vanilla" Optimistic Locking in Rails applies to all the fields of a single record, but here we're dealing with multiple associated objects of associated objects of associated objects, eg: stock_items of variants of line_items. In this case we're only concerned with the spree_stock_items.count_on_hand field.

There's lots of moving parts in the checkout process, it's like an avalanche. I wonder at which point we reduce the stock levels? After payment?

luisramos0 commented 4 years ago

There's lots of moving parts in the checkout process, it's like an avalanche. I wonder at which point we reduce the stock levels? After payment?

I am taking some time off to finish bye bye spree. IT will make a huge difference as we will be able to simplify some of the craziness :+1:

In OrderInventory.verify it's where the stock is unstock/restock, it's called on the line_item before_save callback update_inventory. But this unstocking will only happen (OrderInventory#74) if the order is complete. So, the unstock happens the first time the line is saved after the order is marked as complete. I think that will be on the order.finalize! (which is a checkout workflow callback after complete). (this complexity is nuts!)

Matt-Yorkley commented 4 years ago

Thanks for the heads-up on OrderInventory.verify :smile: I see a few other bits in Order.finaize! and Shipment.finalize! as well :+1:

Matt-Yorkley commented 4 years ago

Yeah so I think traditional optimistic locking isn't going to help us here in the murky depths of the checkout process. For example, if we've already taken payment and completed the order, and then we do some lock-clash-rescuing on stock levels, we can't really "abort" the sale nicely at that point, right? Given an example scenario where two users order the last carrot at the same time. The idea with optimistic locking (on a single record update) is that you can rescue from it (before making lots of database updates) and give the user nice feedback, or retry. The order and lots of associated records are already past that point by the time we get to reducing the stock levels...

Matt-Yorkley commented 4 years ago

But maybe we can do something at the start of the checkout process that will achieve a similar result...

Matt-Yorkley commented 4 years ago

I think we do a quick check of stock levels at the point the checkout page is first loaded, but here we're looking at the point that the checkout form is submitted. There's potentially quite a large time window there between page load and form submission, and if two users are checking out at roughly the same time, they will both get processed.

So I think we need a quick stock-check at the point the checkout is submitted but before the order is processed, and it needs to be able to handle concurrency...

luisramos0 commented 4 years ago

"the last carrot" :rofl:

yeah, we can make a stock check at the beginning of checkout_controller#update. Or we can go for a simple order.cancel if we detect a clash when moving the stock? Maybe easier?

How do we detect clashes?

Matt-Yorkley commented 4 years ago

I don't like the idea of taking payment, completing the order, then immediately cancelling it...

How do we detect clashes?

That's the fun bit :wink:

I'll have a play around with it. Maybe it's not as complicated as it sounds, and we just need a smarter stock-check...

luisramos0 commented 4 years ago

With the last PR I thought it would blow up and we could just recover from that. According to @sauloperez the PR didnt help, are we saying the stock item will go to negative without blowing up? I dont think that's possible with the validation we have now... what do you think will happen?

I am now thinking, if this is not an easy fix, we could just leave as is and we may not have a problem now. I still believe it will blow up for the second customer and the customer will see the generic checkout error message...

Matt-Yorkley commented 4 years ago

Ah, I see! I guess it blows up then... but after the payment has been taken? I'll give it a try in two browsers and see what happens. I think a stock check when the checkout is submitted would help.

Matt-Yorkley commented 4 years ago

Well I've just tried replicating this in master 5 times, and it actually looks pretty good. Each time one completed and the other was bounced to the cart page with a nice message and the "out of stock" feedback.

May be we can close this for now and reopen if it reoccurs?

Screenshot from 2020-08-07 18-14-45

luisramos0 commented 4 years ago

yeah, I'd close for now. We can come back to it if it does happen again. In your tests, do you know if the stock_item.count_on_hand validation was triggered? Because I think there is a stock validation somewhere in the checkout process and that will be a different code path when it fails.

sauloperez commented 4 years ago

so then, we added the lock_version in https://github.com/openfoodfoundation/openfoodnetwork/pull/5406 but is not having any effect :sweat_smile: I'd say we should remove it so we don't cause the false impression to future devs that concurrency is handled that way. Thoughts?

luisramos0 commented 4 years ago

Why do you say it's not having any effect? I'd say we need to build on top of it, not remove.

sauloperez commented 4 years ago

Why do you say it's not having any effect?

see https://github.com/openfoodfoundation/openfoodnetwork/pull/5406#issuecomment-659480718.

From what I understood with this discussion is that the existing mechanism is good enough. We have Rails' mechanism half-implemented and we don't plan on continuing it, at least for now :shrug:

luisramos0 commented 4 years ago

I am not sure I follow you here: in one comment you say "is not having any effect" and afterwards you say "the existing mechanism is good enough". I wonder if you are referring to the same lock_version mechanism on stock_items.

I think the current state is "we dont know if it is working or not" and we agree we will leave it like this until we see trouble again :-)

chezaorchard commented 4 years ago

@luisramos0 & @Matt-Yorkley we have a customer [order number R533682527] who has received an order confirmation email listing a product with quantity 0 and value $0 exactly as above bug - see screen shot.

Xnip2020-10-20_13-22-14

The producer [2286] has confirmed they do not have a negative value in their stock numbers for the product in question.

I'm not sure if this should reopen, or if I should log a new bug - please advise :)

lbwright22 commented 3 years ago

Orders placed on 18th November show evidence of this competitive stock level issue again.

Hub: Aberystwyth Food Hub Order Cycle: 21st November, Aberystwyth

There are several orders effected in the same manner as I found in April. ie customer invoice states products with zero quantity and zeros are also present in reports.

example is order R735665606 invoicestockzerocustomer

The caramel shortbread, marmalade and sourdough brownie are not in the customer's order though.

The presence of the items with zero quantity in reports leads to user confusement (and they contact the support team)

lin-d-hop commented 3 years ago

It seems this is another case of items not being held during checkout. This is increasing in frequency due to the additional load with lockdown and Christmas warm up.

We're talking about some significant work on this general area of issues in Product Curation on 30/11.

For now I'll close this again and we can continue to comment as and when it pops up again.