openfoodfoundation / openfoodnetwork

Connect suppliers, distributors and consumers to trade local produce.
https://www.openfoodnetwork.org
GNU Affero General Public License v3.0
1.1k stars 714 forks source link

Subscription Orders not created for Schedule #4396

Closed lin-d-hop closed 4 years ago

lin-d-hop commented 4 years ago

Description

A user has reported that one customer order was not created from a schedule. Upon closer inspection I can't find evidence that any of the Orders for this Schedule and Order Cycle were created. Other subscription orders in the same order cycle but for a different schedule were created successfully.

User: Banc Organics Order Cycle: w29 (closed October 15th) Schedule: Fortnightly No 1 Reported missing order: (ping me on slack for the customer name and email).

Orders from schedule 'Weekly' (on the same order cycle) appear to have gone through correctly. However I can't find any orders for schedule 'Fortnightly No 1'.

More recently schedule 'Fortnightly No 2' appears to have generated the correct orders alongside the 'Weekly' schedule.

Expected Behavior

Subscription orders should be generated for all schedules.

Actual Behaviour

Subscription orders were not generated for a schedule.

Steps to Reproduce

  1. Unsure. I haven't tried to fully replicate.

Animated Gif/Screenshot

Workaround

Severity

bug-s2: a non-critical feature is broken, no workaround

Your Environment

Possible Fix

I suspected it might be insufficient stock, but found that this appeared to happen for all orders in the Schedule not just the reported one. I suspected that this might be user error, adding the schedule after the OC opened. However I have no way of evidencing this. There are other customers that have orders in both Fortnightly No 1 and Fortnightly No 2 that received No 2 but not No 1.

luisramos0 commented 4 years ago

I found that #4322 is happening to this enterprise, user email starting with long... on the weekly schedule. I think this is not related to this issue.

EDITED (to make it simpler):

When did user add schedule to OC? (TLDR: we dont know)

w29 opened on 10Oct. The updated_at timestamp is 9thOct, it means the OC was not edited after it opened BUT (I tested locallly) adding a schedule to the OC does not change the timestamp, so the user might have added the schedule after the OC was opened or even after the OC has closed. I dont think we have a way to know this in the DB or logs.. We can add a timestamp to the table order_cycle_schedules so that this can be retrieved in future cases.

Proxy Orders were not created for this OC/schedule

in the DB there are no proxy orders for this OC/schedule (w29, "Fortnightly no 1") combination: select * from subscriptions s left join proxy_orders po on (po.subscription_id = s.id) where s.schedule_id = 17 and po.order_cycle_id = 203650 order by po.updated_at desc;

By reading the code a little better, I see that even if the schedule is added to the OC after the OC is opened, the (proxy) orders should still be generated. So I tested this locally, and this is true:

User error if...

The only possibility of being user error is if the schedule was added to the OC w29 after it was closed, i.e. 15Oct.

The code: ProxyOrderSyncer

In terms of code, the creation of proxy orders (what has not happened in this issue) happens in ProxyOrderSyncer.sync that is called when the oc is updated, the schedules are changed and also when the subscription is updated...

Possible Workaround --- What do we do?

One workaround that can be tried for this situation next time is to remove and add the schedule to the OC again. Right now with the OC already closed, I am not sure what else we can do.

Executive Summary

So, this is either user error (schedule added after OC was closed) or a bug somewhere in ProxyOrderSyncer.

Dev Time so far: 2hours :-(

luisramos0 commented 4 years ago

There's one thing I will try now: copy UK live DB to my laptop, run the app and try to reopen the OC and verify if the orders are being created... will update here with result.

luisramos0 commented 4 years ago

Investigating locally

On my laptop with uk live data, if I change the close date of the w29 OC and open it, the proxy orders are correctly generated....

Also, after having w29 open, adding and removing the fornightly schedule from the OC will add and remove the proxy order from the subscription. Which is correct 👍

I think I found a bug but it's not the cause of the issue for sure

I have found a problem when adding a schedule to the OC in the OC edit page BUT this cannot be the problem live because this method of adding the schedule will update the "updated_at" tag of the OC, and the updated_at timestamp of w29 in live is before the OC was opened 9Oct. So, this cannot be the problem: in the live data, if the user added the schedule to the OC after the OC was open, it was using the OC list page by clicking on the schedule name on another OC.

Anyway, the problem I found is in OrderCycleForm#schedule_sync_required?, when a schedule is added to an open Order Cycle, the subscriptions of that OC must be updated. The code for this is in place but the logic looks broken. Locally, when I add a schedule to an open OC, the proxy orders are not generated because schedule_sync_required? logic reports no new schedule id was added, but it was.

What to do about this issue?

I am not sure... I think we need to find another occurrence in the live environment...

lin-d-hop commented 4 years ago

Thanks @luisramos0 Is there anything we can do to improve Subs logging? Such that we might have more information next time.

luisramos0 commented 4 years ago

I think there's a design flaw in the subs module: absence of logging. I mean, this subs module is full of automagic stuff like generating proxy orders and then real orders, and then completing those orders and sending summaries to people...

we will be in this situation again and again as the code is used, things will happen and users will want to know what has happened? why this or why that. Sometimes the user will fail, miss the point, sometimes we will fail, bugs.

To support such a process we need logging that lets us monitor what automatic processes were executed, and what/why did they do what they did. I think we need to add some logging capability (in some form) to this subs module.

I raised the logs question 1 and a half years ago, I think we need to revisit it... imo the subscriptions module needs logs...

this needs input from other devs @kristinalim @sauloperez @mkllnk @Matt-Yorkley the questions are:

kristinalim commented 4 years ago

Hi @luisramos0! Great investigation.

What do you think?

sauloperez commented 4 years ago

I think what we're basically missing and need are the logs Rails provides in the app but in the delayed job worker. AFAIK all the stuff we are missing happens in a background job. That will provide us with the data we need.

I think a solution would be to configure the Rails logger to log into log/delayed_job.log as Delayed::Worker.logger does. There we would see it all.

As for the solutions mentioned so far, I think they might not be needed for now as long as we have the log. Specifically, paper_trail, if we can avoid the load/space it may add to the DB for now :+1: . It's a gut a feeling more than proven science though.

mkllnk commented 4 years ago

I think logging is a good idea. I'm not sure that paper_trail is the right solution. It focuses on models and how they change but that doesn't tell us what code ran and why it changed something. We would need to monitor order cycles, subscriptions but also orders and payments.

Instead, it could be more useful to have logs like this:

[SubscriptionPlacementJob#perform] proxy_order_ids: [1, 2, 3, ...]
  initialise order 1
    completed: true
    changes: {}
    placement email sent
  initialise order 2
    ...
luisramos0 commented 4 years ago

ok, we have 3 completely different suggestions :-D Kristina - log with paper_trail Pau - use existing active record logs Maikel - use custom logging

I agree with Maikel in not using paper_trail. I understand it's very convenient but I think we need something more specific to the process, not the data. So, if no one disagrees I'd say we can do what Pau suggested, enable existing logs on delayedjob, and do what Maikel suggests: add a custom logging capability to the app and issue some specific log messages from the code.

lin-d-hop commented 4 years ago

Not enough information. Replacing with #4462 and closing.