openfoodfoundation / openfoodnetwork

Connect suppliers, distributors and consumers to trade local produce.
https://www.openfoodnetwork.org
GNU Affero General Public License v3.0
1.12k stars 724 forks source link

Micromarché Hubs on coopcircuits.fr opens very slowly #12870

Open nicolasVr opened 2 months ago

nicolasVr commented 2 months ago

Please use #12870 Index Order cycles in Clockify when working on this issue

Description

Micromarché manages several Hubs (for instance https://coopcircuits.fr/micromarche-la-resto-du-grand-t/shop#/shop_panel ). Those hubs are linked to many producers and have a big order background. Opening the shop page takes 30 to 40 seconds and, logically, buyers complain that is long and some give up.

Expected Behavior

Shop page should open in a few seconds.

Actual Behaviour

Opening a page takes 30 to 40 seconds

Steps to Reproduce

  1. Open https://coopcircuits.fr/lecoupdarrosoir/shop
  2. Check how long it takes to open page

Workaround

No kwown workaround

Severity

s2 ?

RachL commented 2 months ago

@openfoodfoundation/developers without fixing it first, is there a way to easily find the process that takes so much time? If it's relationships FR support could consider to divide their shops between several email accounts.

nicolasVr commented 2 months ago

I made various tests to check what makes the shop slow :

nicolasVr commented 2 months ago

I have also added the same type of order cycle on the new shop with no effect (normal speed). Last tests were made on staging

mkllnk commented 2 months ago

This issue was first reported on Slack on 2024-09-11.

There was a series of optimsations for this page before this performance issue was reported. One optimsation just the day before.

mkllnk commented 2 months ago

Rendering of the shops partial seems to take a long time but sometimes it's really fast:

[2024-08-25]: Rendered shops/index.html.haml (Duration: 83606.7ms | Allocations: 2679115)
[2024-08-25]: Rendered shops/index.html.haml (Duration: 119988.4ms | Allocations: 7202747)
[2024-08-25]: Rendered shops/index.html.haml (Duration: 31.2ms | Allocations: 5526)

But trawling through the logs like this it's not easy to see actual trends. So let's ask New Relic. We only have data for au-prod there but maybe we can see trends: shops

The biggest change is actually on July 9, when a big product refactor was deployed:

This performance problem may be different to the described issue here though. There seems to be a particular bad circumstance with French data.

mkllnk commented 2 months ago

I copied the French database to my local server and tested the shops page. It loads in about 10 seconds. Reverting #12827 doesn't help, then it takes 12 seconds.

Going back to the code before all those optimisations, it takes 11 seconds to load. So no significant difference here.

nicolasVr commented 2 months ago

FYI the load time is specific to Micromarche shop. I made many tests and I think it is linked to the orders background. But not only the number of orders because some shop have more orders and are much faster. Thanks for caring anyway

mkllnk commented 2 months ago

Oh, okay, I looked at the wrong thing then. I looked at shops page because that's what you first reported on Slack. Now I see that this is for one specific shop.

mkllnk commented 2 months ago

So the individual shop page loads in 13 seconds for me but the shop is closed at the moment.

nicolasVr commented 2 months ago

Sorry for the misleading slack posts. I coud see no loading difference between order cycle open or closed. During "french openiing hours" on prod loading time is more 30-40 seconds. On staging I have created a hub "micromarche test" that is a copy of micromrché - Agronaute without of course the background orders. Loading time is about 5 seconds.

mkllnk commented 2 months ago

On that page, we have a similar trend though: shop

Since the product refactor, the execution time multiplied. The thing that increased is:

A bit more specific:

app/serializers/api/cached_enterprise_serializer.rb:84:in `distributed_properties'
SELECT DISTINCT spree_properties.*
 FROM "spree_properties" INNER JOIN "producer_properties" ON "producer_properties"."property_id" = "spree_properties"."id" INNER JOIN "enterprises" ON "enterprises"."id" = "producer_properties"."producer_id" INNER JOIN "spree_variants" ON "spree_variants"."deleted_at" IS ? AND "spree_variants"."supplier_id" = "enterprises"."id" INNER JOIN "spree_products" ON "spree_products"."id" = "spree_variants"."product_id" INNER JOIN "spree_variants" "variants_spree_products" ON "variants_spree_products"."deleted_at" IS ? AND "variants_spree_products"."product_id" = "spree_products"."id" INNER JOIN "exchange_variants" ON "exchange_variants"."variant_id" = "variants_spree_products"."id" INNER JOIN "exchanges" ON "exchanges"."id" = "exchange_variants"."exchange_id" INNER JOIN "order_cycles" ON "order_cycles"."id" = "exchanges"."order_cycle_id"
 WHERE "exchanges"."incoming" = $? AND "exchanges"."receiver_id" = $? AND (order_cycles.orders_open_at <= ? AND order_cycles.orders_close_at >= ?)

The code line referenced by New Relic is ambiguous. Looking at the SQL, it seems to be the distributed_producer_properties method. But that whole file hasn't been changed in the product refactor.

mkllnk commented 2 months ago

Loading that particular shop locally with French data reveals the same thing. The majority of time is spent in the distributed_properties.

I think I have to take a step back here. I was hoping to find a trivial fix like reverting a pull request. But this database query needs some really good focus. Maybe caching will be an answer.

The sad thing is that there are only 18 properties. So it takes all this time just to combine a lot of data in the database to know which of these 18 properties to display so that we can filter in the shopfront and don't show filters for properties that don't have products here.

nicolasVr commented 2 months ago

Does it suggest a workaround we could use in the meantime? We are about to help the hub to build new shops without order history, which should solve the problem for some time but not without harm and inconvenience.

mkllnk commented 2 months ago

Good question. The query doesn't involve orders. But it does involve order cycles. They are then filtered by active order cycles but you could try to clean out old order cycles.

mkllnk commented 2 months ago

Just to explain a bit, to find the product properties to show it looks at all producers with properties, then all the products they have and all the order cycles these products are in. Then it filters the order cycles by the distributor and opening times.

Dev note: order_cycles doesn't have any indexes. At least the distributor id would be helpful to index here.

dacook commented 2 months ago

Dev note: order_cycles doesn't have any indexes.

😱 That doesn't sound good! I just did a search and found 18 tables don't have indexes. Some would have tiny datasets and should be ok, but it seems like we should be indexing order_cycles.

~At least the distributor id would be helpful to index here.~

I'm curious why you crossed this out, I'll assume that thought isn't complete and won't try to pick it up..

dacook commented 2 months ago

It looks like we have regular issues with server load on FR prod, with downtimes being reported regularly. Last night, it was down for 8 minutes then 11 minutes. https://openfoodnetwork.slack.com/archives/GLJQ2LASF/p1727031576269109 Similar last weekend.

mkllnk commented 1 month ago

At least the distributor id would be helpful to index here.

I'm curious why you crossed this out, I'll assume that thought isn't complete and won't try to pick it up..

Well, order_cycles doesn't have a distributor id. It has the coordinator id but the distributors are in outgoing exchanges and they are indexed. The only fields on order_cycles worth indexing are the opening and closing time. But datetime doesn't always perform well with indexes, we need to proof the benefit. If the number of order cycles is the problem then maybe we need to archive old order cycles with a flag that's easily indexed.

RachL commented 1 month ago

@openfoodfoundation/developers There is a greenlight from budget circle on this first step. I've created #12870 Index Order cycles in Global clockify to track the time.

RachL commented 1 month ago

@nicolasVr I've missed a few meetings, but I don't see in the notes that a budget was dedicated to that, so I'm downgrading it to s3 // the global team won't be able to take this one before global core budget increases :(