sciencehistory / scihist_digicoll

Science History Institute Digital Collections

Export from cart feature: what if the cart contains thousands of works? Optimize. #2660

Open eddierubeiz opened 3 months ago

eddierubeiz commented 3 months ago

Once in a while, we're going to want to export thousands of items from the cart, so we want it to work well. We may need to spawn a job and show a progress bar (see e.g. the UI in the downloads controller.)

jrochkind commented 3 months ago

First step, test with varying numbers of works and determine roughly how long it takes to export (on staging or production) for X=so many works.
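A rough way to get a first feel for how export time scales with cart size, before measuring on staging, is a small timing harness like the sketch below. Everything in it is a stand-in: `SampleWork`, `export_csv`, and `time_exports` are hypothetical names, and the two-column CSV is far simpler than a real export, so the absolute numbers mean little; only the shape of the growth is of interest.

```ruby
require "benchmark"
require "csv"

# Hypothetical stand-in for a Work record; the real model has many more fields.
SampleWork = Struct.new(:friendlier_id, :title)

# Build a CSV string for the given works, one row per work.
def export_csv(works)
  CSV.generate do |csv|
    csv << %w[friendlier_id title]
    works.each { |work| csv << [work.friendlier_id, work.title] }
  end
end

# Time the export for each cart size; returns { size => seconds }.
def time_exports(sizes)
  sizes.to_h do |size|
    works = Array.new(size) { |i| SampleWork.new("id#{i}", "Title #{i}") }
    [size, Benchmark.realtime { export_csv(works) }]
  end
end

time_exports([100, 1_000, 5_000]).each do |size, seconds|
  puts format("%5d works: %.3f s", size, seconds)
end
```

On staging, the same idea would wrap the real export code and real cart contents instead of the stand-ins.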

It's also possible it could be sped up (streaming the CSV as you create it is hypothetically something that can be done with Rails), but I'm not sure whether this is a worthwhile avenue.
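The streaming idea can be sketched in plain Ruby, without Rails, as a lazy enumerator that produces one CSV line at a time instead of building the whole file in memory. `StreamedWork` and `csv_line_enumerator` are hypothetical names used only for illustration, and the two columns are placeholders for whatever the real export contains.

```ruby
require "csv"

# Hypothetical stand-in for a Work record.
StreamedWork = Struct.new(:friendlier_id, :title)

# Returns an Enumerator that generates one CSV line at a time, on demand.
# Nothing is read from `works` until a consumer asks for the next line.
def csv_line_enumerator(works)
  Enumerator.new do |yielder|
    yielder << CSV.generate_line(%w[friendlier_id title])
    works.each do |work|
      yielder << CSV.generate_line([work.friendlier_id, work.title])
    end
  end
end

works = [
  StreamedWork.new("abc123", "Lab notebook"),
  StreamedWork.new("def456", "Portrait, 1905")
]
csv_line_enumerator(works).each { |line| print line }
```

In a Rails controller, an object like this could in principle be used as the response body, with the works loaded in batches (for example via `find_each`). Whether the bytes actually reach the client incrementally depends on the middleware and server in front of the app, so that would need testing. Note also that streaming mainly helps memory use and time to first byte; the worker thread stays busy for as long as rows are being generated.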

eddierubeiz commented 2 days ago

It turns out exporting lots of works to a CSV is pretty fast. In staging:

eddie.works_in_cart << Work.where("friendlier_id ILIKE ?", "%a%") && ""
eddie.works_in_cart << Work.where("friendlier_id ILIKE ?", "%b%") && ""
eddie.works_in_cart << Work.where("friendlier_id ILIKE ?", "%c%") && ""
eddie.works_in_cart = Set.new(eddie.works_in_cart)
eddie.works_in_cart.count
=> 4947

This results in a 5.6 MB file that downloads in roughly 8.5 seconds. I'd argue that this is perfectly fine for very occasional staff use, and that there's no need to optimize the process.

jrochkind commented 2 days ago

> that downloads in roughly 8.5 seconds.

Just so we have a sense of what's going on, can you please check how long, on staging, the action actually takes for the Rails app to return a response? That should be in the logs. It shouldn't be more than the 8.5 seconds but could be a lot less, which would be good: the Rails worker may no longer be busy once it has handed off the data, and it shouldn't be waiting on the client's network speed.

So 8.5 seconds really is kind of too long to hold up a Puma worker.

But if we think this probably won't happen much anyway, and it's unlikely that more than one staff person would be doing it at once, I guess we'll just take the risk?

eddierubeiz commented 2 days ago

A little under 8 seconds, yeah.

INFO -- : [f8ced198-d881-42a5-8850-045788813e3a] method=POST path=/admin/cart_items/report format=html controller=Admin::CartItemsController action=report status=200 allocations=7091244 duration=7736.20 view=0.00 db=1903.91 ua=Chrome-129/Mac-10.15 ip=68.83.189.71
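To see where those 7.7 seconds go, the fields in the log line can be broken down with a few lines of Ruby. This is only a sketch: `parse_log_fields` and `other_ms` are hypothetical helpers, the log line is trimmed to the fields used, and it assumes that `duration`, `db`, and `view` are all in milliseconds and that the cart held the 4947 works from the earlier console session.

```ruby
# The log line quoted above, trimmed to the fields used here.
LOG_LINE = "method=POST path=/admin/cart_items/report status=200 " \
           "allocations=7091244 duration=7736.20 view=0.00 db=1903.91"

# Parse "key=value" tokens into a Hash of strings.
def parse_log_fields(line)
  line.scan(/(\w+)=(\S+)/).to_h
end

# Milliseconds not attributed to the database or to view rendering.
def other_ms(fields)
  (fields["duration"].to_f - fields["db"].to_f - fields["view"].to_f).round(2)
end

fields = parse_log_fields(LOG_LINE)
works_exported = 4947 # cart size from the earlier console session (assumed)

puts "outside db/view: #{other_ms(fields)} ms"
puts "per work: #{(fields["duration"].to_f / works_exported).round(2)} ms"
puts "allocations per work: #{fields["allocations"].to_i / works_exported}"
```

Under those assumptions, roughly 5.8 of the 7.7 seconds fall outside the database and view timings, which would suggest that most of the time goes to Ruby-side work such as building the rows, though the log line alone does not confirm that.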