web3-storage / web3.storage

DEPRECATED ⁂ The simple file storage service for IPFS & Filecoin
https://web3.storage
Other
503 stars 119 forks source link

Storage limit - Performance improv (index not used) #1642

Closed flea89 closed 2 years ago

flea89 commented 2 years ago

Context

TL;DR pin_content_cid_status_idx exists in both Prod and Staging but for some reason, it is not used in prod.

Update Content DAG Sizes performance greatly differs between prod and staging, despite the tables' sizes being comparable and the same indexes existing.

By digging into it, I noticed the following.

Explaining this query (both prod and staging):

  SELECT *
   FROM (
      SELECT COALESCE((
        SELECT SUM(dag_size) 
        FROM (
          SELECT  c.cid,
                  c.dag_size
          FROM upload u
          JOIN content c ON c.cid = u.content_cid
          JOIN pin p ON p.content_cid = u.content_cid
          WHERE u.user_id = 315318734258456601
          AND u.deleted_at is null
          AND p.status = 'Pinned'
          GROUP BY c.cid,
                  c.dag_size
        ) AS uploaded_content), 0)
    ) as up,
    (
      SELECT COALESCE((
        SELECT SUM(dag_size)
        FROM (
          SELECT  psa_pr.content_cid,
                  c.dag_size
          FROM psa_pin_request psa_pr
          JOIN content c ON c.cid = psa_pr.content_cid
          JOIN pin p ON p.content_cid = psa_pr.content_cid
          JOIN auth_key a ON a.id = psa_pr.auth_key_id
          WHERE a.user_id = 315318734258456601
          AND psa_pr.deleted_at is null
          AND p.status = 'Pinned'
          GROUP BY psa_pr.content_cid,
                  c.dag_size
        ) AS pinned_content), 0)
    )as psa;

(which should be equivalent to the remote procedure FUNCTION user_used_storage(query_user_id BIGINT) used to calculate storage limits) it is executed differently in the 2 envs.

Here you can find attached the 2 execution plans, and as you can see in prod is doing a Seq Scan of pins instead of using the index pin_content_cid_status_idx That seq scan could explain the huge difference in performance between the 2 env.

But I can’t find an obvious explanation why the 2 plans are different. The only difference I can think of is that pin_content_cid_status_idx was created CONCURRENTLY in prod, see here…. I know it can fail sometimes, (see this article) but by running

SELECT * FROM pg_class, pg_index WHERE pg_index.indisvalid = false AND pg_index.indexrelid = pg_class.oid;

the index doesn’t show up suggesting it’s valid and ready? prod staging

flea89 commented 2 years ago

Solved by dropping the index and re-creating