django-cms / django-filer

File and Image Management Application for django
https://django-filer.readthedocs.io/
Other
1.73k stars 573 forks source link

Thumbnail Rendering Performance Issue with directory listing #1462

Open CuriousLearner opened 1 month ago

CuriousLearner commented 1 month ago

Hi,

I'm encountering a performance bottleneck while utilizing django-filer version 3.1.1 alongside Django version 4.2.13 and easy_thumbnails version 2.8.5.

The issue arises when attempting to render thumbnails for images within directories, particularly noticeable during the directory listing view. Upon investigation using django-debug-toolbar, it appears that the SQL query execution time is excessively prolonged, leading to server timeouts.

Here's a snapshot of the SQL query generated by the directory listing view: Screenshot 2024-05-02 at 4 26 34 PM (1)

The SQL query formed by the view above is as follows:

SELECT "filer_file"."id",
       "filer_file"."polymorphic_ctype_id",
       "filer_file"."folder_id",
       "filer_file"."file",
       "filer_file"."_file_size",
       "filer_file"."sha1",
       "filer_file"."has_all_mandatory_data",
       "filer_file"."original_filename",
       "filer_file"."name",
       "filer_file"."description",
       "filer_file"."owner_id",
       "filer_file"."uploaded_at",
       "filer_file"."modified_at",
       "filer_file"."is_public",
       "filer_file"."mime_type",
       COALESCE(CASE WHEN "filer_file"."name" = '' THEN NULL WHEN "filer_file"."name" IS NOT NULL THEN "filer_file"."name" ELSE NULL END, "filer_file"."original_filename") AS "coalesce_sort_field",
       (
            SELECT U0."name"
            FROM "easy_thumbnails_thumbnail" U0
            INNER JOIN "easy_thumbnails_source" U1
                ON (U0."source_id" = U1."id")
            WHERE (U0."modified" >= (U1."modified") AND U1."name" = ("filer_file"."file") AND NOT (U0."name"::text LIKE '%upscale%') AND U0."name"::text LIKE '%\_\_40x40\_%')
            ORDER BY U0."modified" DESC
            LIMIT 1
        ) AS "thumbnail_name",
        (
            SELECT U0."name"
            FROM "easy_thumbnails_thumbnail" U0
            INNER JOIN "easy_thumbnails_source" U1
                ON (U0."source_id" = U1."id")
            WHERE (U0."modified" >= (U1."modified") AND U1."name" = ("filer_file"."file") AND NOT (U0."name"::text LIKE '%upscale%') AND U0."name"::text LIKE '%\_\_80x80\_%')
            ORDER BY U0."modified" DESC
            LIMIT 1
        ) AS "thumbnailx2_name",
        "auth_user"."id",
        "auth_user"."password",
        "auth_user"."last_login",
        "auth_user"."is_superuser",
        "auth_user"."username",
        "auth_user"."first_name",
        "auth_user"."last_name",
        "auth_user"."email",
        "auth_user"."is_staff",
        "auth_user"."is_active",
        "auth_user"."date_joined"
    FROM "filer_file"
    LEFT OUTER JOIN "auth_user"
        ON ("filer_file"."owner_id" = "auth_user"."id")
    WHERE "filer_file"."folder_id" = 289
    ORDER BY LOWER(COALESCE(CASE WHEN "filer_file"."name" = '' THEN NULL WHEN "filer_file"."name" IS NOT NULL THEN "filer_file"."name" ELSE NULL END, "filer_file"."original_filename")) ASC

To mitigate this performance concern, I've created two indexes which seem to have a positive impact on query execution time:

CREATE INDEX IF NOT EXISTS px_easy_thumbnails_thumbnail_source_id ON easy_thumbnails_thumbnail (source_id);
CREATE INDEX IF NOT EXISTS px_filer_file_owner_id ON filer_file (owner_id);

After Index creation, the query time was reduced significantly:

Screenshot 2024-05-02 at 5 04 22 PM

However, despite these optimizations, I'm uncertain if there are additional steps or considerations I may have overlooked. Your insights into resolving this performance issue would be greatly appreciated.

Thank you for your attention to this matter.

CuriousLearner commented 4 days ago

Hi,

I finally had some time to dig deeper into this issue and discovered that foreign keys, which should have already been indexed, were present. Writing some tests made me realize that the indexes I was adding were indeed improving performance, but they should have been there from the beginning.

Additionally, I wrote a test to compare indexes and found that I was duplicating indexes on the test database rather than the actual database.

In summary, there seems to be a problem with the project's database where these indexes were missing. I decided to add the necessary indexes.

I'm including the SQL here in case it helps anyone (though ideally, you won't need it 😅).

CREATE INDEX IF NOT EXISTS easy_thumbnails_thumbnail_source_id_5b57bc77 ON easy_thumbnails_thumbnail (source_id);
CREATE INDEX IF NOT EXISTS filer_file_owner_id_b9e32671 ON filer_file (owner_id);
fsbraun commented 4 days ago

@CuriousLearner Thanks for the update! I'd love to keep this open for the moment, to check if we need to add indexes. This probably needs to happen through Django's ORM, and might also affect easy thumbnails.