ppetzold / nestjs-paginate

Pagination and filtering helper method for TypeORM repositories or query builders using Nest.js framework :book::paperclip:
MIT License
451 stars 100 forks source link

Still having issues with AND filters on same column #821

Open Alfagun74 opened 1 year ago

Alfagun74 commented 1 year ago

Hey,

I'm still having trouble with the query string &filter.tags.name=$eq:2D&filter.tags.name=$and:$eq:Singleplayer. The issue seems to stem from applying the equality operator $eq separately to each tag name. The 'tags' field is an array of objects, and the current method doesn't give the results I want.

The SQL query it generates is trying to find games with both tags at the same time using the equality operator, resulting in an empty dataset. I'd prefer a functionality like $contains for more complex arrays or a solution allowing the use of 'AND' conditions with these specific tags.

Here's the SQL example it generates:

WHERE ( 1=1 AND ("__root_tags_rel"."name" = ? AND "__root_tags_rel"."name" = ?) ) AND ( "__root"."deleted_at" IS NULL ) -- PARAMETERS: ["2D","Singleplayer"]

The tags are stored as objects with associated data, like:

Game -> Tags -> Tag 1 -> name: Singleplayer <- filter
                      relatedGames
                      id
                      etc
             -> Tag 2 -> name: 2D <- filter
                      relatedGames
                      id
                      etc

I want to find games with both 'Singleplayer' and '2D' tags. While the $in operator helps find games with either tag, it doesn't meet the requirement of finding games only with both tags.

Is there a way to achieve this or a workaround allowing the use of 'AND' conditions for these specific tags or using $contains for more complicated arrays?

Am i doing something wrong? Do you have a test that covers this maybe?

By the way: ?filter.tags.name=$contains:2D,Singleplayer results in

WHERE ( 1=1 AND ("__root_tags_rel"."name" @> ?) ) AND ( "__root"."deleted_at" IS NULL )) "distinctAlias" ORDER BY "distinctAlias"."__root_id" ASC NULLS LAST, "__root_id" ASC LIMIT 100 -- PARAMETERS: [["2D","Singleplayer"]]
error:   ┏ error +2s
error:   ┃ [ 1] [Unhandled SqliteError occurred: unrecognized token: "@"]
error:   ┃ [ 2] {
error:   ┃ [ 3]   context: 'LoggingExceptionFilter',
error:   ┃ [ 4]   error: SqliteError: unrecognized token: "@"

Related: https://github.com/ppetzold/nestjs-paginate/issues/402

Additional Info:

paginate(query, this.gamesRepository, {
      paginationType: PaginationType.TAKE_AND_SKIP,
      defaultLimit: 100,
      maxLimit: NO_PAGINATION,
      nullSort: "last",
      relations: ["tags"],
      sortableColumns: [
        "id",
        "title",
        "release_date",
        "rawg_release_date",
        "created_at",
        "size",
        "metacritic_rating",
        "average_playtime",
        "early_access",
        "type",
      ],
      searchableColumns: ["title", "description"],
      filterableColumns: {
        id: true,
        title: true,
        release_date: true,
        created_at: true,
        size: true,
        metacritic_rating: true,
        average_playtime: true,
        early_access: true,
        type: true,
        "genres.name": true,
        "tags.name": true,
      },
      withDeleted: false,
    });
ppetzold commented 1 year ago

$contain only works with arrays. not 1:m

$and query looks good to me. possibly a bug? 🤔 what's the full query it generates? and what do you think is missing to give you the correct result?

Alfagun74 commented 1 year ago

Hi there @ppetzold , just returned from a short trip.

Using &filter.tags.name=$eq:2D&filter.tags.name=$and:$eq:Singleplayer or &filter.tags.name=$eq:2D&filter.tags.name=$eq:Singleplayer which should do the same i thing

It ran the following query:

SELECT DISTINCT "distinctAlias"."__root_id" AS "ids___root_id",
                "distinctAlias"."__root_id"
FROM
  (SELECT "__root"."id" AS "__root_id",
          "__root"."created_at" AS "__root_created_at",
          "__root"."updated_at" AS "__root_updated_at",
          "__root"."deleted_at" AS "__root_deleted_at",
          "__root"."entity_version" AS "__root_entity_version",
          "__root"."rawg_id" AS "__root_rawg_id",
          "__root"."title" AS "__root_title",
          "__root"."rawg_title" AS "__root_rawg_title",
          "__root"."version" AS "__root_version",
          "__root"."release_date" AS "__root_release_date",
          "__root"."rawg_release_date" AS "__root_rawg_release_date",
          "__root"."cache_date" AS "__root_cache_date",
          "__root"."file_path" AS "__root_file_path",
          "__root"."size" AS "__root_size",
          "__root"."description" AS "__root_description",
          "__root"."website_url" AS "__root_website_url",
          "__root"."metacritic_rating" AS "__root_metacritic_rating",
          "__root"."average_playtime" AS "__root_average_playtime",
          "__root"."early_access" AS "__root_early_access",
          "__root"."type" AS "__root_type",
          "__root"."box_image_id" AS "__root_box_image_id",
          "__root"."background_image_id" AS "__root_background_image_id",
          "__root_box_image_rel"."id" AS "__root_box_image_rel_id",
          "__root_box_image_rel"."created_at" AS "__root_box_image_rel_created_at",
          "__root_box_image_rel"."updated_at" AS "__root_box_image_rel_updated_at",
          "__root_box_image_rel"."deleted_at" AS "__root_box_image_rel_deleted_at",
          "__root_box_image_rel"."entity_version" AS "__root_box_image_rel_entity_version",
          "__root_box_image_rel"."source" AS "__root_box_image_rel_source",
          "__root_box_image_rel"."path" AS "__root_box_image_rel_path",
          "__root_box_image_rel"."media_type" AS "__root_box_image_rel_media_type",
          "__root_box_image_rel"."uploader_id" AS "__root_box_image_rel_uploader_id",
          "__root_tags_rel"."id" AS "__root_tags_rel_id",
          "__root_tags_rel"."created_at" AS "__root_tags_rel_created_at",
          "__root_tags_rel"."updated_at" AS "__root_tags_rel_updated_at",
          "__root_tags_rel"."deleted_at" AS "__root_tags_rel_deleted_at",
          "__root_tags_rel"."entity_version" AS "__root_tags_rel_entity_version",
          "__root_tags_rel"."rawg_id" AS "__root_tags_rel_rawg_id",
          "__root_tags_rel"."name" AS "__root_tags_rel_name"
   FROM "game" "__root"
   LEFT JOIN "image" "__root_box_image_rel" ON "__root_box_image_rel"."id"="__root"."box_image_id"
   AND ("__root_box_image_rel"."deleted_at" IS NULL)
   LEFT JOIN "game_tags_tag" "__root___root_tags_rel" ON "__root___root_tags_rel"."game_id"="__root"."id"
   LEFT JOIN "tag" "__root_tags_rel" ON "__root_tags_rel"."id"="__root___root_tags_rel"."tag_id"
   AND ("__root_tags_rel"."deleted_at" IS NULL)
   WHERE (1=1
          AND ("__root_tags_rel"."name" = $1
               AND "__root_tags_rel"."name" = $2))
     AND ("__root"."deleted_at" IS NULL)) "distinctAlias"
ORDER BY "distinctAlias"."__root_id" ASC NULLS LAST,
                                         "__root_id" ASC
LIMIT 100 -- PARAMETERS: ["2D","Singleplayer"]

It seems like there's an issue with the query. Specifically in the WHERE clause. The condition

("__root_tags_rel"."name" = $1 AND "__root_tags_rel"."name" = $2)

is looking for one tag where the name is both "2D" and "Singleplayer" simultaneously, which is not possible.

Correct me if im wrong but if we want to find entries that have both the "2D" and "Singleplayer" tags (among others of course), we would need to check for each tag separately and use an appropriate JOIN condition, right?

This is really a bummer, as filters should always narrow down instead of broadening the search and i can't implement that behaviour with nestjs-paginate right now. :(

Alfagun74 commented 11 months ago

@ppetzold Hi Philipp, I hope you're doing well. Just checking if you had a moment to look into the issue. Not wanting to bother you, but i am about to publish a larger release of my backend on christmas eve to about 2000 users, and the filters on the same columns are causing me trouble ux-wise by broadening searches instead of narrowing them down.

No rush, but if you could take a peek before, it would be a huge help. Thanks a bunch for maintaining this and happy holidays in advance! 🎄😊

ppetzold commented 11 months ago

does &filter.tags.name=$eq:2D&filter.tags.name=$or:$eq:Singleplayer work?

otherwise I suggest to get a bit hacky for your release. like handle the query param filter.tags manually and pass to where config as needed.

there are just so many edge cases on filtering on relations which are not handled well yet. relationships type, inner/outer join etc. def needs more love :D

vsamofal commented 11 months ago

yes, it isn't possible to generalize this anyhow as for me.

what Alfagun want's is tricky, and I even not sure what is the proper way to do it in terms of performance.

let's say we have blogs and blog categories, and want to get all blogs that belongs to two categories A and B

and so do it we can something like this:

SELECT b.*
FROM blogs b
JOIN blog_categories bc ON b.blog_id = bc.blog_id
WHERE bc.category_name IN ('A', 'B')
GROUP BY b.blog_id
HAVING COUNT(DISTINCT bc.category_name) = 2;

or like this

SELECT *
FROM blogs b
WHERE EXISTS (
    SELECT 1
    FROM blog_categories bc1
    WHERE bc1.blog_id = b.blog_id
    AND bc1.category_name = 'A'
) AND EXISTS (
    SELECT 1
    FROM blog_categories bc2
    WHERE bc2.blog_id = b.blog_id
    AND bc2.category_name = 'B'
);

anyway I don't think we can generalize is somehow, it's quite unique request, and that's not true that adding filters should always narrow down a search, open amazon and select brand a and brand b, you will see more results

I think we should just close this issue

Alfagun74 commented 11 months ago

@vsamofal

Certainly, because amazons intention is to maximize sales by showing you as much as possible to buy. However, in situations such as a digital library like mine, it is essential to use filters to narrow down searches according to individual preferences.

If you explore Steam and specifically search for Multiplayer Coop games, you'll notice fewer results compared to a general multiplayer search.

Similarly, when using booking.com to locate a hotel with a double bed near the city center, you wont receive hotels that are 10km away from the city center just because they have a double bed right?

It's undoubtedly a valid use case. Some services even offer both AND and OR based filtering with a switch.

image

image

vsamofal commented 11 months ago

@Alfagun74 your example with a booking is not valid for this case, because this filter will be on different fields, and your use case is the same field, where "or" condition is natural and has nothing to do with amazon willing to sale more

your use case unique and can't be generalized as for me, that's why you can build custom where condition as Phillip suggested

Alfagun74 commented 11 months ago

@vsamofal Excuse me for the following sarcastic answer, but well, congratulations on your keen insight into booking.com's data structure. I must have missed the memo about you being their top data architect. My apologies suggesting such a unreasonable use case.

But fear not, for I shall enlighten you further. Prepare to be mindblown by 10 more examples off the top of my head where my use case isn't just a unique snowflake:

  1. Filtering blog posts by multiple tags (e.g., "programming" AND "tips").
  2. Searching for recipes with specific ingredients (e.g., "chocolate" AND "almonds").
  3. Academic research papers tagged with both "science" AND "technology."
  4. Filtering Job listings requiring skills in both "Python" AND "JavaScript."
  5. News articles covering events in both "politics" AND "economy."
  6. Music tracks categorized as both "rock" AND "instrumental."
  7. Movie recommendations with genres like "comedy" AND "thriller."
  8. Software tools compatible with both platforms "Windows" AND "Mac."
  9. Finding Digital art showing both "impressionism" AND "surrealism" styles.
  10. DIY projects combining materials like "wood" AND "metal."

You do see a pattern... right? As soon as you group entities into n:n-relationships with categories, tags, platforms, time-periods or genres and want to offer a search to find combinations of these you will bump into my problem. Do you still find my use case as rare as before?

With the exception being of course, that i save the tags as a string array and use $contain, which is absolutely horrible if i want to be able to change tags.

vsamofal commented 11 months ago

@Alfagun74 I provided you queries that may do this, but because it's many to many and you need to have 2+ rows on this after joining it's tricky, I'm not really sure how we can generalize it, and PR's are always welcome here, if you have an idea how to do it properly without breaking whatever is already available.