Open gabe-gfm opened 8 months ago
@kaidaguerre, Could you please look into this issue?
Sorry for the delay @gabe-gfm , sure we'll look into this 👍
Perhaps this is related? https://github.com/turbot/steampipe-postgres-fdw/pull/440
@jreyesr @kaidaguerre
Unfortunately, this was not resolved using the most aws plugin + postgres foreign data wrapper (us-docker.pkg.dev/steampipe/steampipe/fdw:1.11.0-rc.6
). It still returns all rows when attempting to filter for a list of values inside the jsonb image_tags array.
I did notice a particular log surface in my SQL client (not steampipe query):
2024-04-29 17:22:28 fdwGetForeignPlan - best_path->fdw_private is NULL. Defaulting to setting canPushdownAllSortFields to true
I identified a better workaround for anyone else encountering this.
You may specify MATERIALIZED
(docs) in a Postgres query to prevent the demonstrated incorrect query filter results.
MATERIALIZED
is an optimization hint resulting in explicit fetching of a common table express (CTE) result without trying to push down potential optimizations into the query plan.
This example query gave me correctly filtered results.
-- id_tag_repo is a mock for identifying interesting AWS container task definitions linked to ECR images
with id_tag_repo as
(
select 1 as id, to_jsonb('ZZZ'::text) as image_tag, 'my-repo-A' as image_repository
union
select 2 as id, '"latest"'::jsonb as image_tag, 'my-repo-B' as image_repository
union
select 3 as id, '"latest"'::jsonb as image_tag, 'my-repo-A' as image_repository
),
ecr_info as MATERIALIZED -- MATERIALIZED used to work around steampipe-fdw bug
(
select repository_name, image_tags
from aws_ecr_image
)
select
id_tag_repo.image_tag as target_tag
, id
, ecr.image_tags @> image_tag as is_tag_in_tags -- DEBUG: Matches where condition. Expect to see only true rows
, repository_name
, image_tags
from ecr_info ecr
join id_tag_repo
on ecr.repository_name = id_tag_repo.image_repository
where
ecr.image_tags @> image_tag
@gabe-gfm good that you've found a workaround. I need to dig into why the JSON join condition is causing this issue
Describe the bug I'm seeing unexpected results using a
WHERE
clause on theaws_ecr_image
table. All rows from aJOIN
condition are being returned even though the query specifies most rows should be excluded by aWHERE
condition.This reproduction case involves joining a second table to provide a target label string for evaluating whether the target is in the JSONB list of
aws_ecr_image.image_tags
.Insightfully, I found the problem does not occur qhen querying local-table-copy of the
aws_ecr_image
table. This leads me to suspect it's a foreign-data-wrapper bug. But I'm not confident in which project the bug exists (steampipe-postgres-fdw
,steampipe-plugin-aws
, orsteampipe
?).Steampipe version v0.21.4
Plugin version (
steampipe plugin list
) turbot/aws 0.132.0To reproduce
aws_ecr_image
image_tag
inimage_tags
using the>@
JSONB inclusion test.Example query
is_tag_in_tags
column may show true, false, null, but the where condition should limit it to only true.Buggy Results:
Expected behavior The expected behavior can be easily demonstrated by running the same query on a local copy of the a foreign-data-wrapper table.
from ecr_copy
rather thanfrom aws_ecr_image
is_tag_in_tags
column shows true.Expected results:
Additional context I found it difficult to discover how I might locally rebuild
steampipe-ftw
or the project's postgres container to investigate if this could be a known postgres bug.