Closed leonjza closed 1 year ago
Hi @leonjza , sorry you're running into some errors!
Can you please share what the results of the subquery are (if it's hard to run through the query resource, you can hardcode a domain and run the query ad-hoc):
select
ip
from
net_dns_record
where
domain = "google.com"
and ip is not null
Also, can you please also share the results of running explain
on each of the queries you've tried, e.g.,
explain
select
ip_address,
jsonb_array_elements_text(domains) as domain
from
net_dns_reverse
where
ip_address in
(
select
ip
from
net_dns_record
where
domain = "google.com"
and ip is not null
)
and
explain
select
*,
ip_address,
jsonb_array_elements_text(domains) as domain
from
net_dns_reverse
where
ip_address in
(
select
ip
from
net_dns_record
where
domain = "google.com"
and ip is not null
)
Sometimes Steampipe's query planner doesn't identify key columns correctly, like ip_address
, which is required for any net_dns_reverse
table queries, which is why your first query may be failing incorrectly. The results from the queries I shared above should help us identify if this is the issue.
@leonjza Along with the details that @cbruno10 asked above, could you please try the same query by adding an order by
clause? For example:
select
ip_address,
jsonb_array_elements_text(domains) as domain
from
net_dns_reverse
where
ip_address in (
select
ip
from
net_dns_record
where
domain = 'google.com'
and ip is not null
order by ip
)
@cbruno10 and @Subhajit97, answered both below. Thanks for the fast replies ❤️
Can you please share what the results of the subquery are (if it's hard to run through the query resource, you can hardcode a domain and run the query ad-hoc):
select ip from net_dns_record where domain = "google.com" and ip is not null
> select
ip
from
net_dns_record
where
domain = 'google.com'
and ip is not null
+------------------------------+
| ip |
+------------------------------+
| 172.217.170.46 |
| 2c0f:fb50:4002:805::200e/128 |
+------------------------------+
Also, can you please also share the results of running
explain
on each of the queries you've tried, e.g.,explain select ip_address, jsonb_array_elements_text(domains) as domain from net_dns_reverse where ip_address in ( select ip from net_dns_record where domain = "google.com" and ip is not null )
+----------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
+----------------------------------------------------------------------------------------------------------------------------------------------------+
| ProjectSet (cost=0.00..20000100278749.00 rows=50000000 width=64) |
| -> Nested Loop Semi Join (cost=0.00..20000100024999.00 rows=500000 width=64) |
| -> Foreign Scan on net_dns_reverse (cost=0.00..20000000000000.00 rows=1000000 width=200) |
| -> Foreign Scan on net_dns_record (cost=0.00..200.00 rows=1 width=200) |
| Filter: ((ip IS NOT NULL) AND (domain = 'google.com'::text) AND (net_dns_reverse.ip_address = ip) AND (domain = 'google.com'::text)) |
+----------------------------------------------------------------------------------------------------------------------------------------------------+
and
explain select *, ip_address, jsonb_array_elements_text(domains) as domain from net_dns_reverse where ip_address in ( select ip from net_dns_record where domain = "google.com" and ip is not null )
+---------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
+---------------------------------------------------------------------------------------------------------+
| ProjectSet (cost=20000000002500.00..20000000376256.00 rows=50000000 width=160) |
| -> Nested Loop (cost=20000000002500.00..20000000122506.00 rows=500000 width=96) |
| -> HashAggregate (cost=20000000002500.00..20000000002502.00 rows=200 width=200) |
| Group Key: net_dns_record.ip |
| -> Foreign Scan on net_dns_record (cost=0.00..20000000000000.00 rows=1000000 width=200) |
| Filter: ((ip IS NOT NULL) AND (domain = 'google.com'::text)) |
| -> Foreign Scan on net_dns_reverse (cost=0.00..600.00 rows=2 width=300) |
| Filter: (ip_address = net_dns_record.ip) |
+---------------------------------------------------------------------------------------------------------+
@leonjza Along with the details that @cbruno10 asked above, could you please try the same query by adding an
order by
clause? For example:select ip_address, jsonb_array_elements_text(domains) as domain from net_dns_reverse where ip_address in ( select ip from net_dns_record where domain = 'google.com' and ip is not null order by ip )
Works!
> select
ip_address,
jsonb_array_elements_text(domains) as domain
from
net_dns_reverse
where
ip_address in (
select
ip
from
net_dns_record
where
domain = 'google.com'
and ip is not null
order by ip
)
+----------------+----------------------------+
| ip_address | domain |
+----------------+----------------------------+
| 172.217.170.46 | jnb02s03-in-f14.1e100.net. |
+----------------+----------------------------+
Curious, what can I use as a hint to know to use the order by?
@leonjza, per the net_dns_reverse
table design, the column ip_address
is required when running any query against that table.
If you check the query plan for the query without order by
:
+----------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
+----------------------------------------------------------------------------------------------------------------------------------------------------+
| ProjectSet (cost=0.00..20000100278749.00 rows=50000000 width=64) |
| -> Nested Loop Semi Join (cost=0.00..20000100024999.00 rows=500000 width=64) |
| -> Foreign Scan on net_dns_reverse (cost=0.00..20000000000000.00 rows=1000000 width=200) |
| -> Foreign Scan on net_dns_record (cost=0.00..200.00 rows=1 width=200) |
| Filter: ((ip IS NOT NULL) AND (domain = 'google.com'::text) AND (net_dns_reverse.ip_address = ip) AND (domain = 'google.com'::text)) |
+----------------------------------------------------------------------------------------------------------------------------------------------------+
You can see the query against the net_dns_reverse
table executes first, but the ip_address
depends on the output of the net_dns_record
table. Hence the error comes.
Filter: ((ip IS NOT NULL) AND (domain = 'google.com'::text) AND (net_dns_reverse.ip_address = ip) AND (domain = 'google.com'::text))
If you add the order by
clauses, often that will trick/change the planer. You can see that after using order by
clauses, the planner uses the below filter, where it takes the ip_address
from the net_dns_record
table output:
Foreign Scan on net_dns_reverse (cost=0.00..400.00 rows=2 width=200) |
| Filter: (ip_address = net_dns_record.ip)
@kaidaguerre Based on the results @leonjza shared in https://github.com/turbot/steampipe-plugin-net/issues/61#issuecomment-1593231257, why does adding *
to the select
statement cause the query planner to pick up the qual for ip_address
correctly?
Unfortunately, the Postgres query planner is a black box and it's very hard to pin down exactly what causes it to come up with the plan - it applies various heuristics.
The FDW attempts to influence the query planner by reporting a cheap cost for paths which provide required key columns, which does help, but we have no direct control of how it plans its queries, so it is sometimes necessary to be creative with how the queries are written to achieve the desired plan.
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.
Hey @leonjza , I believe the workaround of using order by
@Subhajit97 provided in https://github.com/turbot/steampipe-plugin-net/issues/61#issuecomment-1593318970 should work for you. Unfortunately, the FDW sometimes requires tricks, especially when working with required key columns.
If you're still seeing issues though, please let us know!
Describe the bug Selecting specific columns from
net_dns_reverse
appears to break with the error:Steampipe version (
steampipe -v
)Plugin version (
steampipe plugin list
)To reproduce
Write a query such as this:
Using the
steampipe query
tool, run the query like this:Expected behavior The query should return results without an error.
Additional context The error can be avoided by adding the wildcard column selector (aka:
*
). For exaxmpleIn this case, the results are successful, but with more columns than intended.