tds-fdw / tds_fdw

A PostgreSQL foreign data wrapper to connect to TDS databases (Sybase and Microsoft SQL Server)
Other
380 stars 102 forks source link

Joining three tables returns no rows (or one row) #149

Open RoyWiggins opened 6 years ago

RoyWiggins commented 6 years ago

This is the query I am running in the actual Microsoft Sql Server database, which works well:

select * from Foo join FooRead on Foo.FooId = FooRead.FooId join Asset on Asset.AssetId = Foo.AssetId

The equivalent in postgres, through tds_fdw, is returning only one row when there should be many.

select * from "Foo" as foo
    join "FooRead" as foo_read on foo."FooId"::int = foo_read."FooId"::int
    join "Asset" asset on asset."AssetId" = foo."AssetId";

Each joining Foo and Asset, or Foo and FooRead works fine on its own and composing them separately works:

select * from 
 ( select * from "Foo" as foo join "Asset" as asset on asset."AssetId" = foo."AssetId" ) asset_foos
     join "FooRead" as foo_read on
          asset_foos."FooId"::int = foo_read."FooId"::int

Do you have any ideas why it might be doing this?

RoyWiggins commented 6 years ago

Update: So, this started working again with no intervention. However, now WHERE clauses after queries sometimes return no data, even when an un-WHERE'd query returns rows that absolutely 100% satisfy the WHERE. Ex something like this might work:

select * from "Foo" as foo
    join "FooRead" as foo_read on foo."FooId"::int = foo_read."FooId"::int
    join "Asset" asset on asset."AssetId" = foo."AssetId"
    where foo.id = 56;

but setting foo.id = 57 doesn't, even though there emphatically are rows in the joined table that satisfy it. This is disconcerting and we're not sure how to debug it.

leftiness commented 6 years ago

I've faced similar problems. The gist is that sometimes I get one record selected from an MSSQL foreign table when I know there are more. It has weird downstream results when applying other joins and wheres.

I haven't found a simple way to reproduce the problem. The failures come from somewhat more complex queries. Not seriously more complex. Maybe just a few more inner joins and some more columns. I can't share real details. The brass would never approve it.

Here are some simplified workarounds.

  1. Replace left join where false with where not in
-- failure
select foo.id
from foo
left join bar
on bar.foo_id = foo.id
where bar.id is null
;

-- success
select foo.id
from foo
where foo.id not in (select foo_id from bar)
;
  1. Re-order inner joins
-- failure
select foo.id
from foo
inner join bar
on bar.foo_id = foo.id
;

-- success
select foo.id
from bar
inner join foo
on foo.id = bar.foo_id
;
  1. Inner join on self
-- failure
select foo.id
, foo.name
from foo
;

--- success
select foo.id
, foo2.name
from foo
inner join foo as foo2
on foo2.id = foo.id
;

Unfortunately, I don't think I can trust the workarounds, so I'll have to find an alternative.

RoyWiggins commented 6 years ago

We went with pulling the underlying tables over with several materialized views and querying those in Postgres. Since the tables aren't that large and we didn't need real-time data it was okay to just refresh the views overnight.

On Wed, Apr 11, 2018, 5:48 PM Brandon Parmenter notifications@github.com wrote:

I've faced similar problems. The gist is that sometimes I get one record selected from an MSSQL foreign table when I know there are more. It has weird downstream results when applying other joins and wheres.

I haven't found a simple way to reproduce the problem. The failures come from somewhat more complex queries. Not seriously more complex. Maybe just a few more inner joins and some more columns. I can't share real details. The brass would never approve it.

Here are some simplified workarounds.

  1. Replace left join where false with where not in

-- failureselect foo.idfrom fooleft join baron bar.foo_id = foo.idwhere bar.id is null ; -- successselect foo.idfrom foowhere foo.id not in (select foo_id from bar) ;

  1. Re-order inner joins

-- failureselect foo.idfrom fooinner join baron bar.foo_id = foo.id ; -- successselect foo.idfrom barinner join fooon foo.id = bar.foo_id ;

  1. Inner join on self

-- failureselect foo.id , foo.namefrom foo ; --- successselect foo.id , foo2.namefrom fooinner join foo as foo2on foo2.id = foo.id ;

Unfortunately, I don't think I can trust the workarounds, so I'll have to find an alternative.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/tds-fdw/tds_fdw/issues/149#issuecomment-380606539, or mute the thread https://github.com/notifications/unsubscribe-auth/AFC7Yjo17Shq4YD4doZZLmlB0vuswlHpks5tnnoqgaJpZM4Q-ANO .

On Wed, Apr 11, 2018, 5:48 PM Brandon Parmenter notifications@github.com wrote:

I've faced similar problems. The gist is that sometimes I get one record selected from an MSSQL foreign table when I know there are more. It has weird downstream results when applying other joins and wheres.

I haven't found a simple way to reproduce the problem. The failures come from somewhat more complex queries. Not seriously more complex. Maybe just a few more inner joins and some more columns. I can't share real details. The brass would never approve it.

Here are some simplified workarounds.

  1. Replace left join where false with where not in

-- failureselect foo.idfrom fooleft join baron bar.foo_id = foo.idwhere bar.id is null ; -- successselect foo.idfrom foowhere foo.id not in (select foo_id from bar) ;

  1. Re-order inner joins

-- failureselect foo.idfrom fooinner join baron bar.foo_id = foo.id ; -- successselect foo.idfrom barinner join fooon foo.id = bar.foo_id ;

  1. Inner join on self

-- failureselect foo.id , foo.namefrom foo ; --- successselect foo.id , foo2.namefrom fooinner join foo as foo2on foo2.id = foo.id ;

Unfortunately, I don't think I can trust the workarounds, so I'll have to find an alternative.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/tds-fdw/tds_fdw/issues/149#issuecomment-380606539, or mute the thread https://github.com/notifications/unsubscribe-auth/AFC7Yjo17Shq4YD4doZZLmlB0vuswlHpks5tnnoqgaJpZM4Q-ANO .

leftiness commented 6 years ago

Mm. Thanks for sharing. :musical_note:

I thought about materialized views. I guess a FDW select-star works reliably, so maybe it's a good workaround. Certainly better as a convention than doing weird workarounds like I wrote above.

I wanted to use FDW to eliminate that data duplication in the first place. That's why I'm going to look for alternatives.

leftiness commented 6 years ago

I tried CartoDB/odbc_fdw. The bug is still there. I'm using the same freetds-dev 1.00.44-r0 on Alpine 3.7. Looks like I was wrong to suspect tds_fdw.

Also. I extracted the data from the MSSQL, loaded it into a different foreign Postgres, and did that query. The bug was gone.

I suspect freetds.