pramsey / pgsql-ogr-fdw

PostgreSQL foreign data wrapper for OGR
MIT License
239 stars 34 forks source link

Ability to rename the fid column #103

Open robe2 opened 8 years ago

robe2 commented 8 years ago

If this isn't possible, then ogr_fdw should at least throw an error telling you that.

Here is my situation. I have this SQLite database with primary keys and GDAL has this less than useful feature of renaming my primary keys in the database to fid.

This would be all fine if I could rename them back to the name they originally were.

I can rename other fields, but for some reason, when I try to rename the fid column, I just get NULLs in the fields back. I also tried using the original name and that didn't work either. I even made sure the casing was the same.

Here is an example -First tried this

CREATE FOREIGN TABLE testentry
   (testid bigint  OPTIONS (column_name 'fid'),
    testsheetid integer ,
    equipid integer ,
    testtypeid integer ,
    alarm integer ,
    comment character varying )
   SERVER fds_test_data
   OPTIONS (layer 'testentry');

Then I tried this:


CREATE FOREIGN TABLE testentry
   (testid bigint  OPTIONS (column_name 'testid'),
    testsheetid integer ,
    equipid integer ,
    testtypeid integer ,
    alarm integer ,
    comment character varying )
   SERVER fds_test_data
   OPTIONS (layer 'testentry');

I tried with ESRI shape files as well and that didn't work either, though in case of ESRI shape the FID is a made-up field anyway.

I have the same issue with MSSpatial which is why I always use the ODBC driver when I only care about the non-relational part. ODBC seems conveniently stupid enough not to notice I have primary keys that it can rename to fid.

pramsey commented 8 years ago

OK, so I looked at this. It would only be possible if we added another column-level OPTION that we use to declare "this column is the fid". Because right now the very presence of the name "fid" is being used for that purpose. Unfortunately, even if we add the OPTION, the column name "fid" will continue to have that special meaning, simply for backward compatibility reasons. But at least it would then be possible to have non-"fid" named columns.

robe2 commented 8 years ago

Not following why you need another option.
Why can't it check if OPTIONS (column_name = 'fid') or the configured column = fid to match?

or are you worried about people doing both? Having an fid column and also one they map to fid.

pramsey commented 8 years ago

"fid" isn't a name, it's a concept, it's just a concept I've mapping to a name because I need to know what PgSQL column is serving as the OGR FID, and I need to know it in the absence of an OGR connection to disambiguate the PgSQL table definition. In the presence of a "is fid" column option, it would be possible to have a non-fid column named "fid", yes. Though in the absence of such an option, the logic would have to default back to preferring the name "fid" as the fid column.

In the Oracle FDW driver, for example, there's an option to mark a particular column as the primary key. Which is kind of odd/unfortunate, since FDW doesn't allow you to declare primary keys using normal PgSQL mechanisms. But the Oracle driver needs that marker for similar reasons as us: if doing writable FDW you need to know in advance what unique key you're using to drive column alterations.

robe2 commented 8 years ago

So you'd rather define a concept of what is the fid column rather than relying on the column_name = "fid" because someone might want to use that to mean they want the column name "fid" (which happens to exist in their data)? Or is the idea, that then you can bring in the FID column as the real name (in this case testid) (and just flag it as is_fid)? which would be nice, but a breaking change.

When I do ogrinfo of my table in question I see this which does show me that testid is the the "FID column"

Layer name: testentry Geometry: None Feature Count: 8 Layer SRS WKT: (unknown) FID Column = testid testsheetid: Integer (0.0) equipid: Integer (0.0) testtypeid: Integer (0.0) alarm: Integer (0.0) comment: String (0.0) OGRFeature(testentry):2 testsheetid (Integer) = 1 equipid (Integer) = 1 testtypeid (Integer) = 1 alarm (Integer) = 2 comment (String) = high oil

Ideally I would rather it be brought in by its rightful name (and you just flag (column_name = 'fid') and use that as your "fid concept" There is a good chance if their column is called fid, then it is the "FID" in their dataset, and if it isn't, they are currently screwed as it is anyway.

Looking at the code (from my armchair), it seems only slightly more difficult to read fid from what user passes in (column_name = 'fid') than from pg column name except in case where there really is an fid.

The idea of a primary key having to be an integer/long integer at least in the mind of a database purist is already a broken concept. Sure often it is by those ORM nonsense things, but if you introduced this extra bit (YOU GET TO FLAG what you want to use as PRIMARY KEY) then you'd have to allow for that bit to not be an integer and then later for that bit to be made of a set of columns otherwise db people would still be unhappy and everyone would be confused. Or is that the path you are going toward - to allow non-integers as primary keys? which would be great by the way :+1:

The irony of this is that if the primary key (which really is a primary key) wasn't brought in as fid (if I had a dummer OGR driver), I wouldn't even have this problem since I could then read the most important part of my data by its meaningful name.

pramsey commented 7 years ago

Just a quick note that (a) there are no primary keys in FDW and (b) the "fid" concept is an OGR thing, and OGR expects all FIDs to be long ints. So thus they shall always be.

dncpax commented 5 years ago

I may late to the party but I think I'm running into this awkward ogr thing. So it seems to be impossible to get a column named id from a datasource, and keep it named id. For instance a sql server table with a column "id" is translated to "fid". That seems counter intuitive but the worst part for me is that it forces the rewriting of all sql one may have that uses column id.

Am I missing something?

Thanks.

pramsey commented 5 years ago

No, this is more-or-less exactly what we're talking about. OGR is seeing your primary key and saying "aha, that's the "fid"" and the FDW is then using that column name in the FDW table. As above, we'd need an option to allow FDW to flag a particular column as being the fid, which in the case of databases would probably get fiddly since they support non-integral PKs.

dncpax commented 5 years ago

ok, thanks for clarifying. That's a strange thing as seen from an end user... I'd be ok with having a requirement of an integer PK. This auto-renaming is a bit disconcerting. For now I'm just rewriting sql to use "fid" instead of "id"...

pramsey commented 5 years ago

Well, maybe it’s time to bit the bullet and make the change to allowing a column signifier to point out the PK instead of using the name. I’m not sure how much backwards compatibility will matter with these things.

dncpax commented 5 years ago

I would if I could... (meaning know how)