pgRouting / osm2pgrouting

Import tool for OpenStreetMap data to pgRouting database
https://pgrouting.org
GNU General Public License v2.0
286 stars 113 forks source link

Change 'gid' column in 'ways' table to id' #290

Open skyNacho opened 3 years ago

skyNacho commented 3 years ago

Problem

After importing OSM data using osm2pgrouting, path calculations with pgRouting give an error: column "id" not found. Need to change by hand from "gid" in table "ways" to "id" to make it work.

To Reproduce

Just follow the instructions in the documentation (https://github.com/pgRouting/osm2pgrouting -> "how to use it" and https://docs.pgrouting.org/latest/en/pgRouting-concepts.html#getting-started)

Expectation

Following the instructions step by step, the process should work out of the box, without undocumented errors.

Sample Data

Platform/versions

PostgreSQL 11.11 PostGIS 3.1.1 pgRouting 3.1.0

cayetanobv commented 3 years ago

Yes, you are right. Some time ago the field name was id. This commit changed the name to avoid other problems: https://github.com/pgRouting/osm2pgrouting/commit/07f16346d0aa26a7fbf41818b0e1f89287fa4a89#diff-45d504039e209ed5a5031dc80d94a3a310eaccce17f1ce5ff57548bb02cb2040

Documentation should be changed in pgRouting docs. You could propose the change opening a PR.

Thanks @skyNacho

dkastl commented 3 years ago

It's correct that sometimes you see gid and sometimes id in spatial datasets, and we didn't really declare one as "default" in pgRouting, probably because pgRouting is supposed to work with any network data and the SQL text argument in pgrouting functions allows you to easily adjust the column name with gid AS id.

skyNacho commented 3 years ago

Understood. In that case I will close this issue and will improve the documentation to avoid confusion for new users.

cvvergara commented 3 years ago

I am opening again.

Yes, you are right. Some time ago the field name was id.

More long time ago was gid https://github.com/pgRouting/osm2pgrouting/blob/osm2pgrouting-2.0.0/src/Export2DB.cpp#L84

On the rewrite I changed it to id

This commit changed the name to avoid other problems:

I made that commit, the problems were that osm2pgrouting users were using with gid for many years before and it broke code So I had to turn the id to gid again :-(

The default on pgRouting's inner query is id for the edges identifier, and the standard name for geometry (PostGIS) is geom osm2pgrouting creates gid and the_geom, instead of id and geom Before I arrived to pgRouting the_geom was used also in PostGIS, but now it follows a standard which is geom, and that was not changed on osm2pgroutingand not changed on the topology functions of pgRouting

But back to the id name Examples: If id is used, then the following query can be used as inner query

SELECT * FROM ways;
SELECT * FROM pgr_dijkstra(`SELECT * FROM ways`, 1, 2)

if gid is used, then this is compulsory:

SELECT gid AS id, source, target, cost, reverse_cost FROM ways;
SELECT * FROM pgr_dijkstra(`SELECT gid AS id, source, target, cost, reverse_cost FROM ways`, 1, 2)

Maybe have a configuration file with column names.

As new users might want id to avoid doing gid AS id or to avoid renaming the column with ALTER TABLE For example, if the user wants to use identificador for the id column and desde for the source column, hasta for the target column for example then the inner queries would be: SELECT gid AS id, fuente AS source, hasta AS target ....

dkastl commented 3 years ago

Well, I don't know that well about the gid vs. id history ;-)

But somehow I like in our documentation that we always write like

SELECT * FROM pgr_dijkstra(`SELECT gid AS id, source, target, cost, reverse_cost FROM ways`, 1, 2)

Because it shows in a very simple example, that the column names in a network table can actually have different names, and to make them fit you just need to use ALIAS. I think as a PostgreSQL beginner I would have first renamed all my column names to match.

So I don't think it's possible to find a common default anymore, neither for the_geom vs. geom nor id vs gid. But I think that's OK. However, I think we can write our documentation better, and I find the feedback of @skyNacho very valuable, because many things that may be obvious for us, who know pgRouting for a very long time, are maybe mysterious or unclear for new users.

cvvergara commented 3 years ago

We dont use gid in the documentation: For example: https://docs.pgrouting.org/3.2/en/pgr_withPoints.html#one-to-one

SELECT * FROM pgr_withPoints(
    'SELECT id, source, target, cost, reverse_cost FROM edge_table ORDER BY id',
    'SELECT pid, edge_id, fraction, side from pointsOfInterest',
    -1, 3,
    details := true);
skyNacho commented 3 years ago

To me it was confusing to follow the installation documentation and hitting an error when pgr_dijsktra could not find the 'id' column. So, either all documentation is reviewed and updated to ensure examples work out-of-the-box, or the naming convention in osm2pgrouting is modified so that tables work out-of-the-box (this would require reviewing all documentation anyway).

I understand @dkastl 's point to showcase the flexibility of pgRouting with regards to column names. However, forcing all queries to use alias seems an over complication. My instinct would be to adopt PostGIS's naming as the standard. But I am unable to evaluate the implications in terms of documentation review, updates in production deployments, modifications in third-party software and so on.

cvvergara commented 3 years ago

@skyNacho What about to make it configurable?

cvvergara commented 3 years ago

From my perspective making it configurable, the pgRouting user opts to use names that are not by default on pgRouting

skyNacho commented 3 years ago

That could be a suitable solution. It could have a default that simplifies operation but provide flexibility for cases where the naming conventions are different for whatever reason.

Do you have an example of how the configuration would be like?

cvvergara commented 3 years ago

Currently we have xml for configuration https://github.com/pgRouting/osm2pgrouting/blob/osm2pgrouting-2.0.0/mapconfig.xml As I am not a front-end person, and there is a plan for a complete rewrite, I would need to opinions of json or xml for the rewrite

but following the xml format, to something like this db_config_default.xml

<?xml version="1.0" encoding="UTF-8"?>
<dbconfiguration>
  <column name="id"  alias="id">
  <column name="source"  alias="source">
  <column name="target"  alias="target">
  <column name="cost"  alias="cost">
  <column name="reverse_cost"  alias="reverse_cost">
  <column name="geom"  alias="geom">
  ....
</dbconfiguration>

So an old style user of som2pgrouting would change to gid and geom db_config_oldstyle.xml

<?xml version="1.0" encoding="UTF-8"?>
<dbconfiguration>
  <column name="id"  alias="gid">
  <column name="source"  alias="source">
  <column name="target"  alias="target">
  <column name="cost"  alias="cost">
  <column name="reverse_cost"  alias="reverse_cost">
  <column name="geom"  alias="the_geom">
  ....
</dbconfiguration>

maybe a spanish would create mi_configuracion_de_base_de_datos.xml

<?xml version="1.0" encoding="UTF-8"?>
<dbconfiguration>
  <column name="id"  alias="id">
  <column name="source"  alias="desde">
  <column name="target"  alias="hasta">
  <column name="cost"  alias="costo">
  <column name="reverse_cost"  alias="costo_regreso">
  <column name="geom"  alias="geometria">
  ....
</dbconfiguration>
skyNacho commented 3 years ago

I think it makes sense. So, the github repository would provide different xml options and one can decide which one to choose or even to adapt it if needed. I guess this change would not affect existing deployments of pgRouting, but only new OSM->pgRouting data imports with a new version of osm2pgrouting, right?

Regarding the question on json vs xml, although I tend to favor json, if xml is already in use for configuration I would stick to that.

cvvergara commented 3 years ago

Yes, it would require a new major version. because it would break code, maybe scripts that already exist, would need to be changed

skyNacho commented 3 years ago

Ok, so we should still review the documentation anyway to make sure gid AS id appears, at least in the introductory samples. I can take care of that.

skyNacho commented 3 years ago

I just sent a pull request with minimal changes to pgRouting concepts section. I believe that should be enough for new users to understand the use of aliases in pgRouting queries.