openplannerteam / itinero-transit

A transit routing module for Itinero using Linked Connections.

Stations are overwritten in transitdb #46

Closed pietervdvn closed 5 years ago

pietervdvn commented 5 years ago

When inspecting the transitDB, it turns out that some stations are mentioned twice - each time with slightly different coordinates.

The data is:

http://irail.be/stations/NMBS/008722326 50.6361112310626 3.07151556015015 (90300555 0) Lille Europe   Rijsel-Europa    
http://irail.be/stations/NMBS/008722326 50.6333346759452 3.06666612625122 (90300555 1) Lille Europe   Rijsel-Europa    

There is however only one entry in the upstream data with this ID:

50  
@id "http://irail.be/stations/NMBS/008722326"
alternative […]
avgStopTimes "0"
country "http://sws.geonames.org/3017382/"
http://www.w3.org/2003/01/geo/wgs84_pos#lat "50.636108"
http://www.w3.org/2003/01/geo/wgs84_pos#long "3.071516"
name "Lille Europe"

The other coordinates can be found in the upstream data too. They turn out to be the coordinates of Lille Flandres, a station that is itself missing from the database dump:

54  
@id "http://irail.be/stations/NMBS/008728600"
alternative […]
avgStopTimes "62.412698"
country "http://sws.geonames.org/3017382/"
http://www.w3.org/2003/01/geo/wgs84_pos#lat "50.633333"
http://www.w3.org/2003/01/geo/wgs84_pos#long "3.066669"
name "Lille Flandres"
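A check along the following lines could flag such entries in a locations dump. It is only a sketch: the function name `find_conflicting_ids` is hypothetical, and it assumes each dump row has already been reduced to an `(id, lat, lon)` tuple, since the exact dump format is not specified here.

```python
from collections import defaultdict


def find_conflicting_ids(rows):
    """Return {station_id: sorted coordinates} for IDs seen with more than one coordinate.

    `rows` is an iterable of (station_id, lat, lon) tuples (assumed shape).
    """
    coords_by_id = defaultdict(set)
    for station_id, lat, lon in rows:
        coords_by_id[station_id].add((lat, lon))
    # Keep only the IDs that map to two or more distinct coordinates.
    return {sid: sorted(coords) for sid, coords in coords_by_id.items() if len(coords) > 1}


# The two dump rows quoted above, reduced to (id, lat, lon).
rows = [
    ("http://irail.be/stations/NMBS/008722326", 50.6361112310626, 3.07151556015015),
    ("http://irail.be/stations/NMBS/008722326", 50.6333346759452, 3.06666612625122),
]
conflicts = find_conflicting_ids(rows)
for station_id, coords in conflicts.items():
    print(station_id, coords)
```

Running this over the full dump would list every ID that occurs with more than one location, which should match the rows in the spreadsheet.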

Full database

A spreadsheet of the full database can be found here:

locations-duplicates.xlsx

(Sadly, GitHub does not support .ods; perhaps because of the new owners?)

pietervdvn commented 5 years ago

To reproduce:

# In the IDP-repo
git checkout features/transit
# In the source dir
dotnet run --create-transit-db https://graph.irail.be/sncb/connections https://irail.be/stations/NMBS duration=0 --dump-locations > locations.csv

pietervdvn commented 5 years ago

It seems this only happens with stops that are physically close to another stop; Brussel-Kapellekerk is another example, which would explain the overly high popularity of that station.
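To put a number on "physically close", the great-circle distance between the two upstream coordinates quoted in the issue description can be estimated with the haversine formula. This is a minimal sketch: the function name and the mean Earth radius of 6,371,000 m are choices made for illustration, not anything taken from the itinero-transit code. For Lille Europe and Lille Flandres it gives roughly 460 m.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (assumed constant)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Upstream coordinates of the two stations from the issue description.
lille_europe = (50.636108, 3.071516)
lille_flandres = (50.633333, 3.066669)

distance = haversine_m(*lille_europe, *lille_flandres)
print(f"{distance:.0f} m")
```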

xivk commented 5 years ago

Ah this is interesting, this could be a critical bug causing some of what we're seeing! Good job figuring this out! :+1:

pietervdvn commented 5 years ago

I have added a unit test reproducing the behaviour:

https://github.com/openplannerteam/itinero-transit/blob/cf726d9584fd9cd30345178daf4fe93970fa2722/test/Itinero.Transit.Tests/Data/StopsDbTests.cs#L151

pietervdvn commented 5 years ago

Try changing one of the coordinates. If the two stations are farther apart, the test passes.

pietervdvn commented 5 years ago

The unit test has been moved to the branch bugfixes/stations-46, as it was blocking the build.