derhuerst / db-stations

A list of Deutsche Bahn stations.
ISC License

stations in the HAFAS API, but missing in db-stations #6

Open derhuerst opened 7 years ago

derhuerst commented 7 years ago

When querying routes from 8011167 to 8000261, I get the following station as start location:

{
  type: 'station',
  name: 'Berlin Jungfernheide',
  latitude: 52.530408,
  longitude: 13.299424,
  id: 8011167,
  platform: '4'
}

The id 8011167 does not occur in db-stations@1.0.0, whose data is taken from the stations API.
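
To make the check above reproducible, here is a minimal sketch in plain JavaScript. `findMissingStations` is a hypothetical helper, not part of db-stations or of any HAFAS client, and the `fromDbStations` array is a made-up stand-in for the real station list. It only assumes that both sides expose an `id` field, as the object above does.

```javascript
// Hypothetical helper, not part of db-stations or any HAFAS client.
// Returns the stations from `hafasStations` whose id is absent from `knownStations`.
function findMissingStations(hafasStations, knownStations) {
  // Compare ids as strings, in case one source uses numbers and the other strings.
  const knownIds = new Set(knownStations.map((s) => String(s.id)));
  return hafasStations.filter((s) => !knownIds.has(String(s.id)));
}

// Start location as returned by the routing query above.
const fromHafas = [
  { type: 'station', name: 'Berlin Jungfernheide', id: 8011167 }
];

// Stand-in for the db-stations@1.0.0 list; this single entry is invented for illustration.
const fromDbStations = [
  { name: 'Example station', id: 1 }
];

const missing = findMissingStations(fromHafas, fromDbStations);
console.log(missing.map((s) => s.id)); // [ 8011167 ]
```

Running this over every station that shows up in routing results would give a list of ids to report, rather than one id at a time.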

PS: @lightsprint09 @highsource @juliuste any idea? :wink: In the long term, I'd guess the whole open data community would appreciate it if the API really contained all stations that DB provides routes for.

derhuerst commented 7 years ago

db-stations@0.4.0 contains the following entry with this id:

{
  "ds100": "BJUF",
  "nr": 3067,
  "name": "Berlin Jungfernheide",
  "zip": "10589",
  "city": "Berlin",
  "state": "BE",
  "id": 8011167,
  "latitude": 52.530276,
  "longitude": 13.299437
}

More specifically:

curl -sL 'http://download-data.deutschebahn.com/static/datasets/haltestellen/D_Bahnhof_2016_01_alle.csv' | grep 8011167
# 8011167;BJUF;Berlin Jungfernheide;RV;13.299437;52.530276;;
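
The same lookup can be sketched in JavaScript. The CSV header is not shown in this thread, so the column names below are assumptions inferred by matching the sample row against the db-stations@0.4.0 entry above; the fourth column (`RV`) is labelled `type` purely as a guess. Both function names are hypothetical.

```javascript
// Column order inferred from the sample row above; the real CSV header is not
// shown in this thread, so these field names are assumptions.
const COLUMNS = ['id', 'ds100', 'name', 'type', 'longitude', 'latitude'];

// Parses one semicolon-separated row into a station object.
function parseStationRow(row) {
  const cells = row.split(';');
  const station = {};
  COLUMNS.forEach((key, i) => { station[key] = cells[i]; });
  station.id = Number(station.id);
  station.longitude = Number(station.longitude);
  station.latitude = Number(station.latitude);
  return station;
}

// Returns the first row whose id column equals `id`, or null.
// Unlike grep, this only matches the id column, not any text in the line.
function findStationById(csvText, id) {
  for (const line of csvText.split('\n')) {
    const row = line.trim();
    if (row === '') continue; // skip blank lines
    const station = parseStationRow(row);
    if (station.id === id) return station;
  }
  return null;
}

const sample = '8011167;BJUF;Berlin Jungfernheide;RV;13.299437;52.530276;;';
const station = findStationById(sample, 8011167);
console.log(station.name, station.latitude, station.longitude);
// Berlin Jungfernheide 52.530276 13.299437
```

A header row, if present, is skipped implicitly because its first cell does not parse to a matching number.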

highsource commented 7 years ago

I've forwarded this to someone in DB S&S.

voland10557 commented 7 years ago

I don't get what exactly the problem is, but here's some background info:

DB S&S has around 5400 train stations ("Bahnhöfe" + "Haltepunkte"): https://data.deutschebahn.com/dataset/data-stationsdaten

This list contains around 6600 stops. It includes tram stops, bus stops, etc., NOT only train stations, so it's not really Deutsche Bahn data (and I bet DB isn't keeping it up to date): https://data.deutschebahn.com/dataset/data-haltestellen Please see Verkehrsrot's post in the comment section for more info.

So if you want to work with train stations data (DB S&S stations), you should work with the static "data-stationsdaten" or StaDa-API: https://developer.deutschebahn.com/store/apis/info?name=StaDa-Station_Data&version=v2&provider=DBOpenData

And, of course, there is the data on DB RegioNetz Infrastruktur (RNI) stations, which are not part of DB S&S: https://data.deutschebahn.com/dataset/data-stationsdaten-regio

derhuerst commented 7 years ago

@voland10557 Thanks for the explanation!

Let me first outline the (ideological) standpoint of a non-DB programmer/user:

<rant>

I'm trying to work with DB data. I, quite frankly, don't care which subdivision of Deutsche Bahn is responsible for which stations. I just want to consume the data. I don't intend to guess the dozens of DB internal abbreviations.

As someone from the outside, I can't understand why the local public transport data (buses, local trains, etc.) is outdated, especially since Deutsche Bahn, AFAIR, tries to cover it with routing information. So if DB covers an area/agency, its job is to have correct data about it.

DB has, in my experience, a pretty long record of using technical & legal excuses for providing a subpar user experience, both to end users & to programmers. In the context of making long-distance public transport more convenient, to ultimately sell more tickets, it would make sense to overcome these limitations.

Just like me, the 300 other community programmers & 200 other companies out there don't intend to figure out all the weird abbreviations. They don't intend to stitch together 3 inconsistent datasets. They don't intend to make their programs extra fault-proof for cases like IDs or names missing completely.

</rant>

Now the perspective of me, someone more motivated and technically involved:

derhuerst commented 7 years ago

My intention is to have a module db-stations that contains every station that DB has routing information for or, as in the case above, returns when queried for a different station. It would probably be useful to have flags for the following properties:

derhuerst commented 7 years ago

@tursics

derhuerst commented 6 years ago

FYI, this issue still persists.