hove-io / navitia

The open source software to build cool stuff with locomotion
https://www.navitia.io/
GNU Affero General Public License v3.0

Issue in DB feed #2762

Closed linusnorton closed 5 years ago

linusnorton commented 5 years ago

The DB dataset contains many stops with no stop_code. In addition the names are overly short and often overlap. For example:

StopArea:OULCTP72,,Hauptbahnhof,,48.399745,9.984147,,,1,,,

There are many Hauptbahnhof stops within the feed.

Is it possible for them to be classified by city, and for every stop to have a stop_code?

Many thanks for your excellent service.

linusnorton commented 5 years ago

I guess the stop code is not important, but the stop names need improving. If you can point me at either the code or the data source, I can have a go at making a PR.

kinnou02 commented 5 years ago

Hi, sorry for the delay,

In the API, every stop_area should have a label that contains the name of the city. The current implementation is very French-centric, so only OSM administrative regions of level 8 are used. We are slowly migrating to cosmogony and mimirsbrunn.

The stop code is also present in the response, but it isn't currently part of the label. You can find it in the codes of a stop_area or stop_point as gtfs_stop_code. The stop code should probably be part of the stop's label; we will think about it, as this kind of change would impact all of our customers.
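As a rough sketch of where to look, assuming each stop_area or stop_point object in the response carries a `codes` list of `type`/`value` pairs (the helper name `find_code` and the sample payload, including the code value, are made up for illustration and are not taken from the live API):

```python
from typing import Optional


def find_code(stop_obj: dict, code_type: str = "gtfs_stop_code") -> Optional[str]:
    """Return the first code of the given type, or None if it is absent.

    Assumes `stop_obj["codes"]` is a list of {"type": ..., "value": ...} dicts.
    """
    for code in stop_obj.get("codes", []):
        if code.get("type") == code_type:
            return code.get("value")
    return None


# Illustrative payload only: the "value" below is a placeholder, not real data.
sample_stop_area = {
    "id": "stop_area:OUL:SA:CTP72",
    "name": "Hauptbahnhof",
    "codes": [{"type": "gtfs_stop_code", "value": "EXAMPLE"}],
}

print(find_code(sample_stop_area))
```

A stop without any code of that type simply yields `None`, which matches the original report that many stops have no stop_code.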

linusnorton commented 5 years ago

Thanks, hopefully the migration to cosmogony and mimirsbrunn will help with the stop names.

pbougue commented 5 years ago

Also, note that those changes (cosmogony, mimirsbrunn, and maybe adding the stop code to the label) would only impact the API's output, not the NTFS (or GTFS) itself. NTFS (or GTFS) files are inputs to Navitia, and we do not plan to change them for these issues in the short term. If you are using the GTFS directly, you would have to do some processing yourself.
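For anyone doing that processing on the GTFS side, here is a minimal sketch of one possible first step: listing the stop names that are shared by several stops, then relabelling them with a city. It assumes a standard `stops.txt` with `stop_id` and `stop_name` columns. The function names, the second and third sample rows, and the "name (city)" label format are assumptions for illustration; how the city is obtained (for example by reverse geocoding) is left out.

```python
import csv
import io
from collections import defaultdict


def ambiguous_stop_names(stops_csv: str) -> dict:
    """Map each stop_name used by more than one stop to the list of its stop_ids."""
    by_name = defaultdict(list)
    for row in csv.DictReader(io.StringIO(stops_csv)):
        by_name[row["stop_name"].strip()].append(row["stop_id"])
    return {name: ids for name, ids in by_name.items() if len(ids) > 1}


def relabel(stop_name: str, city: str) -> str:
    """Build a 'name (city)' label; fall back to the bare name if city is empty."""
    return f"{stop_name} ({city})" if city else stop_name


# First row is the stop quoted in this issue; the other two rows are made up.
sample = """stop_id,stop_name,stop_lat,stop_lon
StopArea:OULCTP72,Hauptbahnhof,48.399745,9.984147
StopArea:EXAMPLE2,Hauptbahnhof,50.0,8.0
StopArea:EXAMPLE3,Rathaus,50.1,8.1
"""

print(ambiguous_stop_names(sample))
print(relabel("Hauptbahnhof", "Ulm"))
```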

We are internally weighing the pros and cons to identify the best approach to label processing, but have no estimated date on the subject.

I see no special point to track here, so I'm closing this issue. Feel free to comment or reopen it if necessary or if you have a special use-case to detail :)

linusnorton commented 5 years ago

@pbougue ah okay, do you know what is generating the GTFS? I can raise the issue there

pbougue commented 5 years ago

@linusnorton Then that's a "data" issue.

I dug a bit into it, and it looks like it's linked to the merging of stop_areas from the original datasets. For example, in Ulm, we have the stop_area you mentioned, which comes from the urban dataset: https://api.navitia.io/v1/coverage/de/stop_areas/stop_area%3AOUL%3ASA%3ACTP72 And the one from the DB dataset: https://api.navitia.io/v1/coverage/de/stop_areas/stop_area%3AOBF%3ASA%3ACTP8000170 Merging them should help with your issue.
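To make the idea of merging concrete, here is a hedged sketch of how merge candidates could be spotted by geographic proximity. This is not the data team's tool or method, only an illustration: the 300 m threshold, the function names, and the coordinates of the second and third entries are assumptions (only the first coordinates come from the row quoted at the top of this issue).

```python
import math


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def merge_candidates(areas: list, max_distance_m: float = 300.0) -> list:
    """Return pairs of stop_area ids that lie within max_distance_m of each other."""
    pairs = []
    for i, a in enumerate(areas):
        for b in areas[i + 1:]:
            if haversine_m(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_distance_m:
                pairs.append((a["id"], b["id"]))
    return pairs


areas = [
    {"id": "stop_area:OUL:SA:CTP72", "lat": 48.399745, "lon": 9.984147},
    # Placeholder coordinates: the real position of this stop_area is not given here.
    {"id": "stop_area:OBF:SA:CTP8000170", "lat": 48.3994, "lon": 9.9829},
    # Entirely hypothetical far-away stop, included to show a non-match.
    {"id": "stop_area:EXAMPLE:FAR", "lat": 52.5, "lon": 13.4},
]

print(merge_candidates(areas))
```

A real merge would probably also compare names and transport modes, since two distinct stops can sit close together.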

I checked with our data team: no merging is currently done on the stop_areas. They are about to switch to a new tool for data pre-processing, and it should improve merging, although it will probably not be perfect on the first run.

Please note that we track data issues on our Google Group (GitHub is more for API development). Our data team plans to communicate there when they migrate coverages to the new tool: https://groups.google.com/forum/#!forum/navitia

linusnorton commented 5 years ago

Thanks