hove-io / navitia

The open source software to build cool stuff with locomotion
https://www.navitia.io/
GNU Affero General Public License v3.0

Coverage shape : how is it computed ? #3754

Closed christopheblin closed 2 years ago

christopheblin commented 2 years ago

Describe the bug: When requesting a coverage, navitia returns a shape. How is it computed?

I'd like to know which data are used and whether there may be some "leak" between 2 coverages

Here are 2 shapes of 2 coverages we have in our platform

[Screenshot 2022-05-24 at 14:16:03]

However, all the places (both pois and stops) we have in the first coverage are NOT near Lille

I checked with the following query and a 400 km range (the red point is approximate):

/v1/coverage/c1/coord/3.07630%3B50.64114/places_nearby?distance=400000&count=5000&
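
For reference, a minimal Python sketch of how such a query path can be assembled. The function name and its parameters are illustrative, not part of navitia; the only facts taken from the query above are that the `coord` segment is `lon;lat` with the `;` percent-encoded as `%3B`, and that `distance` and `count` are query parameters.

```python
from urllib.parse import quote, urlencode


def places_nearby_path(coverage, lon, lat, distance_m, count):
    """Build a places_nearby path (hypothetical helper for illustration)."""
    # The coordinate segment is "lon;lat"; quote() turns ';' into %3B.
    coord = quote(f"{lon:.5f};{lat:.5f}", safe="")
    query = urlencode({"distance": distance_m, "count": count})
    return f"/v1/coverage/{coverage}/coord/{coord}/places_nearby?{query}"


# Point near Lille, 400 km radius, up to 5000 results.
print(places_nearby_path("c1", 3.07630, 50.64114, 400000, 5000))
```

Running the same path against each coverage in turn makes it easy to compare which objects each one actually contains around a given point.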

[Screenshot 2022-05-24 at 15:05:57]

And in the first coverage, there are no points near Switzerland

And in the second coverage, there are no points near Switzerland either (but we have lots of points near Lille in this case)

-> why does the shape of the first coverage include some points near Belgium / Lille / Switzerland ? and why does the shape of the second coverage include some points near Switzerland ?

could it be a "leak" from another coverage ? if so, is there a way to diagnose that problem ?

To Reproduce: Cannot reproduce on navitia.io, which is why this is more a question than a bug

Expected behavior: Understand why the coverage shape is not around the places in the pbf / ntfs


SGrenet commented 2 years ago

Hello,

I'll try to explain:

The coverage's shape is only used to define the coverage area (informative or visual use) and for a few other uses.

Otherwise, PT objects are taken from the ntfs for stops and from poi.poi for POIs.

For example, our South-West open coverage has stop_areas outside the coverage's shape (we need train stations for long-distance trips) and urban ones inside the shape (preferably, but it is not an obligation). Same for POIs

NTFS.zip files:
try: http://playground.navitia.io/play.html?request=http%3A%2F%2Fapi.navitia.io%2Fv1%2Fcoverage%2Ffr-sw%2Fcoord%2F3.07630%253B50.64114%2Fplaces_nearby%3F -> we have 3 train stations
vs North East: http://playground.navitia.io/play.html?request=http%3A%2F%2Fapi.navitia.io%2Fv1%2Fcoverage%2Ffr-ne%2Fcoord%2F3.07630%253B50.64114%2Fplaces_nearby%3F -> we have 170 stops (train, metro, tram and bus)

POI.poi files:
But no POIs (by choice) near Lille with the SW quarter: http://playground.navitia.io/play.html?request=http%3A%2F%2Fapi.navitia.io%2Fv1%2Fcoverage%2Ffr-sw%2Fcoord%2F3.07630%253B50.64114%2Fplaces_nearby%3Fdistance%3D100000%26type%255B%255D%3Dpoi%26

OSM.pbf files: for streets, OSM extracts are filtered before pbf import with the same shape/poly or a different one

Administrative_regions are extracted if at least one object is present

=> All objects are extracted beforehand

shape.poly files: if your question is about automatic shape calculation, it's a good question and I don't have the answer. We add our own shapes, like other objects, with .poly files, and we also have some coverages with no shape. So I don't think we have automatic shape calculation as a feature, or at least I don't know of its existence

Have you never uploaded a .poly file before?

Sylvain

christopheblin commented 2 years ago

@SGrenet thanks for the detailed explanation, I really appreciate it

I can confirm that I only uploaded a coverage.osm.pbf + coverage.zip to tyr, so it seems there IS some kind of automatic shape calculation somewhere

I did reproduce the problem locally on a fresh install of navitia with only one of the coverages

  1. it does not seem to be related to a coverage leak since I have a single coverage
  2. it seems there is a problem in the automatic shape calculation

Could you give me a pointer to where the automatic calculation happens ?

Could you point me to the doc that explains how to provide my own poly file for a coverage ? (based on your explanation, I understand it IS possible but I can't find how)

More details about local repro

The pbf is from geofabrik.de in europe/france/bretagne-latest.osm.pbf

The coverage.zip (I cannot remember off the top of my head how we create it, but the data comes from data.gouv.fr) : coverage.zip

NAVITIA_VERSION=15.1.1

% curl -F file=@../../Downloads/coverage.osm.pbf -X POST http://localhost:81/v0/jobs/default
{"message": "OK"}
% curl -F file=@../../Downloads/coverage.zip -X POST http://localhost:81/v0/jobs/default
{"message": "OK"}

[Screenshot 2022-05-25 at 08:59:16]

The stop_points map (all points are included thanks to count=4000)

[Screenshot 2022-05-25 at 09:18:01]

SGrenet commented 2 years ago

Hello,

OK, you're right, there is an automatic shape calculation. Thanks for discovering it 👍

How is it calculated without shape_file.poly ?
  - the shape is stored in the DB, in table navitia.parameters, column shape
  - a DB function creates the shape from table georef.node
  - the result is probably boundaries derived from OSM

Try opening your bretagne.pbf in QGIS to check it.

Overriding the shape in navitia.parameters: see tyr/binarisation.py line 517. tyr processes poly_file.poly and overrides the shape column of table navitia.parameters

shape_file.poly files are .poly exports from JOSM. A sample : sample_poly_file.poly.txt

Explore your tyr : {tyr_url}/v0/instances/c1/last_datasets. A sample with pbf, ntfs, poi and poly : last_datasets.json.txt

Internally we always add a poly_file.poly, so we forgot about the auto-shape

Sylvain

christopheblin commented 2 years ago

@SGrenet thanks for the confirmation (and glad about the discovery 😄 ) !

I tried to open the file in QGIS but the file seems too big ... 😞 -> do you have another idea for checking the file ?

otherwise, uploading a poly file to tyr does not seem to work for me (see logs below)

logs of failing poly file in tyr worker

[2022-05-25 13:44:19,984] [ INFO] [    1] [   celery.worker.strategy] Received task: tyr.binarisation.shape2ed[d9e5d1f3-e78f-46d6-ad74-75ad794e59c4]  
[2022-05-25 13:44:19,988] [DEBUG] [   17] [                     root] args: (<@task: tyr.binarisation.shape2ed of tyr at 0x7fa6a24b9490>, <tyr.helper.InstanceConfig object at 0x7fa69bf51650>, '/home/tyr/ed/backup/default/20220525-134419959433/coverage.poly') -- kwargs: {'dataset_uid': '6bef2708-d106-48f4-b9ce-6dbd1decb4e8', 'job_id': 4}
[2022-05-25 13:44:19,997] [DEBUG] [   17] [         instance.default] lock acquired on default for shape2ed
[2022-05-25 13:44:19,999] [DEBUG] [   17] [         tyr.binarisation] Retrieved dataset: 4
[2022-05-25 13:44:20,007] [ INFO] [   17] [                     root] loading bounding shape for default from = /home/tyr/ed/backup/default/20220525-134419959433/coverage.poly
[2022-05-25 13:44:20,007] [ INFO] [   17] [                     root] loading bounding shape for default from = /home/tyr/ed/backup/default/20220525-134419959433/coverage.poly
[2022-05-25 13:44:20,073] [DEBUG] [   17] [         instance.default] release lock on default for shape2ed
[2022-05-25 13:44:20,076] [ INFO] [    1] [   celery.worker.strategy] Received task: tyr.binarisation.ed2nav[3ce57fc8-b169-4dac-909f-b948ac3e22fa]  
[2022-05-25 13:44:20,076] [ INFO] [   17] [         celery.app.trace] Task tyr.binarisation.shape2ed[d9e5d1f3-e78f-46d6-ad74-75ad794e59c4] succeeded in 0.0879292460013s: None
[2022-05-25 13:44:20,080] [DEBUG] [   14] [                     root] args: (<@task: tyr.binarisation.ed2nav of tyr at 0x7fa6a24b9490>, <tyr.helper.InstanceConfig object at 0x7fa69c1c1a90>, 4, None) -- kwargs: {}
[2022-05-25 13:44:20,088] [DEBUG] [   14] [         instance.default] lock acquired on default for ed2nav
[2022-05-25 13:44:20,088] [ INFO] [   14] [         instance.default] Launching ed2nav -o /home/tyr/ed/output/default.nav.lz4 --connection-string xxxxxxxxx --cities-connection-string xxxxxxxxx --local_syslog --log_comment default
[2022-05-25 13:45:42,624] [ERROR] [   14] [         instance.default] 
Traceback (most recent call last):
  File "/usr/src/app/tyr/binarisation.py", line 619, in ed2nav
    raise ValueError('ed2nav failed')
ValueError: ed2nav failed
[2022-05-25 13:45:42,667] [DEBUG] [   14] [         instance.default] release lock on default for ed2nav
[2022-05-25 13:45:42,675] [ERROR] [   14] [         celery.app.trace] Task tyr.binarisation.ed2nav[3ce57fc8-b169-4dac-909f-b948ac3e22fa] raised unexpected: ValueError('ed2nav failed',)
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/celery/app/trace.py", line 374, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/src/app/tyr/helper.py", line 98, in __call__
    return TaskBase.__call__(self, *args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/celery/app/trace.py", line 629, in __protected_call__
    return self.run(*args, **kwargs)
  File "/usr/src/app/tyr/binarisation.py", line 165, in wrapper
    return func(*args, **kwargs)
  File "/usr/src/app/tyr/binarisation.py", line 619, in ed2nav
    raise ValueError('ed2nav failed')
ValueError: ed2nav failed
SGrenet commented 2 years ago

@christopheblin

For information, I opened the geofabrik.de pbf (europe/france/bretagne-latest.osm.pbf) in QGIS, not completely, but enough to understand your screenshots : some objects are present near Lille and Lyon, maybe the ones we use in navitia (the screenshot won't upload)

In our data pipeline, osm.pbf files are re-shaped before they reach tyr

As for your error, it's difficult to debug without files and context. Did you use bretagne.poly from geofabrik ?

christopheblin commented 2 years ago

@SGrenet To reproduce the bug,

-> everything works except the last ed2nav; the shape of the coverage is still the old one

If you need more details, don't hesitate to ask

christopheblin commented 2 years ago

@SGrenet as a workaround, I "restricted" the pbf before sending it to tyr with osmconvert

osmconvert /data/provence-alpes-cote-d-azur-latest.osm.pbf -b=3.8540346036,42.9868292598,7.8320793509,45.28323205 --out-pbf --statistics > coverage.osm.pbf

I don't upload the poly file and navitia auto-computes a shape that seems correct

I still think there is a problem when uploading the poly file after the osm and ntfs
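
The `-b=` argument of osmconvert above is ordered min lon, min lat, max lon, max lat. As a rough sketch of how such a box could be derived from a coverage's own points rather than typed by hand, assuming the stop coordinates have already been read as `(lon, lat)` pairs (the helper name, the margin and the sample coordinates are all hypothetical):

```python
def osmconvert_bbox(points, margin_deg=0.1):
    """Return an osmconvert -b= argument enclosing all (lon, lat) points.

    Hypothetical helper: margin_deg pads the box so streets around the
    outermost stops are kept in the extract.
    """
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    box = (
        min(lons) - margin_deg,  # west
        min(lats) - margin_deg,  # south
        max(lons) + margin_deg,  # east
        max(lats) + margin_deg,  # north
    )
    return "-b=" + ",".join(f"{v:.6f}" for v in box)


# Made-up stop coordinates, for illustration only.
stops = [(5.25, 43.5), (7.25, 43.75), (6.0, 44.5)]
print(osmconvert_bbox(stops, margin_deg=0.5))
```

A rectangular cut like this is blunt, so the margin is worth keeping generous.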

pbougue commented 2 years ago

Hello,

As far as I remember, Navitia auto-computes a convex envelope (convex hull) from OSM's data, which seemed to be a correct computation from what @SGrenet checked. I also recall that stops are not used, as we regularly add airports from far away but don't want them in the shape. I'm curious why the bretagne pbf contains data from so far away, but that is an issue on OSM's side.
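
To illustrate why a few stray nodes are enough to stretch the shape: a convex hull must contain every input point, so a single far-away node pulls the boundary all the way out to it. The following is a minimal pure-Python sketch using Andrew's monotone chain algorithm. It is not navitia's actual DB function, and the coordinates are made up for illustration.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); > 0 means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull of 2D points, counter-clockwise (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


# Illustrative (lon, lat) nodes roughly in Brittany, plus one stray node near Lille.
brittany = [(-4.5, 48.4), (-1.7, 48.1), (-2.8, 47.7), (-3.0, 48.5)]
stray = (3.08, 50.64)
hull = convex_hull(brittany + [stray])
print(stray in hull)
```

The stray node ends up as a hull vertex, so the resulting shape spans everything between Brittany and Lille even though almost all the data sits in the west.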

So auto-computation works as expected, and the only problem you seem to have is the ingestion of poly file, if I understand you correctly.

We tried ingesting the poly file from bretagne on our servers, and it worked, so there doesn't seem to be an issue with that.

Note: if you restrict osm files a bit abruptly, you might end up having issues with incorrect administrative regions, so be careful (as far as I remember it shouldn't result in a crash though). Also, maybe try using a very simple poly file (square) to debug if an issue arises.

I will close the issue as the first question is covered, and it doesn't seem possible to reproduce a bug on our side, but feel free to reopen it with more details and step-by-step instructions from scratch to reproduce it.
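
A very simple square poly file can be generated along these lines. This is a sketch assuming the Osmosis polygon filter layout used by Geofabrik's .poly downloads (a name line, a section line, one `lon lat` pair per line, then two `END` lines); whether tyr accepts every variant of that layout is not confirmed here, and the helper name and coordinates are illustrative.

```python
def square_poly(name, min_lon, min_lat, max_lon, max_lat):
    """Return the text of a rectangular .poly file (hypothetical helper)."""
    corners = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),  # repeat the first corner to close the ring
    ]
    lines = [name, "1"]  # file name line, then the name of the only section
    lines += [f"   {lon:.6f}   {lat:.6f}" for lon, lat in corners]
    lines += ["END", "END"]  # end of section, end of file
    return "\n".join(lines) + "\n"


# Rough, made-up box around Brittany, for illustration only.
with open("coverage.poly", "w") as f:
    f.write(square_poly("coverage", -5.25, 47.25, -1.0, 49.0))
```

If the square uploads cleanly but the real poly file does not, the problem is more likely in the file's contents than in the upload order.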