motis-project / motis

Intermodal Mobility Information System
https://motis-project.de
MIT License

How to Troubleshoot Bus error with German DELFI GTFS? #400

Status: Closed (closed by 1Maxnet1 1 year ago)

1Maxnet1 commented 1 year ago

Hi, I wanted to set up a MOTIS instance covering Germany with the DELFI GTFS data and the German OSM data. However, when I start it, it fails with a bus error at the step where the nigiri module is running. As no further information is given, I do not know how to troubleshoot this. There is also no core file created. Do you have any hints?

The server is running in an Ubuntu 22.04 LXC container. If you need any further details, let me know. Thanks in advance.
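
A missing core file is often caused by limits rather than by the crash itself. The following is a minimal sketch for checking the relevant settings on Linux; the function name `core_dump_status` is made up for illustration. Inside an LXC container, `/proc/sys/kernel/core_pattern` is typically controlled by the host kernel, so it may point to a handler that does not exist inside the container.

```shell
# Hypothetical helper: print the settings that decide whether a core file is written.
core_dump_status() {
  # Soft limit for the core file size in this shell; 0 means no core file is written.
  echo "core size limit: $(ulimit -c)"
  # Where the kernel writes core files; a leading '|' means they are piped to a program.
  if [ -r /proc/sys/kernel/core_pattern ]; then
    echo "core pattern: $(cat /proc/sys/kernel/core_pattern)"
  else
    echo "core pattern: unavailable"
  fi
}

core_dump_status
# To allow core files for processes started from this shell:
#   ulimit -c unlimited
```

If the limit is 0, raising it in the same shell before starting MOTIS is the first thing to try.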

1Maxnet1 commented 1 year ago

Ah, I just re-ran it and now it says:

terminate called after throwing an instance of 'std::bad_cast'
  what():  std::bad_cast
Aborted

That does not help me, but it may help you.

felixguendling commented 1 year ago

Can you post the config.ini + system specs please?

1Maxnet1 commented 1 year ago

> Can you post the config.ini + system specs please?

Content of the config.ini:

modules=intermodal
modules=ppr
modules=parking
modules=osrm
modules=address
modules=nigiri
modules=tiles

intermodal.router=nigiri
server.static_path=/usr/bin/motis-de/web

[tiles]
profile=/usr/bin/motis-de/motis/tiles-profiles/background.lua

[dataset]
no_schedule=true

[import]
data_dir=/data/motis-de
paths=schedule-germany:/data/motis-de/gtfs
paths=osm:/data/motis-de/germany-latest.osm.pbf

[osrm]
profiles=/usr/bin/motis-de/motis/osrm-profiles/car.lua
profiles=/usr/bin/motis-de/motis/osrm-profiles/bike.lua
profiles=/usr/bin/motis-de/motis/osrm-profiles/bus.lua

[nigiri]
first_day=TODAY
num_days=2
default_timezone=Europe/Berlin

Output of inxi -Fxz:

buchholzm@motis-de:~$ inxi -Fxz
System:
  Kernel: 6.2.16-3-pve x86_64 bits: 64 compiler: gcc v: 12.2.0 Console: pty pts/3
    Distro: Ubuntu 22.04.3 LTS (Jammy Jellyfish)
Machine:
  Type: Desktop Mobo: ASRock model: B450 Pro4 serial: <superuser required>
    UEFI: American Megatrends v: P8.02 date: 02/06/2023
CPU:
  Info: quad core model: AMD Ryzen 5 3400G with Radeon Vega Graphics bits: 64 type: MT MCP
    arch: Zen/Zen+ note: check rev: 1 cache: L1: 384 KiB L2: 2 MiB L3: 4 MiB
  Speed (MHz): avg: 2730 high: 3700 min/max: 1400/3700 boost: enabled cores: 1: 2326 2: 2886
    3: 2930 4: 2883 5: 1244 6: 2950 7: 2923 8: 3700 bogomips: 29599
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
  Device-1: AMD Picasso/Raven 2 [Radeon Vega Series / Radeon Mobile Series] driver: amdgpu
    v: kernel bus-ID: 0b:00.0
  Display: server: No display server data found. Headless machine? tty: 206x57
    resolution: 1920x1080
  Message: GL data unavailable in console. Try -G --display
Audio:
  Device-1: AMD Raven/Raven2/Fenghuang HDMI/DP Audio driver: snd_hda_intel v: kernel
    bus-ID: 0b:00.1
  Device-2: AMD Family 17h HD Audio vendor: ASRock driver: snd_hda_intel v: kernel
    bus-ID: 0b:00.6
  Sound Server-1: ALSA v: k6.2.16-3-pve running: yes
Network:
  Device-1: Realtek RTL8125 2.5GbE driver: r8169 v: kernel port: e000 bus-ID: 05:00.0
  Device-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet vendor: ASRock driver: r8169
    v: kernel port: c000 bus-ID: 08:00.0
  IF-ID-1: bonding_masters state: N/A speed: N/A duplex: N/A mac: N/A
  IF-ID-2: eth0 state: up speed: 10000 Mbps duplex: full mac: <filter>
Drives:
  Local Storage: total: 5.91 TiB used: 47.48 GiB (0.8%)
  ID-1: /dev/nvme0n1 vendor: Crucial model: CT1000P3PSSD8 size: 931.51 GiB temp: 33.9 C
  ID-2: /dev/nvme1n1 vendor: Crucial model: CT1000P3SSD8 size: 931.51 GiB temp: 26.9 C
  ID-3: /dev/sda vendor: Samsung model: SSD 860 size: 465.76 GiB
  ID-4: /dev/sdb vendor: Western Digital model: WD40EZRZ-00G size: 3.64 TiB
Partition:
  ID-1: / size: 9.75 GiB used: 1.08 GiB (11.1%) fs: ext4 dev: /dev/pve-vm--111--disk--1
Swap:
  ID-1: swap-1 type: partition size: 8 GiB used: 13 MiB (0.2%) dev: N/A
Sensors:
  System Temperatures: cpu: N/A mobo: N/A gpu: amdgpu temp: 35.0 C
  Fan Speeds (RPM): N/A
Info:
  Processes: 24 Uptime: 16d 29m Memory: 29.29 GiB used: 79.5 MiB (0.3%) Init: systemd runlevel: 5
  Compilers: gcc: N/A Packages: 426 Shell: Bash v: 5.1.16 inxi: 3.3.13

felixguendling commented 1 year ago

I downloaded the latest release and the datasets referenced by you but I cannot reproduce your problem.

Since your setup seems to have the MOTIS files distributed over the whole system (and I don't want to clutter my system like that), I needed to change some paths:

modules=intermodal
modules=ppr
modules=parking
modules=osrm
modules=address
modules=nigiri
modules=tiles

intermodal.router=nigiri
server.static_path=motis/web

[tiles]
profile=motis/tiles-profiles/background.lua

[dataset]
no_schedule=true

[import]
paths=schedule-germany:20231002_fahrplaene_gesamtdeutschland_gtfs.zip
paths=osm:germany-latest.osm.pbf

[osrm]
profiles=motis/osrm-profiles/car.lua
#profiles=motis/osrm-profiles/bike.lua

[nigiri]
first_day=TODAY
num_days=2
default_timezone=Europe/Berlin

The bus.lua OSRM profile is not needed without the path module. I commented out the bike.lua profile for now to speed up the process.

You could build a release build with debug information enabled and start it under GDB; then you get a stack trace.

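# On Ubuntu, the Ninja build tool is packaged as ninja-build
sudo apt install g++-12 cmake ninja-build git
git clone git@github.com:motis-project/motis.git
# Enter the checkout so that build/ is created inside it and ".." points at the sources
cd motis
mkdir build
cd build
cmake -GNinja -DNO_BUILDCACHE=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
ninja motis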

This creates a MOTIS binary with debug info which you can start with gdb --args ./motis -c config.ini.
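
To capture the stack trace without typing commands at the GDB prompt, GDB can be run in batch mode. This is a minimal sketch, assuming gdb is installed; the wrapper name `backtrace_on_crash` and the `GDB` override variable are made up for illustration.

```shell
# Hypothetical wrapper: run a command under GDB and print a backtrace when it stops
# (for example on SIGBUS, SIGSEGV or the SIGABRT raised by an uncaught exception).
backtrace_on_crash() {
  # -batch: exit once the listed commands are done
  # -ex:    execute one GDB command ("run", then "bt" for the backtrace)
  # --args: everything after the program name is passed to the program
  "${GDB:-gdb}" -batch -ex run -ex bt --args "$@"
}

# Usage, from the build directory:
#   backtrace_on_crash ./motis -c config.ini
```

The backtrace is only as readable as the debug info allows, which is why the RelWithDebInfo build matters here.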

The only other thing you can do is to disable modules step by step to find out which module is causing the problem.
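
Disabling modules step by step can be scripted by generating one config per step. This is a minimal sketch, assuming the config lists modules as `modules=` lines as in the files above; both function names are hypothetical. Since modules can depend on each other, the `modules=` lines may first need to be reordered so that dependencies come first (in this config, for example, nigiri before intermodal, which uses it as its router).

```shell
# Hypothetical helper: print CONFIG with only the first N "modules=" lines active;
# the remaining ones are commented out with '#'.
config_with_first_modules() {
  awk -v n="$2" '
    /^modules=/ { seen++; if (seen > n) { print "#" $0; next } }
    { print }
  ' "$1"
}

# Write config-1.ini, config-2.ini, ... into the current directory,
# each enabling one more module than the previous one.
write_incremental_configs() {
  total=$(grep -c '^modules=' "$1")
  i=1
  while [ "$i" -le "$total" ]; do
    config_with_first_modules "$1" "$i" > "config-$i.ini"
    i=$((i + 1))
  done
}

# Usage:
#   write_incremental_configs config.ini
#   ./motis -c config-1.ini   # then config-2.ini, ... until the crash reappears
```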

felixguendling commented 1 year ago

If you're using network drives, this can lead to problems with memory-mapped files, which MOTIS uses a lot. So the best solution is to have all files (input files and data files) on a local NVMe SSD for fast access.
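
Whether a directory sits on a network filesystem can be checked from the shell. This is a minimal sketch, assuming GNU coreutils `df`; the function names and the list of type names in `is_network_fs` are illustrative assumptions, not an exhaustive classification.

```shell
# Print the filesystem type of the mount that holds PATH (GNU df).
fs_type() {
  df --output=fstype "$1" | tail -n 1
}

# Hypothetical classification: succeed if the type name looks like a network filesystem.
is_network_fs() {
  case "$1" in
    nfs*|cifs|smb*|ceph|9p|glusterfs|fuse.sshfs) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (path taken from the config above):
#   t="$(fs_type /data/motis-de)"
#   if is_network_fs "$t"; then echo "network filesystem: $t"; else echo "local: $t"; fi
```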

felixguendling commented 1 year ago

Can you please monitor the memory usage during the import? It might be that your 29 GB of memory are not sufficient for the setup you're trying to run. Do you have swap enabled in case the import might need temporarily slightly more memory?
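
One way to monitor the memory usage is to sample the process's entry under `/proc`. This is a minimal, Linux-only sketch; the function names are hypothetical. `VmRSS` is the current resident set size and `VmHWM` its peak ("high water mark"); resident pages of memory-mapped files count towards both.

```shell
# Print the current and peak resident memory (in KiB) of process PID.
mem_kib() {
  awk '/^VmRSS:/ { rss = $2 } /^VmHWM:/ { peak = $2 }
       END { print "rss_kib=" rss, "peak_kib=" peak }' "/proc/$1/status"
}

# Sample process PID every INTERVAL seconds (default 5) until it exits.
watch_mem() {
  while [ -r "/proc/$1/status" ]; do
    mem_kib "$1"
    sleep "${2:-5}"
  done
}

# Usage:
#   ./motis -c config.ini &
#   watch_mem $! 5
```

The last line printed before the process disappears gives a rough idea of the peak memory at the time of the crash.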

1Maxnet1 commented 1 year ago

> Can you please monitor the memory usage during the import? It might be that your 29 GB of memory are not sufficient for the setup you're trying to run.

At least now, when I run it, it stops about a second after starting, without using much memory:

     address: WAITING: {"OSM"}
      nigiri: [■                                                       ]   0% | RUNNING
   osrm-bike: WAITING: {"OSM"}
    osrm-bus: WAITING: {"OSM"}
    osrm-car: WAITING: {"OSM"}
     parking: WAITING: {"OSM", "PPR", "STATIONS"}
         ppr: WAITING: {"OSM"}
       tiles: WAITING: {"OSM"}
terminate called after throwing an instance of 'std::bad_cast'
  what():  std::bad_cast
Aborted

The last time I changed the config.ini, the bus error happened. After another start, I can reliably reproduce the error shown above. I removed the bus.lua profile as suggested; I could no longer reproduce the bus error, but I get the std::bad_cast error again.

> If you're using network drives, this can lead to problems with memory mapped files that are used in MOTIS a lot. So the best solution is to have all files (input files and data files) on a local NVMe SSD for fast access.

We are not using network drives, so that should not be the issue.

> Do you have swap enabled in case the import might need temporarily slightly more memory?

Yes, another 32 GB (about 29 GiB), the same as the amount of RAM. I will try the debug binary and report back any findings. Thanks for the hints so far.

1Maxnet1 commented 1 year ago

Quick update: I tried KIWI (kill it with iron) first and upgraded the LXC container to 48 GB of RAM. However, a segmentation fault still appears, followed by the bus error. I'm going to build a debug build now to investigate the issue further.

1Maxnet1 commented 1 year ago

I managed to get it up and running after disabling all modules except nigiri. Now I wanted to re-enable one module after another to see which one fails. However, it seems that some modules depend on each other. Is it documented anywhere which module requires which other modules to be loaded, so that I do not need to guess or find out by trial and error?

PartTimeDataScientist commented 1 year ago

I once created an overview of what the different modules do. While this doesn't actually show the dependencies, it could give you a hint about what you actually need and what you don't, as it at least marks the "deprecated" modules which are (or will be) replaced by the new nigiri core.

image

felixguendling commented 1 year ago

image

Probably not complete, but I hope it helps.

1Maxnet1 commented 1 year ago

Thank you both for the overviews; they already help a lot with understanding the modules. How about adding them to the corresponding wiki page? https://github.com/motis-project/motis/wiki/Modules

felixguendling commented 1 year ago

Thank you for the proposal. Added it.

Please let me know if you hit any problems regarding the debug build. Curious what's happening there.

1Maxnet1 commented 1 year ago

Here is another update:

> The only other thing you can do is to disable modules step by step to find out which module is causing the problem.

As this seemed to be the easier approach for me, I disabled all modules and re-enabled them one by one. However, now it works. The bus error is gone and all the modules are loaded without any issues. That is nice on the one hand (issue fixed), but on the other hand it is unsatisfying not to know what the issue was in the first place.

> Can you please monitor the memory usage during the import? It might be that your 29 GB of memory are not sufficient for the setup you're trying to run. Do you have swap enabled in case the import might need temporarily slightly more memory?

I also monitored the memory, which wasn't an issue at all. I never saw it use the full 32 GB. RAM usage peaked somewhere in the mid-twenties (GB).

> Please let me know if you hit any problems regarding the debug build. Curious what's happening there.

As I have not tried again, I have nothing to report. However, as soon as I try to build MOTIS (debug or not) and run into problems, I will create an issue about it.

Thanks for the help and suggestions :)