t-rex-tileserver / t-rex

t-rex is a vector tile server specialized on publishing MVT tiles from your own data
https://t-rex.tileserver.ch/
MIT License

Core dumped while generating cache on Trex v0.13.0 or v0.14.0 #255

Closed. dmitrykinakh closed this issue 2 years ago

dmitrykinakh commented 2 years ago

After upgrading Trex to v0.13.0 or v0.14.0, tile generation started ending with "core dumped". Any idea what might be wrong, or where to check the logs? Note: on Trex v0.11.0 everything works.

admin@server-st1-trex-master:/opt$ docker-compose run --name TL_99 --rm trex-master sh -c 't_rex generate --tileset property_records_county --overwrite true --minzoom 14 --maxzoom 18 --extent -80.3053,26.0571,-79.9860,26.2343 --config /storage/trex/config.toml;'
2021-08-31 12:16:15.496 INFO Reading configuration from '/storage/trex/config.toml'
2021-08-31 12:16:15.833 INFO Tile cache directory: /storage/tiles/tiles_cache
Generating tileset 'property_records_county'...
Level 18: 33552 / 33552 [=================================================================================]  
free(): invalid pointer
Aborted (core dumped)

pka commented 2 years ago

Could you add --loglevel debug?

BeAsTB112 commented 2 years ago

Sure:

/opt$ docker-compose run --name TL_99 --rm trex-master sh -c 't_rex generate --tileset property_records_county --overwrite true --minzoom 14 --maxzoom 18 --extent -80.3053,26.0571,-79.9860,26.2343 --loglevel debug --config /storage/trex/lots.toml;'
2021-09-02 00:11:02.548 INFO Reading configuration from '/storage/trex/lots.toml'

[snap]

2021-09-02 00:11:02.705 DEBUG Query for layer 'lots_city_labels': SELECT geom,"id","place","lot","block" FROM (SELECT ST_PointOnSurface(geom) AS geom,"id","place","lot","block" FROM lots_city) AS _q WHERE geom && ST_MakeEnvelope($1-0.1875*$5::FLOAT8,$2-0.1875*$5::FLOAT8,$3+0.1875*$5::FLOAT8,$4+0.1875*$5::FLOAT8,3857)
2021-09-02 00:11:02.706 INFO Tile cache directory: /storage/tiles/tiles_cache
2021-09-02 00:11:02.706 DEBUG detect_data_columns for layer lots_city with sql None
2021-09-02 00:11:02.706 DEBUG Filecache.write /storage/tiles/tiles_cache/lots_city.json
2021-09-02 00:11:03.054 DEBUG Filecache.write /storage/tiles/tiles_cache/lots_city.style.json
2021-09-02 00:11:03.422 DEBUG detect_data_columns for layer lots_city with sql None
2021-09-02 00:11:03.423 DEBUG detect_data_columns for layer lots_city with sql None
2021-09-02 00:11:03.424 DEBUG Filecache.write /storage/tiles/tiles_cache/lots_city/metadata.json
2021-09-02 00:11:04.017 DEBUG detect_data_columns for layer lots_city_labels with sql Some("SELECT ST_PointOnSurface(geom) AS geom,\"id\",\"place\",\"lot\",\"block\" FROM lots_city")
2021-09-02 00:11:04.018 DEBUG Filecache.write /storage/tiles/tiles_cache/lots_city_labels.json
2021-09-02 00:11:04.552 DEBUG Filecache.write /storage/tiles/tiles_cache/lots_city_labels.style.json
2021-09-02 00:11:05.076 DEBUG detect_data_columns for layer lots_city_labels with sql Some("SELECT ST_PointOnSurface(geom) AS geom,\"id\",\"place\",\"lot\",\"block\" FROM lots_city")
2021-09-02 00:11:05.077 DEBUG detect_data_columns for layer lots_city_labels with sql Some("SELECT ST_PointOnSurface(geom) AS geom,\"id\",\"place\",\"lot\",\"block\" FROM lots_city")
2021-09-02 00:11:05.078 DEBUG Filecache.write /storage/tiles/tiles_cache/lots_city_labels/metadata.json

free(): invalid pointer
Aborted (core dumped)

pka commented 2 years ago

It looks like the core dump happens in the initialization phase, before tile seeding begins. Invalid free() calls usually originate in C/C++ code, which could be GDAL in this case. Edit: Are there any GDAL (file) layers?

dmitrykinakh commented 2 years ago

> Are there any GDAL (file) layers?

@pka - are you interested in the *.toml file content? Not sure what I need to share with you.

pka commented 2 years ago

The toml file would be interesting in this case. If there aren't too many layers, the next step would be to find the crashing layer by commenting out layers in the config file.
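Before commenting out individual layers, it can help to see which tilesets abort at all. The following is a minimal shell sketch of that idea, not part of t-rex: `find_failing_tilesets` and the `GENERATE_CMD` override are hypothetical names introduced here, and only the `t_rex generate` flags are taken from the commands in this thread. It works at tileset granularity, so layers inside a failing tileset still have to be commented out by hand.

```shell
# Run the generator once per tileset and print the names of those that fail.
# GENERATE_CMD is a hypothetical override so the loop can be tried with a stub;
# by default it runs `t_rex generate`.
find_failing_tilesets() {
  config="$1"
  shift
  for ts in "$@"; do
    # A crash such as "Aborted (core dumped)" yields a non-zero exit status.
    if ! ${GENERATE_CMD:-t_rex generate} --tileset "$ts" --config "$config" \
         --minzoom 14 --maxzoom 14 --overwrite true >/dev/null 2>&1; then
      echo "$ts"
    fi
  done
}
```

Usage, inside the container and with the path from the original report: `find_failing_tilesets /storage/trex/config.toml property_records_county`. Adding `--extent` as in the original command would keep each run short.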

dmitrykinakh commented 2 years ago

I'll set up a temporary file with the single layer we use for most of our needs and let you know in 10 minutes if there are any issues.

dmitrykinakh commented 2 years ago

Here is a config file with only two layers, and a generation log that contains the core dump. I'm also attaching the schema of the DB table we get the data from, in case that is helpful.

Attachments: trex_logs.txt, t-rex configuration.txt, image

pka commented 2 years ago

In the latest log the core dump happens after finishing tile generation. Did you snip that away in the first log, or did the core dump happen earlier there?

dmitrykinakh commented 2 years ago

In the previous log, @BeAsTB112 made a mistake and tried to generate a layer that was not present in the selected toml file.

pka commented 2 years ago

I tried to reproduce this with the imported dataset from #254 (any news on this?), but didn't get the core dump.

Tried with a local t-rex installation and also Docker:

docker run -v $PWD:/var/data/in:ro sourcepole/t-rex:0.14.0 generate --tileset lots_city --overwrite true --minzoom 14 --maxzoom 14 --extent -80.469513,25.375781,-80.064049,25.986550 --loglevel debug --config issue255.toml

Remark: the sql = line in your config is ignored when [[tileset.layer.query]] is commented out.
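To illustrate the remark, here is a hedged TOML fragment. It is not the reporter's attached config: the layer values and the SQL text are placeholders chosen for illustration, with only the layer name `lots_city` taken from the debug log above. Because a TOML key belongs to the most recent table header, commenting out the `[[tileset.layer.query]]` header leaves `sql` attached to `[[tileset.layer]]`, where it has no effect.

```toml
# Illustrative placeholders only, not the configuration attached to this issue.
[[tileset]]
name = "lots_city"

[[tileset.layer]]
name = "lots_city"
table_name = "lots_city"
geometry_field = "geom"
srid = 3857
# With the header below commented out, `sql` becomes a key of
# [[tileset.layer]] rather than of a query block, so it is ignored.
#[[tileset.layer.query]]
sql = """SELECT geom, id FROM lots_city"""
```

This matches the debug output `detect_data_columns for layer lots_city with sql None`: no query was registered for that layer.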

Are you sure that your container is 0.14.0?

pka commented 2 years ago

It seems that you've changed the entrypoint in your docker-compose.yml. What is your entrypoint and are there other special configurations of the service?

dmitrykinakh commented 2 years ago

admin@gridics-st1-trex-master:/opt$ cat docker-compose.yml
version: "2"

services:
  trex-master:
    image: gridics/trex-master:0.14.0
    restart: "no"
    volumes:
      - "/storage:/storage"
    hostname: st1-trex-master
    container_name: trex-master

That is the docker-compose file. This is the Dockerfile we use to build the image:

FROM ubuntu:20.04

ARG gitref
ARG uid

ENV RUSTUP_HOME=/usr/local/rustup \
    CARGO_HOME=/usr/local/cargo \
    PATH=/usr/local/cargo/bin:$PATH

RUN \
apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y -my --allow-unauthenticated  \
 ca-certificates \
 apt-utils \
 rsyslog \
 telnet \
 htop \
 bash-completion \
 libjpeg-dev \
 zlib1g-dev \
 python \
 python-setuptools \
 python-dev \
 python3-pip \
 python-simplejson \
 libboost-python-dev \
 software-properties-common  \
 build-essential \
 libssl-dev \
 libpcre3 \
 libpcre3-dev \
 make \
 vim

RUN pip install --upgrade pip
RUN \
     add-apt-repository -y ppa:ubuntugis/ppa && \
     apt-get update && \
     apt-get install -y libgdal26 curl wget git && \
     curl -O -L https://github.com/t-rex-tileserver/t-rex/releases/download/v0.14.0/t-rex_0.14.0_amd64.deb && \
     dpkg -i t-rex_0.14.0_amd64.deb

RUN \
    mkdir -p  /storage/trex

RUN pip install tilestache
RUN mkdir -p /storage/tiles/tiles_cache

#bash autocompletion
RUN echo "if [ -f /etc/bash_completion ]; then . /etc/bash_completion; fi" >> /etc/bash.bashrc

FYI: I've updated https://github.com/t-rex-tileserver/t-rex/issues/254

pka commented 2 years ago

I was able to reproduce the core dump with your Dockerfile. It seems that using libgdal26 from ppa:ubuntugis/ppa causes the problem. After removing the lines

     add-apt-repository -y ppa:ubuntugis/ppa && \
     apt-get update && \

the core dump doesn't occur anymore.

This PPA is only recommended for older Ubuntu versions.
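For reference, the affected build step would then look roughly like the sketch below. It is the same `RUN` instruction as in the Dockerfile above with the two PPA lines removed and nothing else changed; it assumes that libgdal26 is available from the stock Ubuntu 20.04 repositories and that the package lists from the earlier `apt-get update` step are still present in the image.

```dockerfile
# Install GDAL from the default Ubuntu 20.04 repositories (no ubuntugis PPA),
# then download and install the t-rex 0.14.0 package.
RUN \
     apt-get install -y libgdal26 curl wget git && \
     curl -O -L https://github.com/t-rex-tileserver/t-rex/releases/download/v0.14.0/t-rex_0.14.0_amd64.deb && \
     dpkg -i t-rex_0.14.0_amd64.deb
```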

dmitrykinakh commented 2 years ago

Thanks a lot! After rebuilding the Docker image without these two lines, the core dump no longer occurs. We can close this issue.