marklit opened this issue 1 year ago
I don't immediately know what is going on here, but I have been making other improvements to Tippecanoe in https://github.com/felt/tippecanoe, including some that are meant to reduce memory consumption, so I would suggest trying with that version.
If you can share your data file, I can try to reproduce the problem myself.
I'm going to run that fork you mentioned and I'll report back with my results.
The dataset itself is the last 14 releases of the FCC's 'without satellite' 477 data.
https://www.fcc.gov/general/broadband-deployment-data-fcc-form-477
The 14 CSV files were converted into JSONL. At one point this data lived in my client's BigQuery instance and `ST_GEOGFROMWKB` was used to convert the WKB into a text-based version. I had to run that column through `shapely` and `geojson` before it was ready for tippecanoe.
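The end result of that conversion is one GeoJSON Feature per JSONL line. Here is a minimal stdlib-only sketch of that shape; the field names and coordinates are hypothetical, and in the real pipeline the `geometry` member comes from the WKT column via shapely (`shapely.wkt.loads` followed by `shapely.geometry.mapping`):

```python
import json

# Hypothetical record and geometry; in practice the geometry is produced
# from the WKT column by shapely rather than written by hand.
record = {"BlockCode": "060372087001000", "Provider_Id": 9999}
geometry = {"type": "Point", "coordinates": [-118.24, 34.05]}

# tippecanoe's JSONL input wants one GeoJSON Feature per line.
feature = {"type": "Feature", "geometry": geometry, "properties": record}
print(json.dumps(feature))
```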
I don't have a good way to share the ~20 GB compressed version of this dataset. It might be quicker to download the latest release, convert it to JSONL and then duplicate it 14 times.
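The duplication suggestion is a short script; a sketch, assuming one converted release in a file called `latest.jsonl` (both filenames here are hypothetical):

```python
# Approximate the 14-release dataset by repeating one converted release 14 times.
# Write a one-line stand-in for the real download so the sketch is self-contained.
with open("latest.jsonl", "w") as f:
    f.write('{"type": "Feature"}\n')

with open("latest.jsonl") as src:
    data = src.read()
with open("x14.jsonl", "w") as dst:
    for _ in range(14):
        dst.write(data)
```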
Just to report back, I tried that fork and the job was killed after some time. The VM I ran it on has 64 GB.
I partitioned the 60M records on their H3 resolution 1 values, which broke the records up into 44 files. The files weren't evenly sized, but this was the quickest way I could think of to break up the dataset.
GeoJSON Size | Filename |
---|---|
16G | 812abffffffffff |
14G | 81267ffffffffff |
11G | 8126fffffffffff |
7.8G | 8144fffffffffff |
7.0G | 81263ffffffffff |
6.2G | 81277ffffffffff |
5.1G | 812a3ffffffffff |
4.7G | 81447ffffffffff |
4.2G | 8129bffffffffff |
3.6G | 8126bffffffffff |
3.6G | 8148bffffffffff |
3.2G | 81283ffffffffff |
2.8G | 8128bffffffffff |
2.8G | 8148fffffffffff |
2.2G | 812bbffffffffff |
1.9G | 8128fffffffffff |
1.5G | 8127bffffffffff |
1.4G | 812afffffffffff |
1.2G | 8144bffffffffff |
1.1G | 81443ffffffffff |
1.1G | 814cfffffffffff |
521M | 812b3ffffffffff |
409M | 81273ffffffffff |
392M | 8112fffffffffff |
90M | 81293ffffffffff |
85M | 81467ffffffffff |
81M | 810c7ffffffffff |
61M | 815d3ffffffffff |
41M | 810d7ffffffffff |
34M | 81487ffffffffff |
18M | 814f7ffffffffff |
14M | 810c3ffffffffff |
13M | 8112bffffffffff |
12M | 8113bffffffffff |
11M | 810d3ffffffffff |
11M | 810cfffffffffff |
4.3M | 819a3ffffffffff |
1.9M | 810cbffffffffff |
1.2M | 8122fffffffffff |
1.1M | 811d3ffffffffff |
1.1M | 810dbffffffffff |
448K | 819bbffffffffff |
286K | 814e7ffffffffff |
71K | 81227ffffffffff |
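The partitioning step can be sketched like this (stdlib only; the `h3_r1` property is a hypothetical precomputed field — in practice the index would come from the `h3` library, e.g. `h3.latlng_to_cell(lat, lng, 1)`):

```python
import json
from collections import defaultdict

def partition_by_h3(lines):
    """Bucket JSONL features by a precomputed H3 resolution-1 index."""
    buckets = defaultdict(list)
    for line in lines:
        feature = json.loads(line)
        buckets[feature["properties"]["h3_r1"]].append(line)
    return buckets

# Each bucket would then be written out as <index>.geojson,
# e.g. 812abffffffffff.geojson.
lines = [
    '{"type": "Feature", "properties": {"h3_r1": "812abffffffffff"}}',
    '{"type": "Feature", "properties": {"h3_r1": "81267ffffffffff"}}',
    '{"type": "Feature", "properties": {"h3_r1": "812abffffffffff"}}',
]
buckets = partition_by_h3(lines)
```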
I then ran tippecanoe on them one file at a time, as there were RAM usage spikes and I didn't want to suffer any OOM issues. Usually ~13 GB of RAM was in use on my 64 GB system, though this would spike at odd times. The process took 3 weeks to complete.
```bash
$ ls 8*.geojson \
    | xargs -P1 \
            -n1 \
            -I% \
            bash -c 'HEXVAL=`echo % | sed "s/.geojson//g"`; tippecanoe --coalesce-densest-as-needed -zg --extend-zooms-if-still-dropping -e fcc_477_$HEXVAL $HEXVAL.geojson'
```
The process produced 4.6 GB of PBF data across 168,754 files.
I took a 100K record sample (~179 MB in GeoJSON) and ran it through strace and produced a FlameGraph. On an e2-highmem-4 with 4 vCPUs and 32 GB of RAM in GCP's LA zone the following runs in 115 seconds and produces 31.7K PBFs totalling 147 MB in size. This is one PBF for roughly every 3 records.
```bash
$ tippecanoe \
    --coalesce-densest-as-needed \
    -zg \
    --extend-zooms-if-still-dropping \
    -e fcc_477 \
    out_100k.geojson
```
There are 30K `write` calls, 37K `read` calls and ~2K `openat` calls. Around 96% of the time is spent waiting on `futex`.
```
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.61   34.837147      757329        46         3 futex
  1.66    0.600337          20     29388           write
  1.16    0.417305          11     37360           read
  0.32    0.116778          59      1963           close
  0.10    0.035662          18      1965         1 openat
  0.03    0.011153          12       927           fcntl
  0.02    0.008630          18       467           unlink
  0.02    0.006390         114        56           munmap
  0.02    0.006108          11       517           fstat
  0.02    0.005534          11       467           getpid
  0.02    0.005515          50       110           madvise
  0.01    0.004425          41       107           clone
  0.00    0.001751          17        98           mmap
  0.00    0.001383          17        78           brk
  0.00    0.000258          15        17           mprotect
  0.00    0.000223         223         1           execve
  0.00    0.000143          17         8           ftruncate
  0.00    0.000086          86         1           mkdir
  0.00    0.000069           8         8           pread64
  0.00    0.000035          17         2         2 stat
  0.00    0.000035          17         2           getdents64
  0.00    0.000032          10         3           rt_sigaction
  0.00    0.000023          11         2         1 arch_prctl
  0.00    0.000018           8         2           prlimit64
  0.00    0.000017          17         1           sysinfo
  0.00    0.000015          15         1           fstatfs
  0.00    0.000015          14         1         1 access
  0.00    0.000014          14         1           lseek
  0.00    0.000010           9         1           set_tid_address
  0.00    0.000009           9         1           set_robust_list
  0.00    0.000007           7         1           rt_sigprocmask
------ ----------- ----------- --------- --------- ----------------
100.00   36.059126                 73602         8 total
```
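As a sanity check on the summary above, the `futex` share of total syscall time matches the ~96% figure:

```python
# Figures taken directly from the strace -c summary above.
futex_seconds = 34.837147
total_seconds = 36.059126

futex_share = futex_seconds / total_seconds * 100
print(round(futex_share, 2))  # matches strace's 96.61 in the "% time" column
```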
This operation also has around 9.6K context switches and 43K page faults.
```
     9,664      context-switches          #   81.434 /sec
       105      cpu-migrations            #    0.885 /sec
    43,284      page-faults               #  364.735 /sec
```
Below is a FlameGraph:
The main overhead appears to be the sheer number of files that need to be written out. With fewer PBFs this process should run a lot quicker. It also leads to the small-file problem, where a lot of filesystem overhead comes simply from having too many files. Is there a way to cut down the number of PBFs being produced?
If I output to a single `.mbtiles` file it takes substantially longer, so I'm not sure that alone would be an answer for a 60M-record dataset that already takes 3 weeks to convert to PBFs.
I don't have much more to report in terms of RAM consumption, but if it can be kept down I should be able to run more tippecanoe commands in parallel with one another. The peak-RAM-to-process ratio is very high at the moment, and RAM is the most expensive hardware per GB on GCP.
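To make the RAM point concrete, here is a back-of-envelope count of how many runs the box could host, using the ~13 GB steady-state figure observed above (the spikes mean a real deployment would need a safety margin):

```python
# 64 GB box, ~13 GB steady-state per tippecanoe run (figures from above).
total_gb = 64
per_job_gb = 13

parallel_jobs = total_gb // per_job_gb
print(parallel_jobs)  # 4 concurrent runs at steady state, before allowing for spikes
```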
I'm running the following on a system with 64 cores and 64 GB of RAM. After a few hours of running, the application appears to exhaust all available memory and is terminated by the kernel.
Is there any workaround for this?
Here is an example record from the 102 GB JSONL file: