onthegomap / planetiler

Flexible tool to build planet-scale vector tilesets from OpenStreetMap data fast
Apache License 2.0

I am encountering a memory leak issue. #946

Closed CrazyBug-11 closed 2 months ago

CrazyBug-11 commented 3 months ago

I have configured the read and write threads as follows:

threads = 8
process_threads = 6

Currently, I am reading a 250MB dataset containing 4.88 million point features. Below are my JVM settings:

-Xms8g -Xmx8g -Xmn4g -XX:MaxDirectMemorySize=4g -XX:+UseG1GC

I don't understand why reading a 200MB dataset would cause "Java heap space". Are there any parameters I can adjust to avoid memory leaks? I prefer slower processing over causing memory leaks.

My Questions:

1. Why does reading a 200MB dataset cause a memory leak?
2. Are there any parameters that can be adjusted to avoid memory leaks?

Expectation:

I prefer slower processing to avoid memory leaks.

image

msbarry commented 3 months ago

It's running out of memory when generating the first tile during the "archive write" phase but it hasn't written out any tiles yet, so that makes it sound like it gets stuck on the z0 tile.

I'm guessing your profile is now setting a very low minzoom (0?) and a min size of 0 for all the buildings. That means that to generate z0 it needs to pull all 4.8m buildings into memory. What post-processing did you end up with at low zooms? You might need to do some sort of sampling so minzoom 0 won't get set on as many buildings. If you merge overlapping or nearby polygons, then a representative sample would look almost exactly the same.

Also in general I recommend -XX:+UseParallelGC which is tuned for batch jobs like planetiler where throughput matters more than latency but in this case I don't think it would make a difference.

CrazyBug-11 commented 3 months ago

@msbarry Here is my PlanetilerConfig configuration information:

{
    "append": false,
    "arguments": {
        "stats": {}
    },
    "bounds": {
        "world": false
    },
    "compressTempStorage": true,
    "debugUrlPattern": "https://onthegomap.github.io/planetiler-demo/#{z}/{lat}/{lon}",
    "downloadChunkSizeMB": 100,
    "downloadMaxBandwidth": 0.0,
    "downloadThreads": 1,
    "featureProcessThreads": 4,
    "featureReadThreads": 1,
    "featureSourceIdMultiplier": 10,
    "featureWriteThreads": 1,
    "force": true,
    "httpRetries": 1,
    "httpRetryWait": {
        "nano": 0,
        "negative": false,
        "positive": true,
        "seconds": 5,
        "units": [
            "SECONDS",
            "NANOS"
        ],
        "zero": false
    },
    "httpTimeout": {
        "nano": 0,
        "negative": false,
        "positive": true,
        "seconds": 30,
        "units": [
            "SECONDS",
            "NANOS"
        ],
        "zero": false
    },
    "httpUserAgent": "Planetiler Downloader (https://github.com/onthegomap/planetiler)",
    "keepUnzippedSources": false,
    "logInterval": {
        "nano": 0,
        "negative": false,
        "positive": true,
        "seconds": 10,
        "units": [
            "SECONDS",
            "NANOS"
        ],
        "zero": false
    },
    "logJtsExceptions": false,
    "martinUrl": "http://localhost:9545",
    "maxPointBuffer": null,
    "maxzoom": 14,
    "maxzoomForRendering": 14,
    "minFeatureSizeAtMaxZoom": 0.0625,
    "minFeatureSizeBelowMaxZoom": 1.0,
    "minioUtils": {
        "bucketName": "linespace",
        "endpoint": "http://123.139.158.75:9325"
    },
    "minzoom": 0,
    "mmapTempStorage": true,
    "multipolygonGeometryMadvise": true,
    "multipolygonGeometryStorage": "mmap",
    "nodeMapMadvise": true,
    "nodeMapStorage": "mmap",
    "nodeMapType": "sparsearray",
    "osmLazyReads": true,
    "outputLayerStats": false,
    "outputType": "mbtiles",
    "simplifyToleranceAtMaxZoom": 0.0625,
    "simplifyToleranceBelowMaxZoom": 0.1,
    "skipFilledTiles": false,
    "sortMaxReaders": 6,
    "sortMaxWriters": 6,
    "threads": 8,
    "tileCompression": 1,
    "tileWarningSizeBytes": 1048576,
    "tileWeights": "D:\\Project\\Java\\server-code\\target\\classes\\planetiler\\tile_weights.tsv.gz",
    "tileWriteThreads": 1,
    "tmpDir": "E:\\LineSpaceData\\china-latest-points"
}

Do you mean that generating minzoom 0 is generally not recommended? Based on what you said, could feature.setMinPixelSize(0); be causing the issue?

Additionally, I copy every entry of source.tags() onto the feature with feature::setAttr. Could this have an impact?

Here is my code:

private static void startSlice(PlanetilerTransDTO param) {
    Path[] routes = param.getInputPaths().stream().map(Paths::get).toArray(Path[]::new);
    Envelope bounds = calculateBounds(routes);
    List<String> list = handlePlanetilerParam(param, bounds);
    Planetiler.create(Arguments.fromArgs(list.toArray(new String[0])))
        .setProfile(new Profile.NullProfile() {
            @Override
            public void processFeature(SourceFeature source, FeatureCollector features) {
                ParquetFeature parquetFeature = (ParquetFeature) source;
                Path path = parquetFeature.path();
                String layerName = FileUtil.mainName(path.toFile());
                if (!layerMap.containsKey(layerName)) {
                    String geometryColumn = parquetFeature.geometryColumn();
                    String geometryType = parquetFeature.geometryType().name();
                    layerMap.putIfAbsent(layerName, Arrays.asList(geometryColumn, geometryType));
                }
                FeatureCollector.Feature feature;
                if (source.canBePolygon()) {
                    feature = features.polygon(layerName).setZoomRange(param.getMinZoom(), param.getMaxZoom());
                } else if (source.canBeLine()) {
                    feature = features.line(layerName).setZoomRange(param.getMinZoom(), param.getMaxZoom());
                } else if (source.isPoint()) {
                    feature = features.point(layerName).setMaxZoom(param.getMaxZoom());
                } else {
                    throw new LinespaceServiceException("unknown feature type");
                }
                feature.setMinPixelSize(0);
                source.tags().forEach(feature::setAttr);
            }

            @Override
            public String name() {
                // Append the upper-cased output type only when it is present (the original check was inverted).
                return "LINESAPCE_" + (StringUtils.isBlank(param.getOutputType()) ? StringPool.EMPTY : param.getOutputType().toUpperCase());
            }
        })
        .addParquetSource("parquet", List.of(routes))
        .setOutput(Paths.get(param.getTempOutputPath() + "/" + (StringUtils.isBlank(param.getServiceName()) ? "out" : param.getServiceName()) + ".mbtiles"))
        .run();
}

msbarry commented 3 months ago

Generating z0 is fine, you just need to limit the features going into it. If there isn't a way to decide this based on the feature attributes, you could use something like minzoom = log4(random(4^14)) to have a roughly equal number of features per zoom, or lower it to 4^10 or less for more density. You could use feature id % 4^10 to make it deterministic, or something using Long.numberOfLeadingZeros for a faster log alternative. (cc/ @bdon for overture buildings)

Also, you can use features.anyGeometry() in place of the if polygon / else if line / ... checks.
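The minzoom suggestion above can be sketched in plain Java. This is only an illustration under stated assumptions, not code from planetiler: the class `MinZoomSampler` and its methods are hypothetical names, and the `mix` step (a SplitMix64-style bit scrambler) is an addition that is not mentioned in the thread. It is there so that sequential ids do not all fall into the same zoom bucket. The log is replaced by `Long.numberOfLeadingZeros`, as suggested.

```java
final class MinZoomSampler {
    private MinZoomSampler() {}

    /** floor(log4(x)) for x >= 1, derived from the index of the highest set bit. */
    static int floorLog4(long x) {
        if (x < 1) {
            throw new IllegalArgumentException("x must be >= 1");
        }
        int highestBit = 63 - Long.numberOfLeadingZeros(x); // floor(log2(x))
        return highestBit / 2;                              // log4 = log2 / 2
    }

    /** SplitMix64 finalizer. Assumption: added here only to scramble sequential ids. */
    static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    /**
     * Deterministic stand-in for log4(random(4^capZoom)).
     * Returns a value in [0, capZoom - 1]; bucket k receives about 3 * 4^k / 4^capZoom of all ids.
     */
    static int minZoomFor(long featureId, int capZoom) {
        if (capZoom < 1 || capZoom > 31) {
            throw new IllegalArgumentException("capZoom must be in [1, 31]");
        }
        long buckets = 1L << (2 * capZoom);      // 4^capZoom, a power of two
        long r = mix(featureId) & (buckets - 1); // same as a non-negative id % 4^capZoom
        return r == 0 ? 0 : floorLog4(r);
    }
}
```

In the profile, this could be wired in with something like feature.setMinZoom(MinZoomSampler.minZoomFor(source.id(), 14)), assuming source.id() is unique and stable for the input. That has not been checked against the parquet source used in this issue.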

CrazyBug-11 commented 3 months ago

> Generating z0 is fine, you just need to limit the features going into it. If there isn't a way to decide this based on the feature attributes, you could use something like minzoom = log4(random(4^14)) to have a roughly equal number of features per zoom, or lower it to 4^10 or less for more density. You could use feature id % 4^10 to make it deterministic, or something using Long.numberOfLeadingZeros for a faster log alternative. (cc/ @bdon for overture buildings)

By using the method you recommended, I can indeed solve the issue with my current file. However, random(4^14) may need to be adjusted for each dataset so that low-zoom tiles do not end up with too few features. Additionally, memory leaks may also occur when cutting high-zoom tiles.

I'm curious. I downloaded my data from . When you cut data, do you encounter this issue? How do you resolve it?
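The concern about tuning random(4^14) per dataset can be made concrete with a rough estimate. With minzoom = log4(r) and r uniform in [0, 4^cap), a feature is visible at zoom z when r < 4^(z+1), so the expected share of features at z0 is 4^(1-cap). The sketch below uses hypothetical names (`SamplingEstimate`, `expectedVisible`, `capZoomFor`) and only counts features over the whole dataset. It does not model spatial clustering, so a regional extract will have denser tiles than a per-tile average suggests.

```java
final class SamplingEstimate {
    private SamplingEstimate() {}

    /** Expected number of features with minzoom <= zoom when r is uniform in [0, 4^capZoom). */
    static double expectedVisible(long total, int capZoom, int zoom) {
        if (zoom >= capZoom - 1) {
            return total; // every feature has minzoom <= capZoom - 1
        }
        // minzoom <= zoom  <=>  r < 4^(zoom + 1)
        return total * Math.pow(4, zoom + 1 - capZoom);
    }

    /** Smallest capZoom (>= 1) whose expected z0 feature count does not exceed targetAtZ0. */
    static int capZoomFor(long total, double targetAtZ0) {
        if (targetAtZ0 <= 0) {
            throw new IllegalArgumentException("targetAtZ0 must be positive");
        }
        int cap = 1;
        while (expectedVisible(total, cap, 0) > targetAtZ0) {
            cap++;
        }
        return cap;
    }
}
```

For the 4.88 million features in this issue, expectedVisible(4_880_000, 14, 0) is well below one feature, which matches the worry that low zooms become too sparse with 4^14. expectedVisible(4_880_000, 10, 0) is roughly 19, and capZoomFor(4_880_000, 1000) returns 8. The target of 1000 is an arbitrary illustration, not a recommended value.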