peermaps / ingest

Convert osm pbf files into a format that can be used in peermaps.

use more cores #13

Closed ghost closed 3 years ago

ghost commented 3 years ago

The memory footprint is stable and low, but during the first phase the ingest is only using about 2 of the 10 cores on the VPS. It could be that this is as fast as a single LevelDB instance will go. I don't think the machine is reaching I/O saturation, since after 23 hours the LevelDB directory is only 111GB.
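As a rough check on the I/O claim, the figures above can be turned into an average growth rate. This is only a back-of-the-envelope sketch: it assumes decimal gigabytes and measures how fast the directory grew, not the actual disk traffic, which will be higher because LevelDB rewrites data during compaction.

```python
# Figures taken from the comment above; GB is assumed to mean 10**9 bytes.
db_size_bytes = 111 * 10**9
elapsed_seconds = 23 * 3600

# Average rate at which the database directory grew on disk.
growth_mb_per_s = db_size_bytes / elapsed_seconds / 10**6

print(f"average growth: {growth_mb_per_s:.2f} MB/s")
```

A net growth rate of this size is consistent with the guess that the disk is not the bottleneck, although it does not rule out heavy compaction traffic.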

Some options to explore for the first phase:

The second phase could use some of the same tricks, and there are many places in eyros where async operations happen serially instead of in parallel.
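To illustrate the difference between awaiting operations one at a time and starting them together, here is a minimal sketch using Python's `asyncio`. It does not use the eyros API; `fetch_node` is a hypothetical stand-in for a single async storage read, and the sleep only simulates I/O latency.

```python
import asyncio


async def fetch_node(node_id: int) -> int:
    # Hypothetical stand-in for one async storage read.
    await asyncio.sleep(0.01)
    return node_id * 2


async def read_serial(ids):
    # Each read must finish before the next one starts.
    return [await fetch_node(i) for i in ids]


async def read_concurrent(ids):
    # All reads are started up front and awaited together;
    # gather returns results in the same order as its arguments.
    return list(await asyncio.gather(*(fetch_node(i) for i in ids)))


ids = list(range(20))
serial = asyncio.run(read_serial(ids))
concurrent = asyncio.run(read_concurrent(ids))
```

Both functions return the same results. The serial version pays the latency of every read in sequence, while the concurrent version overlaps the waits.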

The osm2pgsql page states that it can process planet-osm in about half a day, so we have some room for improvement, although the peermaps ingest already uses far less memory (osm2pgsql requires a minimum of 64GB of RAM).

ghost commented 3 years ago

The second option for the first phase, partitioning the keyspace, would have the added benefit of letting us farm out the work across a cluster of volunteer systems, which is a long-term goal.
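One way the keyspace partitioning could look is sketched below. It assumes the keys are integer ids in a known inclusive range, which is an assumption for illustration and not something stated in this thread; `partition_keyspace` and `owner` are hypothetical helper names.

```python
def partition_keyspace(min_id, max_id, workers):
    """Split the inclusive range [min_id, max_id] into contiguous, near-equal ranges."""
    total = max_id - min_id + 1
    base, extra = divmod(total, workers)
    ranges = []
    start = min_id
    for w in range(workers):
        # The first `extra` workers take one additional key each.
        size = base + (1 if w < extra else 0)
        if size == 0:
            continue  # more workers than keys: leave this worker idle
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def owner(ranges, key):
    """Return the index of the range (and so the worker) responsible for `key`."""
    for idx, (lo, hi) in enumerate(ranges):
        if lo <= key <= hi:
            return idx
    raise KeyError(key)


parts = partition_keyspace(0, 99, 4)
print(parts)            # [(0, 24), (25, 49), (50, 74), (75, 99)]
print(owner(parts, 50))  # 2
```

Because each range is independent, every worker could write to its own database, whether the workers are cores on one machine or separate volunteer systems.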

ghost commented 3 years ago

The new scanning approach is much better at using multiple cores when it can, and the OSM decoding now happens in parallel at pre-calculated offsets.
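The general idea of decoding at pre-calculated offsets can be sketched as follows. This is not the ingest code and not the real PBF layout: it assumes a simplified toy framing (a 4-byte big-endian length followed by a zlib-compressed body), and all function names are hypothetical. A cheap sequential pass records where each block starts, then the blocks are decoded independently by a pool of workers.

```python
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor


def write_blocks(payloads):
    # Build a toy file: each block is a 4-byte big-endian length plus zlib data.
    out = bytearray()
    for p in payloads:
        body = zlib.compress(p)
        out += struct.pack(">I", len(body)) + body
    return bytes(out)


def scan_offsets(buf):
    # Sequential pass that reads only the length prefixes and skips the bodies.
    offsets = []
    pos = 0
    while pos < len(buf):
        (size,) = struct.unpack_from(">I", buf, pos)
        offsets.append((pos + 4, size))
        pos += 4 + size
    return offsets


def decode_at(buf, offset, size):
    # Decode one block without touching any other block.
    return zlib.decompress(buf[offset:offset + size])


def decode_parallel(buf, workers=4):
    offsets = scan_offsets(buf)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map preserves the order of the offsets list.
        return list(pool.map(lambda o: decode_at(buf, *o), offsets))


payloads = [bytes([i]) * 1000 for i in range(8)]
data = write_blocks(payloads)
decoded = decode_parallel(data)
```

The scan is cheap because it never decompresses anything, so the expensive decoding step is the part that gets spread across workers.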