osm-search / Nominatim

Open Source search based on OpenStreetMap data
https://nominatim.org
GNU General Public License v3.0

Space requirement for Netherlands country extract import? #140

Closed by tommedema 10 years ago

tommedema commented 10 years ago

I'm considering what kind of dedicated or virtual server to purchase. For my purpose I only need to import the 900MB Netherlands PBF file from geoextract.

However, since I want an SSD, I can't get much disk space. Does anyone know how much space I need to store the PBF file and import it into PostgreSQL using Nominatim?

Thanks

mtmail commented 10 years ago

You can try a 2-core/4GB RAM/60GB SSD machine on https://www.digitalocean.com/pricing/. It costs $1.44/day, so even if the import takes 3 days you've spent less than $10 on testing. Of course, you can also take the 2GB machine and run a smaller country first for testing, at 3 cents/hour.
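The cost arithmetic above can be sketched in a few lines of Python. This is only an illustration: the $1.44/day rate is the figure quoted in this comment, the function name is made up for the example, and the import durations are guesses rather than measurements.

```python
# Rough cost of a throwaway test import on a pay-per-day machine.
# DAILY_RATE_USD is the figure quoted in the comment above; current
# pricing will differ.
DAILY_RATE_USD = 1.44


def import_cost_usd(days, daily_rate=DAILY_RATE_USD):
    """Return the rental cost for keeping the machine up for `days` days."""
    return days * daily_rate


# Hypothetical import durations, in days.
for days in (0.5, 1, 3):
    print(f"{days} day(s): ${import_cost_usd(days):.2f}")
```

Even the 3-day case stays well under the $10 mentioned above.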

One thing to note is that these machines don't have swap space by default. You can add that easily. https://www.digitalocean.com/community/articles/how-to-add-swap-on-ubuntu-12-04

tommedema commented 10 years ago

So your guess is that 60GB is sufficient? How did you come up with this number? :) Thanks again.

mtmail commented 10 years ago

I have some old notes saying that Germany fit into 63GB, and the Netherlands input data (*.osm.pbf file) is half the size of the Germany input data. If you run out of space during the index stage of the import, it's easy to upgrade the DigitalOcean machine (takes 2 minutes) and run that stage again.

lonvia commented 10 years ago

Netherlands is 1/25th of the planet, so 40GB is a good guess. The extract is small enough that you can easily run a test install on your local laptop and find out; it shouldn't take more than half a day to import.
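As a sanity check, the two estimates in this thread can be compared against the 60GB SSD suggested earlier. The sketch below assumes that database size scales linearly with the size of the input file, which is the same assumption behind the Germany-based estimate, and that the 900MB PBF file is kept on the same disk. All names are illustrative, and the numbers come from the comments in this thread, not from a measured import.

```python
# Compare the thread's two disk estimates with the suggested 60 GB SSD.
SSD_GB = 60.0   # disk size of the suggested machine
PBF_GB = 0.9    # Netherlands extract, as stated in the question


def scale_estimate(reference_gb, input_ratio):
    """Scale a known database size by the ratio of input file sizes.

    Assumes disk usage grows linearly with input size (a rough guess).
    """
    return reference_gb * input_ratio


def headroom_gb(database_gb, disk_gb=SSD_GB, pbf_gb=PBF_GB):
    """Free space left after storing the database and the PBF file."""
    return disk_gb - database_gb - pbf_gb


estimates = {
    # Germany fit into 63 GB; Netherlands input is half the size.
    "scaled from Germany notes": scale_estimate(63.0, 0.5),
    # The 1/25th-of-the-planet guess, taken as stated.
    "planet-share guess": 40.0,
}

for label, gb in estimates.items():
    print(f"{label}: {gb:.1f} GB database, "
          f"{headroom_gb(gb):.1f} GB left on a {SSD_GB:.0f} GB disk")
```

Both estimates leave roughly 19 to 28 GB free on a 60GB disk, though temporary files during the import may eat into that margin.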

The usual advice for Nominatim servers is to use bare metal. IO performance of virtual servers is just not sufficient, although this might have changed for those SSD-powered servers. Also, IO is not that relevant for such a tiny extract.