hyperknot / openfreemap

Free and open-source map hosting solution with custom styles for websites and apps, using OpenStreetMap data
https://openfreemap.org/

Questions about self-hosting requirements #32

Open joao opened 1 month ago

joao commented 1 month ago

Hi,

First of all, great project @hyperknot! :)

I'm considering self-hosting an instance on a storage VPS, but after going through the docs and benchmarks, I have a few questions about the requirements, if you don't mind:

Thank you!

hyperknot commented 1 month ago

There is nothing special about 4 GB in particular, I just wasn't sure what a 1 or 2 GB server can handle these days. If you can make it run on a 1 GB or a 2 GB machine, please let me know. Maybe I'll just change it to 1 GB.

An HDD should be perfectly fine, it's just that I tried it on a NAS HDD and it took about a day simply to uncompress this file. But I think there might have been something else going on with that VPS, as it's crazy that it took that long.

So in summary, I think this should work fine even on a Raspberry Pi with a USB HDD, but I haven't tried it.

hyperknot commented 1 month ago

I updated the docs

joao commented 1 month ago

Thank you! Managed to install it on a VPS with a big enough SSD, now I just need issue #28 to be resolved.
Is there any way to fix it manually in the meantime?

Also, something else happened to me, and I can make a pull request for it if you agree.
With the default code and env settings, I was having issues connecting via SSH. So I added USER and PORT to the .env file, and these two extra lines in init-server.py:

def get_connection(hostname, user, port):
    ssh_passwd = dotenv_val('SSH_PASSWD')
    user = dotenv_val('USER')  # added: read the SSH user from .env
    port = dotenv_val('PORT')  # added: read the SSH port from .env

I had root and port 22 as the defaults on my server, but somehow they weren't being picked up on macOS.

joao commented 1 month ago

Managed to change it manually with success.
Perhaps a solution would be to have something like this, in the ssh_lib/tasks.py file:

def update_tiles_host(c):
    domain_le = dotenv_val('DOMAIN_LE')

    print(f"Updating tile URLs to {domain_le}")

    styles_dir = '/data/ofm/http_host/assets/styles/ofm/'

    # List all JSON files in the directory
    result = c.run(f'ls {styles_dir}*.json', hide=True)
    json_files = result.stdout.strip().split('\n')

    for json_file in json_files:
        # Read the file content
        cat_result = c.run(f'cat {json_file}', hide=True)
        content = cat_result.stdout

        # Replace the tile URL
        updated_content = content.replace('tiles.openfreemap.org', domain_le)

        # Write the updated content back to the file
        put_str(c, json_file, updated_content)

    print("Tile URLs updated successfully.")

hyperknot commented 1 month ago

I'm fixing it properly now. Can you post the hardware specs of the server, just out of interest? How much RAM was needed?

joao commented 1 month ago

I'm running it successfully on a €4.5 Contabo Storage VPS.
It has 3GB of RAM, but it's only using 175MB so far. Didn't pay close attention to how much it requires during setup.

I would like to run the benchmark you have in the repository, but it isn't uploaded during setup and I haven't been able to run it. I first tried running create_path_list.py, but without success. Can you give me any pointers?

I will do the setup again after the tiles endpoint fix, to pay close attention to the RAM needs at that stage.

hyperknot commented 1 month ago

Thanks for pointing out the Contabo VPS, it's an amazing deal, I've added a note about it to the self-hosting readme.

Yes, the path list is made from a real-world tile server, and back then I couldn't find a way to anonymize the server logs, so that's missing for you. Now there is so much random traffic that I think I could share a sample of the real-world server log.

joao commented 1 month ago

Managed to adapt the benchmark script so that it runs, and also wrote a script to generate 500k random (not sequential) URL strings for .pbf files.
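For anyone wanting to reproduce this, a generator for random tile paths could look roughly like the sketch below. This is not the script mentioned above. It assumes paths of the form z/x/y.pbf and a zoom range of 0 to 14, and the function name random_tile_paths is made up for illustration; the exact list format that wrk_custom_list.lua expects is not shown in this thread.

```python
import random


def random_tile_paths(count, min_zoom=0, max_zoom=14, seed=None):
    """Return `count` random tile paths of the form 'z/x/y.pbf'.

    Assumption: tiles are addressed as z/x/y.pbf and zooms 0..14 exist.
    """
    rng = random.Random(seed)
    paths = []
    for _ in range(count):
        z = rng.randint(min_zoom, max_zoom)
        # At zoom z the tile grid is 2**z tiles wide and 2**z tiles high.
        limit = 2 ** z
        x = rng.randrange(limit)
        y = rng.randrange(limit)
        paths.append(f"{z}/{x}/{y}.pbf")
    return paths
```

Calling random_tile_paths(500_000) and writing one path per line would give a list of the size described above. Passing a seed makes the list reproducible between runs.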

From my machine in Portugal, on a 1Gbps/200Mbps connection, running:
wrk -c100 -t25 -d60s -s modules/http_host/benchmark/wrk_custom_list.lua __host__

With an nginx cache reset before, it gets:

25 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    47.52ms    5.37ms 297.38ms   94.26%
    Req/Sec    84.15      9.85   121.00     83.11%
  132052 requests in 1.00m, 115.51MB read
Requests/sec:   2197.29
Transfer/sec:      1.92MB

Not sure if this is 'scientifically accurate', but performance seems good. I also kept an eye on the VPS with htop: it has 2 CPU threads, which averaged about 20% load between them, with RAM usage at ~250MB.

Haven't been able to run it on the host, not sure why. Maybe localhost or domain are 'not open'.

hyperknot commented 4 weeks ago

It's important to run this on localhost, not over the full network, otherwise you are just testing your internet speed. I've reworked the whole benchmarking docs. Can you check the updated docs and post why it doesn't work?

joao commented 4 weeks ago

Managed to run the benchmark on localhost.

First, http://localhost didn't work; I had to add localhost to the server_name in the data/nginx/sites/ofm_le.conf file, alongside the domain I'm using.

Second, in the wrk_custom_list.lua file I changed the following to point at localhost:
local url_base = "http://localhost/planet/20241022_231001_pt/"

And to speed things up, I moved the url_base concatenation out of the first loop and into the return of the getNextUrl() function: return url_base .. url_path

Results of running wrk -c10 -t4 -d60s -s /data/ofm/benchmark/wrk_custom_list.lua http://localhost:

Running 1m test @ http://localhost
  4 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.35ms    7.08ms 249.75ms   90.12%
    Req/Sec   309.22    468.73     5.95k    94.74%
  74986 requests in 1.00m, 66.25MB read
Requests/sec:   1247.98
Transfer/sec:      1.10MB

It should be tested with a real-world sample if you can provide one, as I don't yet have server logs from real usage.

hyperknot commented 4 weeks ago

Yes, 1200 req/s transferring only about 1 MB/s is mostly just ocean tiles, so it's probably not relevant to the real world. Once there is some realistic load on OpenFreeMap it'll be possible to have anonymised logs, but currently that's not possible while preserving users' privacy.
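The "mostly ocean tiles" reading can be checked with a quick calculation on the wrk totals from the localhost run. The sketch below assumes wrk's "MB" means 1024 * 1024 bytes; the helper name average_response_bytes is made up for illustration.

```python
def average_response_bytes(total_requests, megabytes_read):
    """Average bytes per response from wrk's summary line.

    Assumption: wrk reports binary megabytes (1 MB = 1024 * 1024 bytes).
    """
    return megabytes_read * 1024 * 1024 / total_requests


# Totals from the localhost run: 74986 requests, 66.25MB read.
size = average_response_bytes(74986, 66.25)
print(f"average response size: {size:.0f} bytes")
```

That works out to a bit under 1 KB per response, headers included, which fits tiles that are nearly empty.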

Your best option would be to just scroll around the map and record the tiles.
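One way to record the tiles might be to scroll around the map and then pull the requested tile paths out of the nginx access log. The sketch below assumes access logging is enabled and uses nginx's default "combined" format, where the request appears as "GET <path> HTTP/1.1"; neither is confirmed for an OpenFreeMap install, and the function name tile_paths_from_log is hypothetical. Only the path is kept, so no client addresses end up in the list.

```python
import re

# Matches the quoted request of an nginx "combined" log line and captures
# the path of any request ending in /z/x/y.pbf (a query string is dropped).
TILE_REQUEST = re.compile(r'"GET (\S*?/\d+/\d+/\d+\.pbf)(?:\?\S*)? HTTP/[\d.]+"')


def tile_paths_from_log(lines):
    """Return the tile paths requested in the given access-log lines, in order."""
    paths = []
    for line in lines:
        match = TILE_REQUEST.search(line)
        if match:
            paths.append(match.group(1))
    return paths
```

Feeding it the lines of an access log file (for example via open(path) as the iterable) gives a path list in the order the tiles were actually requested, duplicates included, which is closer to real usage than random paths.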