ellenhp / airmail

Lightweight geocoder in pure Rust
https://airmail.rs/
Apache License 2.0

Cannot execute index step (BUILD.md) via Podman #21

Open cjacky475 opened 1 week ago

cjacky475 commented 1 week ago

Hello,

I was following BUILD.md and the last step before starting the service was to execute:

chcon -t container_file_t ./data/whosonfirst-data-admin-us-latest.spatial.db; podman run --security-opt label=disable --net=host -v /run/user/1000/podman/podman.sock:/run/podman/podman.sock:z -v ./data:/var/airmail/data:Z -v ./index:/var/airmail/index:Z --rm airmail_build airmail_import_osm --wof-db $PWD/data/whosonfirst-data-admin-us-latest.spatial.db --index /var/airmail/index --admin-cache /var/airmail/data/admin-cache --osmx /var/airmail/data/Seattle.osmx --docker-socket /run/podman/podman.sock --recreate

Since chcon is not available on Windows (10), I modified the command to:

podman run --security-opt label=disable --net=host -v /run/user/1000/podman/podman.sock:/run/podman/podman.sock:z -v ./data:/var/airmail/data:Z -v ./index:/var/airmail/index:Z --rm airmail_builder airmail_import_osm --wof-db data/whosonfirst-data-admin-an-latest.spatial.db --index /var/airmail/index --admin-cache /var/airmail/data/admin-cache --osmx /var/airmail/data/antarctica-latest.osmx --docker-socket /run/podman/podman.sock --recreate

I've built all the Dockerfiles successfully; the images and folders were generated and filled with the necessary data.

However, I get an error:

chcon: can't apply partial context to unlabeled file 'data/whosonfirst-data-admin-an-latest.spatial.db'
Stopping container `airmail-pip-service-0`
Creating container `airmail-pip-service-0`
thread 'main' panicked at /usr/src/airmail/airmail_indexer/src/lib.rs:285:18:
Failed to start spatial server container.: DockerResponseServerError { status_code: 404, message: "no such image: docker.io/library/spatial_custom: image not known" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

ellenhp commented 1 week ago

(replace all podman instances with docker if you're using docker instead)

Can you execute podman images on the host? If you ran the podman build -f spatial/Dockerfile -t spatial_custom step, you should have a spatial_custom image. If you don't have that listed, that's probably the issue. If it is listed, you're probably not forwarding the podman socket to the container successfully.

The /run/user/1000/podman/podman.sock path in this command is what I use for userspace podman on my Fedora box, but you might need to change it for WSL. You should be able to ls -l /run/user/1000/podman/podman.sock and see the socket. If that doesn't show up, see if you can find out where your docker/podman socket is then use that instead of the location I provided. If podman images shows a spatial_custom image and the ls command works, this is definitely a bug on my end.

What I'm doing with that command is very rudimentary docker-outside-of-docker, also known as docker-from-docker. If you can find a tutorial for doing that in WSL, it might help you modify the command enough to get it working, but I don't have a Windows box, so I don't know if I can be much more help than this.
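The two checks suggested in this comment (is the image listed, and is the socket where the command expects it) could be sketched as a small shell helper. This is only an illustration: first_socket is a hypothetical helper name, not part of airmail, podman, or docker, and the candidate paths in the usage lines are assumptions that will differ between systems.

```shell
#!/bin/sh
# first_socket: print the first argument that is an existing Unix socket.
# Returns non-zero when none of the candidates is a socket.
# (Hypothetical helper name; not part of airmail, podman, or docker.)
first_socket() {
    for candidate in "$@"; do
        if [ -S "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage sketch: the candidate locations are assumptions; adjust for your system.
# sock=$(first_socket "/run/user/$(id -u)/podman/podman.sock" /var/run/docker.sock) \
#     || echo "no container socket found" >&2
# podman images | grep spatial_custom   # should list the image built earlier
```

Whatever path the helper prints is the one to put on the left-hand side of the socket -v mount.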

cjacky475 commented 1 week ago

> (replace all podman instances with docker if you're using docker instead)

This time I am using podman.

> Can you execute podman images on the host?

Executing podman images on the host lists all the images I built:

image

> You should be able to ls -l /run/user/1000/podman/podman.sock and see the socket. If that doesn't show up, see if you can find out where your docker/podman socket is then use that instead of the location I provided.

I've read that Windows uses named pipes for communication between the Podman client and service, so perhaps I don't need to specify anything? Running the command without the socket mount fails with this error:

Failed to start spatial server container.: HyperResponseError { err: hyper::Error(Connect, Os { code: 2, kind: NotFound, message: "No such file or directory" }) }

I have installed both Podman and Docker. I then tried running the same command with Docker, but I could not mount ./data, so I had to mount . (the current directory) instead. I also removed the /run/user/1000/podman/podman.sock mount, since it's not accessible on Windows. The modified command:

docker run --security-opt label=disable --net=host -v .:/var/airmail:Z -v .:/var/airmail:Z --rm airmail_builder airmail_import_osm --wof-db data/whosonfirst-data-admin-an-latest.spatial.db --index /var/airmail/index --admin-cache /var/airmail/data/admin-cache --osmx /var/airmail/data/antarctica-latest.osmx --recreate

But I still get the same error:

chcon: can't apply partial context to unlabeled file 'data/whosonfirst-data-admin-an-latest.spatial.db'
thread 'main' panicked at /usr/src/airmail/airmail_indexer/src/lib.rs:285:18:
Failed to start spatial server container.: HyperResponseError { err: hyper::Error(Connect, Os { code: 2, kind: NotFound, message: "No such file or directory" }) }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I am really not sure what else I can do.

cjacky475 commented 1 week ago

Let me add my final command and the stack trace in case that helps. I hope you can still provide some guidance on what I could do.

docker run --security-opt label=disable --net=host -v ${PWD}\data:/var/airmail/data -v ${PWD}\index:/var/airmail/index --rm -e RUST_BACKTRACE=1 airmail_builder airmail_import_osm --wof-db data/whosonfirst-data-admin-lt-latest.spatial.db --index /var/airmail/index --admin-cache /var/airmail/data/admin-cache --osmx /var/airmail/data/antarctica-latest.osmx --recreate
chcon: can't apply partial context to unlabeled file 'data/whosonfirst-data-admin-an-latest.spatial.db'
thread 'main' panicked at /usr/src/airmail/airmail_indexer/src/lib.rs:285:18:
Failed to start spatial server container.: HyperResponseError { err: hyper::Error(Connect, Os { code: 2, kind: NotFound, message: "No such file or directory" }) }
stack backtrace:
   0: rust_begin_unwind
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
   1: core::panicking::panic_fmt
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
   2: core::result::unwrap_failed
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/result.rs:1649:5
   3: core::result::Result<T,E>::expect
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/result.rs:1030:23
   4: airmail_indexer::ImporterBuilder::build::{{closure}}
             at /usr/src/airmail/airmail_indexer/src/lib.rs:283:13
   5: airmail_import_osm::main::{{closure}}
             at /usr/src/airmail/airmail_import_osm/src/main.rs:59:25
   6: tokio::runtime::park::CachedParkThread::block_on::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/park.rs:281:63
   7: tokio::runtime::coop::with_budget
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/coop.rs:107:5
   8: tokio::runtime::coop::budget
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/coop.rs:73:5
   9: tokio::runtime::park::CachedParkThread::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/park.rs:281:31
  10: tokio::runtime::context::blocking::BlockingRegionGuard::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/context/blocking.rs:66:9
  11: tokio::runtime::scheduler::multi_thread::MultiThread::block_on::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/multi_thread/mod.rs:87:13
  12: tokio::runtime::context::runtime::enter_runtime
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/context/runtime.rs:65:16
  13: tokio::runtime::scheduler::multi_thread::MultiThread::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/multi_thread/mod.rs:86:9
  14: tokio::runtime::runtime::Runtime::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/runtime.rs:349:50
  15: airmail_import_osm::main
             at /usr/src/airmail/airmail_import_osm/src/main.rs:77:5
  16: core::ops::function::FnOnce::call_once
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/ops/function.rs:250:5

cjacky475 commented 1 week ago

I've made some progress by adding: -v /var/run/docker.sock:/var/run/docker.sock

Now all I see in the log is:

Retrying to populate admin areas.
Failed to populate admin areas. error sending request for url (http://localhost:3102/query/pip?lon=-66.99934646343361&lat=-68.18327265248827): error trying to connect: tcp connect error: Connection refused (os error 111)        
Failed to populate admin areas after 5 attempts. Skipping POI.

But nothing is listening on port 3102. I can see that the airmail_builder container (which spams the log) and the spatial_custom container have both started. The spatial_custom logs:

info: [geometry] load: /mnt/whosonfirst/whosonfirst-spatialite.db
info: [geometry] load: /mnt/whosonfirst/whosonfirst-spatialite.db
info: [geometry] load: /mnt/whosonfirst/whosonfirst-spatialite.db
info: [geometry] load: /mnt/whosonfirst/whosonfirst-spatialite.db
info: [geometry] load: /mnt/whosonfirst/whosonfirst-spatialite.db
[master] using 4 cpus
[master] worker forked 21
[master] worker forked 26
[master] worker forked 28
[master] worker forked 35
[worker 28] listening on 0.0.0.0:3000
[worker 21] listening on 0.0.0.0:3000
[worker 35] listening on 0.0.0.0:3000
[worker 26] listening on 0.0.0.0:3000

ellenhp commented 1 week ago

I don't know what's going on. :/ There should be a port-forward from 3102 to 3000. If you can't hit it from the host, that means the port forward isn't working.

https://github.com/ellenhp/airmail/blob/707880d7f0e19b3c9b43fab53245ea82ff0d0fb3/airmail_indexer/src/lib.rs#L144
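One way to test the forward from the host is to send the same kind of request the importer sends. The sketch below builds the query URL in the form that appears in the logs earlier in this thread; pip_url is a hypothetical helper name, and the coordinates in the usage lines are arbitrary.

```shell
#!/bin/sh
# pip_url: build the point-in-polygon query URL seen in the importer logs.
# Host and port default to localhost:3102, the host side of the forward.
# (Hypothetical helper name; not part of airmail.)
pip_url() {
    lon=$1
    lat=$2
    host=${3:-localhost}
    port=${4:-3102}
    printf 'http://%s:%s/query/pip?lon=%s&lat=%s\n' "$host" "$port" "$lon" "$lat"
}

# Usage sketch (requires curl; coordinates are arbitrary):
# curl -sS -o /dev/null -w '%{http_code}\n' "$(pip_url -66.999 -68.183)"
# podman port airmail-pip-service-0   # lists published ports, if any
```

If curl reports "Connection refused" on 3102 while the workers log that they are listening on 3000, the mapping between the two ports is the part to investigate.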

Can you try the Option 1 steps? Docker-from-docker is a weird edge case already, and on WSL I have no idea how to debug it. You should be able to get cargo with rustup and run the spatial_custom container while staying on the WSL happy path. I guess when I wrote that Option 2 would work better on weird platforms, I was thinking of Linux systems where some of the dependencies may be hard to get, or where the distro package manager gives you old versions or something. If you can get the dependencies, it's always going to be easier to just build the index on the host.

cjacky475 commented 1 week ago

I installed Debian on my other laptop (I could only find a 2 GB USB flash drive, and the Fedora image is a bit larger) and followed all the Option 1 steps. However, the last command, the cargo one for indexing, keeps reporting that chcon cannot change the context to system_u:object_r:container_file_t:s0. It also could not start the spatial server container, failing with: container create: creating named volume <...> setting selinux label for <...> to system_u:object_r:container_file_t:s0: invalid argument.

I've read that the container_file_t context is only available on Fedora and Red Hat. So is my last option to install Fedora (as you have) and try these steps again? I am not sure I understand what the container_file_t context is or why it's only available on Fedora.
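For the manual chcon step only, one possible workaround is to relabel the file only when SELinux is actually active on the host. The sketch below assumes the getenforce utility, which prints Enforcing, Permissive, or Disabled; needs_selinux_label is a hypothetical helper name. It does not address the volume labeling that happens when the spatial server container is created, which is outside the scope of a shell wrapper.

```shell
#!/bin/sh
# needs_selinux_label: succeed when the given getenforce output indicates
# that SELinux is active (Enforcing or Permissive).
# (Hypothetical helper name; not part of airmail or SELinux tooling.)
needs_selinux_label() {
    case "$1" in
        Enforcing|Permissive) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage sketch: relabel only when SELinux is present and active.
# A missing getenforce binary is treated as "Disabled" (an assumption).
# mode=$(getenforce 2>/dev/null || echo Disabled)
# if needs_selinux_label "$mode"; then
#     chcon -t container_file_t ./data/whosonfirst-data-admin-us-latest.spatial.db
# fi
```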

P.S. As I understand it, this will create index files under the index directory, and later I'll be able to run airmail-service anywhere (not only on Fedora) by providing the index folder? Also, does this allow reverse geocoding, not only geocoding?

Thank you for the help. I hope I can run this.

cjacky475 commented 1 week ago

I've installed Fedora and followed all the steps again up to indexing via the cargo command. It runs for some time, until exactly 5,510,000 POIs have been parsed: 5510000 POIs parsed in 2676 seconds, 2058 per second.

Then the logs start to show: Failed to populate admin areas. error sending request for url (http://localhost:3102/query/pip?lon=<...>&lat=<...>): connection closed before message completed

Sometimes the error is not "connection closed before message completed" but "connection reset by peer (os error 104)".

I've tried several times with the same results. I am using south-america-latest.osm.pbf and whosonfirst-data-admin-cl-latest.spatial.db (Chile). Is something wrong with the files, or what's happening?

P.S. Opening a URL that the logs report as failed works fine in the browser: a 200 response with empty JSON.

ellenhp commented 3 days ago

I'd expect empty JSON for lat/lng pairs outside of Chile if that's the WOF extract you're using. These warnings are also normal. It's hitting the spatial server with thousands of requests per second, so a few dropped queries here and there are expected. If I remember right, I retry on failure, so I think this should be fine. Have you tried running it to completion and looking in the index directory? Or does it crash?
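For readers unfamiliar with the pattern, retry-on-failure with a fixed number of attempts (the earlier logs mention giving up "after 5 attempts") can be illustrated in shell as below. This is not the importer's Rust code, only a sketch of the general idea; retry is a hypothetical helper name.

```shell
#!/bin/sh
# retry: run a command up to N times, stopping at the first success.
# Returns 0 on success, 1 once every attempt has failed.
# (Hypothetical helper; an illustration, not airmail's implementation.)
retry() {
    attempts=$1
    shift
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        n=$((n + 1))
    done
    return 1
}

# Usage sketch (requires curl; the URL follows the form seen in the logs):
# retry 5 curl -fsS 'http://localhost:3102/query/pip?lon=-66.999&lat=-68.183' \
#     || echo "giving up on this point" >&2
```

With this shape, an occasional dropped connection costs only one extra request, and a point is skipped only when every attempt fails.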

cjacky475 commented 3 hours ago

It does not crash; it just prints those messages every few seconds. It ran on my laptop for almost 2 days without finishing, and I'm not sure how much remains or what's happening. I can't imagine how long it would take to index the whole world. Maybe I need to build this on a VPS, because it's taking forever on my PC. I will see what I can do, if anything.