owntracks / talk

Questions, talk about OwnTracks
Enabling reverse geocoding on existing install #181

Closed sehe closed 3 months ago

sehe commented 3 months ago

Hello, I've followed a mix of the docker examples and the quicksetup description to set up ot-recorder. It works (using MQTT, but that's not important here).

My question is about reverse geocoding. I only noticed after installation that I'd really like that, so I've added --geokey "opencage:my-open-cage-api-key" and also set the same value in the environment variable OTR_GEOKEY.

After restarting, I cannot see a difference. In the web front-end or when looking in /store/rec/username/devicename/2024-04.rec I still don't see addr being added.

Is there anything else I need to do? I have verified with a command-line query that OpenCage is accessible from within the container with my API key.

I'm running docker image owntracks/recorder:latest (9c0e44066875, 8 weeks ago).

jpmens commented 3 months ago

After enabling reverse Geo lookups, the functionality will only work for newly submitted positions, not for existing data in the rec files.

sehe commented 3 months ago

I should have been clearer. I did test with new locations (by manually pushing updates). No addresses show up.

jpmens commented 3 months ago

I should also have been clearer, sorry; that's what happens when I tap on a phone just after waking up. ;)

The issue you're likely seeing is you're publishing a location which has already been negatively cached, i.e. yesterday at that location you had no geolocation configured so we saved "None". Today you're getting "None" for that location. ("None" is just a placeholder here.)

If possible, stop the Recorder, delete the SPOOLDIR/ghash/* files, start with ot-recorder --initialize, and then start the Recorder again.

(these steps will NOT delete your past locations, but will remove the cache for geodata and friends [if you have them in HTTP mode]).

Publish a location and you ought to see the reverse geo data in the store, i.e. in the maps.
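
For a dockerized Recorder, the steps above might be sketched roughly as follows. This is only a sketch under assumptions that are not stated in the thread: the store is a named volume (hypothetically `recorder_store`) mounted at `/store`, `ot-recorder` is on the image's PATH, and the image's environment points the Recorder at `/store`. The container name is the one used elsewhere in this thread.

```shell
# Sketch: clear the reverse-geo cache of a dockerized Recorder.
# Assumptions: container name, volume name (hypothetical), and that the
# store is mounted at /store inside the container.
CONTAINER="${CONTAINER:-owntracks-recorder}"
VOLUME="${VOLUME:-recorder_store}"      # hypothetical volume name
IMAGE="${IMAGE:-owntracks/recorder:latest}"

reset_geo_cache() {
    # 1. Stop the Recorder so nothing has the lmdb database open.
    docker stop "$CONTAINER" || return 1

    # 2. Delete the ghash cache files and re-initialize from a throwaway
    #    container that mounts the same store (OTR_PORT=0 disables MQTT).
    docker run --rm --entrypoint /bin/sh -e OTR_PORT=0 \
        -v "$VOLUME:/store" "$IMAGE" \
        -c 'rm -f /store/ghash/* && ot-recorder --initialize' || return 1

    # 3. Start the Recorder again.
    docker start "$CONTAINER"
}
```

Calling `reset_geo_cache` leaves the rec files untouched; only the files under `ghash/` are removed.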

sehe commented 3 months ago

I figured it would be some stateful interaction like that, I'm glad I asked instead of nuking my data.

Indeed I get some addresses resolved now. Am I right in thinking that the only way to get the correlations is by querying, e.g. using ocat?

E.g. I did

docker exec owntracks-recorder ocat --user sehe --device starlte --from "2024-01-01 00:00" | jq '.locations[].addr' | sort | uniq -c
     90 "Street Sanitized 107, 1111 HN Sanitized, Netherlands"
    157 null

I assume I have to do some tuning to get addresses to resolve for my work locations which feature in the 147 records, but I reckon I have all the information I need to get that to work, thanks!

jpmens commented 3 months ago

Glad you got it working!

If you can visit your work locations, the issue will fix itself. :-) Otherwise, if you're willing to fiddle and hack, what you can do is the following:

  1. obtain lat, lon from the existing records in the STORAGE/rec/user/device/YYYY-MM.rec files, and calculate a geohash of precision 7 (our docker images are configured with GHASHPREC = 7). If you use Python there are several modules which'll do that for you.
  2. run ocat --dump to see how we cache the data in the lmdb database
  3. Perform queries to the OpenCage service for your lat/lons, and produce JSON as in step 2
  4. when you're ready, load the full dataset with yourscript | ocat --load. The key is the geohash, and the content is the JSON (no newlines, please). load expects <key><space><content>.
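
For step 1, if no geohash module is at hand, the encoding can also be sketched in plain shell with awk. This is the standard geohash bit-interleaving scheme rather than code taken from the Recorder; the function name `geohash` is made up for this example, and the default precision of 7 follows the GHASHPREC value mentioned above.

```shell
# Sketch: encode latitude/longitude into a geohash of a given precision.
# Usage: geohash LAT LON [PRECISION]
geohash() {
    awk -v lat="$1" -v lon="$2" -v prec="${3:-7}" 'BEGIN {
        base32 = "0123456789bcdefghjkmnpqrstuvwxyz"
        latlo = -90;  lathi = 90
        lonlo = -180; lonhi = 180
        even = 1      # bits alternate, starting with longitude
        bits = 0; ch = 0; hash = ""
        while (length(hash) < prec) {
            if (even) {
                mid = (lonlo + lonhi) / 2
                if (lon >= mid) { ch = ch * 2 + 1; lonlo = mid }
                else            { ch = ch * 2;     lonhi = mid }
            } else {
                mid = (latlo + lathi) / 2
                if (lat >= mid) { ch = ch * 2 + 1; latlo = mid }
                else            { ch = ch * 2;     lathi = mid }
            }
            even = !even
            if (++bits == 5) {           # every 5 bits form one base32 character
                hash = hash substr(base32, ch + 1, 1)
                bits = 0; ch = 0
            }
        }
        print hash
    }'
}
```

Records returned by ocat already carry a `ghash` field, so computing it by hand is mainly useful for coordinates that come from elsewhere.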

Good thing it's a weekend, eh? ;-)

sehe commented 3 months ago

Always willing to fiddle and hack (if the result is me being in control of my data)

I extracted previously visited locations, grouping by ghash and picking the most prevalent coordinates where a ghash occurs with more than one coordinate pair:

docker exec owntracks-recorder ocat \
    --user sehe \
    --device starlte \
    --from "2024-01-01 00:00" \
    | jq -r '.locations[] | (.lat|tostring) + "\t" + (.lon|tostring) + "\t" + (.ghash) '  \
    | sort -k3 | uniq -c | sort -rn | sort -uk4,4 \
    > ranked.txt

(Using the CSV output format doesn't really make it much more elegant due to the default delimiter not being what sort/uniq expect/support)

ranked.txt now contains 59 lines like (picking one for privacy purposes)

  2 52.1348495      4.4533339       u170t6f

Then I do the opencage reverse geocoding calls:

 time tac ranked.txt |
     while read -r N lat lon key   # N is the count column from uniq -c (unused)
     do
         # Reverse-geocode one coordinate pair and reduce the reply to the cached fields
         json="$(curl -s "https://api.opencagedata.com/geocode/v1/json?key=myopencagekey&q=$lat%2C+$lon&pretty=1&no_annotations=1&no_record=1&limit=1" \
            | jq -c '{ cc: .results[0].components."ISO_3166-1_alpha-2", addr: .results[0].formatted, tst: .timestamp.created_unix, tzname: "GMT" }')"
         echo "$key $json"   # <key><space><content>, as --load expects
     done | tee toload.txt

Which results in 59 lines like this:

u170t6f {"cc":"NL","addr":"Mozartlaan 29, 2253 HW Voorschoten, Netherlands","tst":1712342751,"tzname":"GMT"}
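
Before loading, the file can be sanity-checked against the `<key><space><content>` shape described earlier. The following is a rough sketch only: `check_load_file` is a made-up helper name, the 7-character key length assumes GHASHPREC = 7, and the JSON test merely checks for a single-line object in braces rather than parsing it.

```shell
# Sketch: verify each line looks like "<7-char geohash><space><one-line JSON object>".
check_load_file() {
    awk '{
        key  = $1
        json = substr($0, length(key) + 2)   # everything after the first space
        ok = (length(key) == 7) &&
             (key ~ /^[0-9bcdefghjkmnpqrstuvwxyz]+$/) &&
             (json ~ /^[{].*[}]$/)
        if (!ok) { printf "line %d looks malformed: %s\n", NR, $0; bad = 1 }
    } END { exit bad + 0 }' "$1"
}
```

`check_load_file toload.txt` exits non-zero and prints the offending lines if anything does not match.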

None of my attempts to load these seem to have any effect:

docker exec owntracks-recorder ocat --load < toload.txt
docker exec owntracks-recorder ocat --load ghash < toload.txt

I've also edited toload.txt to include the surrounding context previously dumped e.g.

friends                                                
keys                                                
luadb                                                
topic2tid                                                
<!-- previously dumped lines included -->
u170t6f {"cc":"NL","addr":"Mozartlaan 29, 2253 HW Voorschoten, Netherlands","tst":1712342751,"tzname":"GMT"}
wp                                                

I made sure to check that the "hidden" whitespace and line-endings were preserved. I also made sure to sort the ghash lines by key under C locale-collation (:5,$-1!LANG=C sort in vim).

The only thing I can come up with is that --load silently fails because the database is in use. However, I have not been able to come up with an invocation that solves that problem. It doesn't seem possible to docker exec ... sh inside the container, or to start it with an alternative entry point, or maybe my docker fu is failing me here?

jpmens commented 3 months ago

Sorting of keys is unnecessary, as is preserving the previous context. I assume you're right in that the lmdb database is in use in the docker container. Sadly, I'm not a docker expert in any way. The only thing you might want to try is what I have from old notes:

docker run -it --rm --entrypoint /bin/sh -e OTR_PORT=0 owntracks/recorder:latest

IIRC --entrypoint was the way to override the configured entrypoint.
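
Combined with stopping the Recorder first, loading the prepared file through an overridden entrypoint might look like the sketch below. The assumptions here are not confirmed anywhere in this thread: the store is a named volume (hypothetically `recorder_store`) mounted at `/store`, and `ocat` in a fresh container finds the store the same way it does under `docker exec`. Note the `-i`: without it docker does not keep stdin open, so the redirected file would never reach `ocat`.

```shell
# Sketch: load a "<key> <json>" file into the ghash cache while the
# Recorder is stopped. Volume name is hypothetical; adjust to your setup.
CONTAINER="${CONTAINER:-owntracks-recorder}"
VOLUME="${VOLUME:-recorder_store}"      # hypothetical volume name
IMAGE="${IMAGE:-owntracks/recorder:latest}"

load_geo_cache() {
    # Stop the running Recorder so the lmdb database is not in use.
    docker stop "$CONTAINER" || return 1

    # One-off container on the same store; -i forwards the file on stdin.
    docker run -i --rm --entrypoint /bin/sh -e OTR_PORT=0 \
        -v "$VOLUME:/store" "$IMAGE" \
        -c 'ocat --load' < "$1" || return 1

    # Bring the Recorder back up.
    docker start "$CONTAINER"
}
```

It would be called as `load_geo_cache toload.txt`.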

sehe commented 3 months ago

That fixed it! I was actually confusing myself because my docker run command line had a stray -d (short for --detach), which I don't normally use, so I missed it :)

Thanks for the help!