Open thielepaul opened 4 years ago
As hosting OSM data is not a viable option for most users, lightweight alternatives are of interest:
I had a look at many potential solutions and found none of them feasible regarding required compute resources and data quality. If you only want country and region it gets easier and a local reverse lookup should be possible, but then you still want maps to display the location.
The experimental service we're currently providing does not log anything, and I think it's already a major step forward not to put your photos on other people's servers (cloud). Coordinates don't tell you who was there and when, so it's way less of an issue in practice. An important question is how much users are willing to pay (in hardware & admin time) to avoid that completely.
Would be great if you can have a second look and share your findings with the community!
Using a VPN / Proxy might also solve privacy concerns for those that don't enjoy managing a big stack of local services and keep them up-to-date / secure (edit: although you then need to trust your VPN provider... so it's not really a solution).
One last (long) remark regarding this issue today:
We had a look at other home solutions for personal photo management like Monument and they use Google Maps, although they promise "complete privacy". What mapping solution we're going to ship in the end is unclear, I'm still doing technical evaluation. In general, OSM seems best for reverse lookups, much better than Google.
We stand ready to provide a service that at least doesn't store logs (there are some in memory for debugging during development, yes) and that also gives you a list of previous public events at the given location so that you can automatically create albums of music festivals etc. That is real value and you don't get that otherwise.
Adding a flag to disable reverse lookups is simple. I would still tag my photos though. These reverse lookups are really the last thing I'm worried about given the fact that I'm a Google customer and use Android as well as iOS with Google Maps and a ton of other apps (like the Huawei weather app!) that use my location.
To just display a world map and show where you have been, it's probably easiest to download a static GeoJSON map from https://geojson-maps.ash.ms/ and use Leaflet. They even provide example code to copy & paste.
I've just added a config parameter for you:
cli.StringFlag{
Name: "geocoding-api, g",
Usage: "geocoding api (none, osm or places)",
Value: "places",
EnvVar: "PHOTOPRISM_GEOCODING_API",
},
Note that it's largely untested as I've worked enough for today. Maybe you can test it and give feedback. Thank you!
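To illustrate how a flag like this could be consumed, here is a minimal sketch in Go. It is not PhotoPrism's actual code: the Geocoder interface, the namedGeocoder type and selectGeocoder are hypothetical names made up for this example. Only the environment variable name, the three accepted values and the default come from the flag definition above.

```go
package main

import (
	"fmt"
	"os"
)

// Geocoder is a hypothetical interface used only for this sketch.
type Geocoder interface {
	Name() string
}

// namedGeocoder is a placeholder backend that only remembers its name.
type namedGeocoder string

func (g namedGeocoder) Name() string { return string(g) }

// selectGeocoder maps the flag value to a backend. A nil Geocoder with a
// nil error means reverse lookups are disabled, so no request is sent.
func selectGeocoder(api string) (Geocoder, error) {
	switch api {
	case "none":
		return nil, nil
	case "osm", "places":
		return namedGeocoder(api), nil
	default:
		return nil, fmt.Errorf("unknown geocoding api %q", api)
	}
}

func main() {
	api := os.Getenv("PHOTOPRISM_GEOCODING_API")
	if api == "" {
		api = "places" // default value taken from the flag definition
	}
	g, err := selectGeocoder(api)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if g == nil {
		fmt.Println("reverse geocoding disabled")
		return
	}
	fmt.Println("using geocoder:", g.Name())
}
```

Rejecting unknown values up front, instead of silently falling back to a remote service, seems like the safer choice for privacy-minded users.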
Thank you for the detailed answer! Regarding self-hostable alternatives, I will hopefully have time in February.
The config parameter works in principle (tested with Docker and Wireshark):
-e PHOTOPRISM_GEOCODING_API=places (photoprism server is used)
-e PHOTOPRISM_GEOCODING_API=osm (osm server is used)
-e PHOTOPRISM_GEOCODING_API=none (no requests to external servers are sent)
However, when I set it to none, the import of a photo (with location metadata) completes with a warning, but the photo is then not visible in the photos view. (Photos without location metadata work fine.)
The same behavior is observed if the option is set to places but requests to external servers are blocked by a firewall, so maybe this is a problem with error handling in the import process and thus a separate issue.
@thielepaul Thank you for testing this so quickly! I'll take a look at error handling. There are some errors we can ignore when indexing, like this one. Others are fatal, like when we can't read the file at all.
Fixed and refactored, you can test again 👍
:+1: As far as I tested it, this works as intended now. Thank you!
Just launched the next version of our Places API, see Geocoding for details. It approximates locations using S2 cell IDs and only contacts external services if the internal database fails.
My impression is that this should be good enough in terms of performance and privacy for most users, but I'd love to hear your opinion as you seem to be especially concerned :)
I agree that this is a good solution for most users, thank you!
Anyway, I would like to leave this issue open until I have had time to look further into options for offline reverse geocoding. In particular, I want to explore whether the same methods that are used for finding the timezone based on the location (https://github.com/evansiroky/timezone-boundary-builder#lookup-libraries) can also be used to get country, state and bigger cities, as this level of accuracy would be sufficient for me (and maybe others).
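Timezone lookup libraries of that kind generally test a coordinate against boundary polygons, and the same idea carries over to countries or states. The Go sketch below uses the classic ray-casting point-in-polygon test. The two square regions are made-up toy data, not real borders, and the type and function names are hypothetical. A real implementation would load boundary polygons from a dataset and would likely add a spatial index.

```go
package main

import "fmt"

// Point is a coordinate in degrees.
type Point struct{ Lat, Lng float64 }

// Region is a named area described by a single outer ring.
type Region struct {
	Name string
	Ring []Point
}

// contains reports whether p lies inside ring, using ray casting:
// count how often a ray from p crosses the polygon's edges.
func contains(ring []Point, p Point) bool {
	inside := false
	for i, j := 0, len(ring)-1; i < len(ring); j, i = i, i+1 {
		a, b := ring[i], ring[j]
		if (a.Lat > p.Lat) != (b.Lat > p.Lat) &&
			p.Lng < (b.Lng-a.Lng)*(p.Lat-a.Lat)/(b.Lat-a.Lat)+a.Lng {
			inside = !inside
		}
	}
	return inside
}

// lookupRegion returns the first region containing p, or "" if none does.
func lookupRegion(regions []Region, p Point) string {
	for _, r := range regions {
		if contains(r.Ring, p) {
			return r.Name
		}
	}
	return ""
}

// toyRegions holds invented squares for demonstration, not real boundaries.
var toyRegions = []Region{
	{Name: "Region A", Ring: []Point{{0, 0}, {0, 10}, {10, 10}, {10, 0}}},
	{Name: "Region B", Ring: []Point{{0, 10}, {0, 20}, {10, 20}, {10, 10}}},
}

func main() {
	fmt.Println(lookupRegion(toyRegions, Point{Lat: 5, Lng: 5}))
	fmt.Println(lookupRegion(toyRegions, Point{Lat: 5, Lng: 15}))
	fmt.Println(lookupRegion(toyRegions, Point{Lat: 50, Lng: 50}) == "")
}
```

Because everything runs locally, no coordinates leave the machine. The trade-off is accuracy near borders and the size of the boundary dataset.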
Sure, we also do this to learn :)
Dustin did something like that and our solution was inspired by it, see https://github.com/photoprism/photoprism/issues/21#issuecomment-568562593
Repositories:
For tiles, you could use https://github.com/maptiler/tileserver-gl and download the tiles from https://openmaptiles.com/downloads/planet/ (~70 GB).
Note that those free tiles are from 2017, the latest version is $1024 for one year plus you can choose from several styles. So we've chosen to pay for hosting as that's much cheaper than paying for tiles and servers.
I also would love to use photoprism without exposing location data to a 3rd party.
I totally get that it is low risk, and for most people the alternative is Google/iCloud anyway, and I understand that you don't log, although that requires a lot of trust in you and your hosting provider.
But I can think of examples where the reverse geocoding could be a privacy leak. For example, over time you can build a list of frequent locations visited by an IP address, based on where they frequently take photos, and if the user has automatic sync set up you can also build a travel history and possibly distinguish between home and work.
I see the downloading of map tiles as less of a risk, as you don't know where in the map tile the user is looking, and it's not automatic data that is going to build up a statistical pattern, but opportunistic caching and random downloading of adjacent tiles like Apple Maps does for privacy would help.
Would it be possible to support the OSM Nominatim API? Then we could use our own self-hosted geocoder, or at least have a choice of which service we use.
https://geocoder.readthedocs.io/providers/OpenStreetMap.html
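For reference, a request to a self-hosted Nominatim instance could be built roughly like this in Go. This is only a sketch: the base URL http://localhost:8080 is an assumed address for a local instance, and reverseURL is a hypothetical helper. The /reverse endpoint with the lat, lon and format parameters is part of the public Nominatim API. The example only builds and prints the URL and does not send a request.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"strings"
)

// reverseURL builds a Nominatim reverse-geocoding URL for the given
// base address, which may point to a self-hosted instance.
func reverseURL(base string, lat, lng float64) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = strings.TrimRight(u.Path, "/") + "/reverse"
	q := url.Values{}
	q.Set("format", "jsonv2")
	q.Set("lat", strconv.FormatFloat(lat, 'f', 6, 64))
	q.Set("lon", strconv.FormatFloat(lng, 'f', 6, 64))
	u.RawQuery = q.Encode() // keys are emitted in sorted order
	return u.String(), nil
}

func main() {
	// Assumed address of a local Nominatim instance.
	const base = "http://localhost:8080"
	s, err := reverseURL(base, 52.52, 13.405)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```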
Our backend doesn't know when pictures were taken and only works with S2 cell IDs, not the exact coordinates. Also results will be cached locally, so there are no additional requests when other photos were taken at a similar position. Should be safe enough for 99% of users, especially when you're using any other geodata-enabled site / app like Google Maps, Facebook or even Twitter. Much easier to create a profile there than trying to gain information from our privacy-friendly API.
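The effect of working with cells and caching results locally can be sketched as follows. Note that this is not S2 and not the actual Places implementation: a simple fixed-size latitude/longitude grid stands in for S2 cell IDs, and cellKey, PlaceCache and the 0.01 degree step are assumptions made for the example. What it shows is that nearby photos map to the same key, so only the first lookup for a cell reaches the remote service, and the service only ever sees the coarse key.

```go
package main

import (
	"fmt"
	"math"
)

// cellKey maps a coordinate to a coarse grid cell. This grid is a simple
// stand-in for S2 cell IDs; the step size is an assumed value.
func cellKey(lat, lng float64) string {
	const step = 0.01 // degrees, roughly 1 km in latitude
	return fmt.Sprintf("%d:%d",
		int64(math.Floor(lat/step)), int64(math.Floor(lng/step)))
}

// PlaceCache remembers results per cell so repeated lookups stay local.
type PlaceCache struct {
	remote func(cell string) (string, error) // receives only the cell key
	places map[string]string
	Calls  int // number of remote requests made
}

// NewPlaceCache wraps a remote lookup function with a local cache.
func NewPlaceCache(remote func(string) (string, error)) *PlaceCache {
	return &PlaceCache{remote: remote, places: make(map[string]string)}
}

// Lookup returns the cached place for the cell containing the coordinate
// and falls back to the remote function on a cache miss.
func (c *PlaceCache) Lookup(lat, lng float64) (string, error) {
	key := cellKey(lat, lng)
	if place, ok := c.places[key]; ok {
		return place, nil
	}
	c.Calls++
	place, err := c.remote(key)
	if err != nil {
		return "", err
	}
	c.places[key] = place
	return place, nil
}

func main() {
	cache := NewPlaceCache(func(cell string) (string, error) {
		return "place for cell " + cell, nil // fake remote service
	})
	cache.Lookup(52.5201, 13.4051)
	cache.Lookup(52.5249, 13.4093) // same cell, served from the cache
	cache.Lookup(52.5301, 13.4051) // different cell, second remote call
	fmt.Println("remote calls:", cache.Calls)
}
```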
When it comes to supporting other APIs, we found that their metadata models & categorization varies. So it needs additional work. We don't have the resources to do this right now, or even document all the differences in detail so that users can make an informed choice and don't just file bug reports because it doesn't work as expected.
If there would be an easy way for self-hosting, we would have done this. Once we reach our funding goal and have more resources, we will continue to improve and add more options.
Note that there are offline maps already, of course with less detail. We don't own the satellite maps, so we can't give them away to host them locally - even if you had enough storage.
To disable geolocation features completely, set PHOTOPRISM_DISABLE_PLACES to "true".
Fair enough. I understand that it's not a priority, and will try to work around this by making my PhotoPrism instance access the APIs through a VPN. Thanks for responding; the differences in metadata are something that I'll think about. It's probably out of scope for PhotoPrism, and I wonder how other teams are working around it (if at all), for example the microG location providers.
@noaho
"over time you can build a list of frequent locations visited by an IP address, based on frequent locations that they take photos, and if the user has automatic sync setup you can also build a list of travel history and possibly make distinctions between home/work.."
"APIs through a VPN.."
With that method you can still be targeted via the same methods you listed above. Very easy to do so if you're just watching one person.
@lastzero Thanks for the privacy advances this last year. As a back-burner item, I think downloading the OpenMapTiles files and setting them up automatically would be a good idea if you want something like an "I want full privacy" check box. It's far more than the 70 GB you posted, though: more like 200 GB for what most would call basic and 300+ GB for everything lol
As a user I'd like to use PhotoPrism without cloud dependencies such that no private data is leaked.
Right now PhotoPrism depends on cloud services (openstreetmap.org) for displaying the map and for reverse geocoding. As a result, user data such as photo coordinates and IP addresses is leaked.
Possible steps to resolve this issue are: