longears closed 11 years ago
EXIF should be handled carefully since it can contain location and time information that users may not realize is saved in their pictures, so there are potentially severe privacy implications. http://www.androidcentral.com/geotagging
I know Tumblr filters out EXIF time and geotagging data (though it leaves some camera-geek details like exposure time, ISO, and aperture intact). Other social networks are probably doing likewise.
Why encode the EXIF as JSON at all? It seems to me that date taken and location are the most useful/popular fields. Those, with user permission, could be included in the JSON (which would allow them to be added manually if they aren't in the EXIF, for example in the case of a point and shoot camera without GPS). The rest could be left in the EXIF for clients to use if they want.
Your average user does not understand EXIF, but they do understand "Do you want to share where these photos were taken?" Let's just make sure we ask and follow through.
@mwanji For people who own nice cameras like digital SLRs, stuff like aperture and ISO is very much of interest. Tumblr, which is very visual-focused, has a slightly hidden feature that shows you that stuff in their web app if it was present in the EXIF.
My main point here though is that photo-uploading clients should actively strip the GPS location, and possibly time taken, info from EXIF unless the user has explicitly asked to leave it in. Otherwise, it's an unexpected reveal that can get people in trouble.
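One way an uploading client could apply this default is to shell out to exiftool before the upload. The sketch below only builds the command line; the function name and the two opt-in flags are hypothetical, and it assumes exiftool is installed on the client.

```python
def strip_command(path, keep_location=False, keep_time=False):
    """Build an exiftool command that removes sensitive tags from one file.

    Hypothetical helper: the keep_* flags stand in for whatever the
    client's "share location / share time" prompts return.
    """
    args = ["exiftool"]
    if not keep_location:
        args.append("-gps:all=")           # delete every tag in the GPS group
    if not keep_time:
        args.append("-DateTimeOriginal=")  # delete the capture time
        args.append("-CreateDate=")        # delete the creation time
    if len(args) == 1:
        return None                        # user opted in to everything
    # Rewrite the file in place instead of leaving a *_original backup copy.
    args += ["-overwrite_original", path]
    return args
```

A client would pass the returned list to something like subprocess.run(cmd, check=True) on a temporary copy of the photo, so the user's own original keeps its metadata.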
But this is also worth thinking about from a spec point of view. What if the user wants to archive that location info for their own use, while not showing it to anyone else?
That could be achieved by stripping the geotag from the EXIF, but adding it as a "location" in the JSON metadata, and then adding fine-grained permissions for that location data, separate from the permissions on the photo.
Then the same solution could be used for other types of posts that have geolocation info — for example, a status post made from a smartphone.
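As a rough sketch of that idea, the geotag could be lifted out of the parsed EXIF and attached to the post with its own permissions. Everything here is an assumption for illustration: the GPSLatitude/GPSLongitude keys (decimal degrees), the shape of the "permissions" object, and the function name are not taken from any Tent schema.

```python
def promote_location(exif, post_public, location_viewers):
    """Split a geotag out of parsed EXIF and into post-level metadata.

    Returns (remaining_exif, post). All field names are hypothetical.
    """
    # Drop every GPS tag from the metadata that stays with the image.
    remaining = {k: v for k, v in exif.items() if not k.startswith("GPS")}
    post = {"permissions": {"public": post_public}}
    lat, lon = exif.get("GPSLatitude"), exif.get("GPSLongitude")
    if lat is not None and lon is not None:
        post["location"] = {
            # GeoJSON orders coordinates as [longitude, latitude].
            "point": {"type": "Point", "coordinates": [lon, lat]},
            # The location carries its own, stricter permissions.
            "permissions": {"public": False, "entities": list(location_viewers)},
        }
    return remaining, post
```

Because the location lives on the post rather than in the image, the same structure would work for a status post that has no image at all.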
Maybe this should be taken to the dev list (tent.dev@librelist.com).
@graue I think we're in agreement: put commonly used properties in JSON and leave the rest in EXIF. The fine-grained permission for sensitive data seems like a good idea.
Here's a proposal which I also cross-posted to the dev list (tent.dev@librelist.com).
Location and time are good EXIF properties to promote to JSON. Keywords are a runner-up; these can be mapped to Tent tags instead of adding a new field.
There's also a new source field to be used for content attribution. Tumblr added this a while ago and it greatly reduced the problem of images being widely reblogged without attribution or links to their authors.
Proposed changes to Photo type
No change: caption, albums, tags
Remove fields: exif
Add fields:
- source: Optional. A URL pointing to the author of the image for attribution purposes. This could be a Tent entity or an external website. A browser extension which reposts images from the internet should set this to the URL of the page that was displaying the image. Apps displaying this post are strongly encouraged to use this field if it's present.
- location: Optional. A GeoJSON point. Displaying apps should either ignore location data embedded in image metadata or use it as a backup when this Tent location field is missing.
- captured_at: Optional. Unix time. The recommended order of fields to try, best first, and their timezone details:
  1. XMP:DateTimeOriginal: includes a timezone
  2. XMP:CreateDate: includes a timezone
  3. IPTC:DateCreated + IPTC:TimeCreated: includes a timezone
  4. EXIF:DateTimeOriginal: in localtime with no timezone specified, so use the user's current timezone.
  5. EXIF:CreateDate: in localtime with no timezone specified, so use the user's current timezone.
  If the file doesn't have an embedded capture time, omit the captured_at field. Do not use the file's last-modified timestamp.
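The lookup order for captured_at could be implemented roughly as follows. This is a sketch under stated assumptions: the metadata has already been parsed into a dict keyed by the names above, XMP values are ISO 8601 strings with a UTC offset, IPTC values use the IIM layout (CCYYMMDD and HHMMSS±HHMM), and EXIF values use "YYYY:MM:DD HH:MM:SS". The function name and the user_tz parameter are hypothetical.

```python
from datetime import datetime


def captured_at(meta, user_tz):
    """Return the capture time as Unix time, or None if the file has none."""
    # 1-2. XMP fields normally carry their own timezone offset.
    for key in ("XMP:DateTimeOriginal", "XMP:CreateDate"):
        if key in meta:
            dt = datetime.fromisoformat(meta[key])
            if dt.tzinfo is None:          # defensive: fall back to the user's zone
                dt = dt.replace(tzinfo=user_tz)
            return int(dt.timestamp())
    # 3. IPTC splits date and time; the time part includes the offset.
    if "IPTC:DateCreated" in meta and "IPTC:TimeCreated" in meta:
        raw = meta["IPTC:DateCreated"] + meta["IPTC:TimeCreated"]
        return int(datetime.strptime(raw, "%Y%m%d%H%M%S%z").timestamp())
    # 4-5. EXIF is local time with no zone, so assume the user's current one.
    for key in ("EXIF:DateTimeOriginal", "EXIF:CreateDate"):
        if key in meta:
            naive = datetime.strptime(meta[key], "%Y:%m:%d %H:%M:%S")
            return int(naive.replace(tzinfo=user_tz).timestamp())
    # No embedded capture time: the caller should omit captured_at entirely.
    return None
```

Returning None rather than falling back to the file's modification time keeps the "omit the field" rule explicit at the call site.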
Privacy best practices for photo uploading apps:
tags field. The default choice should be to strip the keywords from the image file. Keywords are used in programs like Lightroom for recording the names of people and places in the photo, so they can be a privacy risk.
Resources:
This proposal is excellent. You hit all the important points regarding metadata preservation, privacy and content attribution. I'm especially happy to see details like, in the "location" field, "Displaying apps should either ignore location data embedded in image metadata or use it as a backup when this Tent location field is missing." Also, great idea to have a source URL for attribution.
The one thing overlooked here is fine-grained permission for metadata. I might want a photo to be public, but the location to only display to me and a small group of friends. And one use case I think the Tent devs have in mind is a backup/archiving system, in which case I want all my EXIF tags preserved, but automatically stripped out when serving the image to the public or to another user. It would be nice to accomplish that without having to upload two copies of the image (EXIF and no-EXIF).
However, your spec looks plenty good enough for a start and we could worry about fine-grained permissions later. Daniel, Jonathan, Jesse, et al., thoughts?
This looks like a great start.
Does someone want to make a pull request to tent-schemas with the updated info?
More work on the photo type has been done in #183, but we still need to define the exif data formats.
EXIF is a binary format and I can't find a standard way to encode it as JSON.
An easy reference point could be to use the output of exiftool myfile.jpg -json as the standard format. This would have the side benefit of allowing XMP and IPTC information as well.
More info: