darktable-org / darktable

darktable is an open source photography workflow application and raw developer
https://www.darktable.org
GNU General Public License v3.0

Export - XMP data not removed - mwg-rs:Regions #17355

Open oj-random opened 3 weeks ago

oj-random commented 3 weeks ago

Describe the bug

A picture contains information about faces as:

Please see attached Brigitte_Bardot_in_Rome_1969.jpg.

After exporting the picture, the mwg-rs:Regions tags are still present in the file although no metadata should have been exported. Please see the attached files.

Steps to reproduce

  1. Import Brigitte_Bardot_in_Rome_1969.jpg into Darktable
  2. Open the export settings
  3. Uncheck all boxes (exif, xmp, location,...)
  4. Export the picture
  5. View the metadata contained in the file (using Eye of Gnome or Exiftool or ...), see attached screenshot
  6. Result: lr:hierarchicalSubject was removed, but mwg-rs:Regions are not removed
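Step 5 can also be checked without a viewer. The following is a minimal Python sketch, not part of darktable or exiftool, that walks the header segments of a JPEG and reports whether the main XMP packet mentions MWG regions. The function names are made up for illustration. It assumes the XMP sits in a standard APP1 segment, that no length-less markers appear before the scan data, and it ignores extended XMP, so a negative result is only a hint.

```python
import struct

# Signature that prefixes the main XMP packet inside a JPEG APP1 segment.
XMP_HEADER = b"http://ns.adobe.com/xap/1.0/\x00"
# Namespace URI that the mwg-rs prefix is normally bound to.
MWG_REGIONS_NS = b"http://www.metadataworkinggroup.com/schemas/regions/"


def iter_jpeg_segments(data):
    """Yield (marker, payload) for each header segment before start-of-scan."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before the real marker
            pos += 1
            continue
        if marker == 0xDA:  # start of scan: compressed image data follows
            return
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        # The length field counts itself (2 bytes) plus the payload.
        yield marker, data[pos + 4:pos + 2 + length]
        pos += 2 + length


def has_mwg_regions(data):
    """Return True if the main XMP packet refers to MWG regions."""
    for marker, payload in iter_jpeg_segments(data):
        if marker == 0xE1 and payload.startswith(XMP_HEADER):
            packet = payload[len(XMP_HEADER):]
            if MWG_REGIONS_NS in packet or b"mwg-rs:Regions" in packet:
                return True
    return False
```

To use it, read the exported file as bytes (for example with `open(path, "rb").read()`, where `path` is wherever the export was written) and pass the result to `has_mwg_regions`.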

Expected behavior

The picture should not contain mwg-rs:Regions.

Logfile | Screenshot | Screencast

Brigitte_Bardot_in_Rome_1969 exported_with_no_metadata screenshot_Eye-of-Gnome_showing-mwg-rs-Regions

Commit

No response

Where did you obtain darktable from?

distro packaging

darktable version

4.8.1

What OS are you using?

Linux

What is the version of your OS?

Debian sid (testing), Darktable 4.8.1 from the Debian repository

Describe your system?

No response

Are you using OpenCL GPU in darktable?

None

If yes, what is the GPU card and driver?

No response

Please provide additional context if applicable. You can attach files too, but might need to rename to .txt or .zip

No response

oj-random commented 3 weeks ago

Some additional info on mwg-rs:Regions

I think if you want that tag removed, you can use exiv2 or exiftool to remove it from the source file or from the export.

Yes, this is possible.

Is the user expected to remove tags manually when the settings dialog "promises" that EXIF, IPTC and XMP tags are removed?

As you wrote, there is a ton of metadata in a picture. In my tests everything is removed quite reliably except the regions (faces). That is unexpected, given that this is quite sensitive data.

When I looked into pictures taken with an iPhone... The iPhone will write faces found in the pictures as mwg-rs:Regions (as in the files attached) into the picture. (Instead of a name you will find an ID.) Why do the regions (faces) survive the export when anything else is removed that the iPhone wrote into the picture?

There seems to be an exception: The date origin (date/time a picture was taken) is never removed in my tests.

oj-random commented 3 weeks ago

Some other hints on MWG tags...

I have no examples for exiv2, but for exiftool.

Exiftool reads MWG tags like this

exiftool -use MWG -DateTimeOriginal -modifydate -Description -Title -Headline -Keywords path-to-your/image.jpg

MWG tags can be written like this

exiftool -use MWG -overwrite_original -DateTimeOriginal="2024:01:12 18:05:26+01:00" -modifydate=now -Description="your long comments" -Title="your title" -Headline="your title" -Keywords="bird,sun,coast" path-to-your/image.jpg

Regions can be written like this

exiftool -RegionInfo="{AppliedToDimensions={W=4288,H=2848,Unit=pixel},RegionList=[{Area={W=0.15,H=0.17,X=0.3,Y=0.4,Unit=normalized},Description=A beauty,Name=Brigitte Bardot,Type=Face},{Area={W=0.06,H=0.09,X=0.5,Y=0.6,Unit=normalized},Name=Anna,Description=some girl}]}" path-to-your/image.jpg.xmp
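To make the numbers in that command concrete: as I read the MWG guidance, X and Y of a normalized area give the centre of the region, and W and H its size, all as fractions of the image dimensions. The Python sketch below converts such an area to a pixel rectangle. The function name and the dictionary layout are assumptions chosen to mirror the exiftool structure above, not an existing API.

```python
def region_to_pixel_box(area, image_width, image_height):
    """Convert a normalized MWG area (centre X/Y, size W/H) to pixels.

    Returns (left, top, width, height) as floats.
    """
    if area.get("Unit", "normalized") != "normalized":
        raise ValueError("this sketch only handles normalized areas")
    width = area["W"] * image_width
    height = area["H"] * image_height
    # X and Y describe the centre, so shift by half the size to get the corner.
    left = area["X"] * image_width - width / 2
    top = area["Y"] * image_height - height / 2
    return left, top, width, height


# First region from the command above, applied to a 4288 x 2848 image.
face = {"W": 0.15, "H": 0.17, "X": 0.3, "Y": 0.4, "Unit": "normalized"}
box = region_to_pixel_box(face, 4288, 2848)
```

For the first region this gives a rectangle of about 643 x 484 pixels whose top-left corner is about 965 pixels from the left and 897 pixels from the top.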

All metadata can be stripped with

exiftool -overwrite_original -all= path-to-your/image.jpg

That's what I would expect if all checkboxes are unchecked in the export settings.
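For comparison, here is a rough idea of what "strip the metadata" means at the file level, as a minimal Python sketch. It is not what exiftool or darktable actually do and it is narrower than `exiftool -all=`: it only drops the APP1 (Exif and XMP), APP13 (IPTC) and COM header segments of a JPEG and copies everything else unchanged. The function name is hypothetical.

```python
import struct

# APP1 carries Exif and XMP, APP13 carries IPTC, COM is a free-text comment.
METADATA_MARKERS = {0xE1, 0xED, 0xFE}


def strip_metadata_segments(data):
    """Return JPEG bytes with APP1, APP13 and COM header segments removed."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing SOI marker)")
    out = bytearray(data[:2])
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte, skip it
            pos += 1
            continue
        if marker == 0xDA:  # start of scan: stop parsing, copy the rest
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        if marker not in METADATA_MARKERS:
            out += data[pos:end]
        pos = end
    out += data[pos:]  # compressed image data and end-of-image marker
    return bytes(out)
```

Because whole segments are dropped regardless of which tags they contain, an approach like this cannot leave a single XMP property such as mwg-rs:Regions behind.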

ptilopteri commented 3 weeks ago

probably lost cause as every camera manufacturer will have their own special xmp tags. really bloat dt code to try to handle them all.


oj-random commented 3 weeks ago

probably lost cause as every camera manufacturer will have their own special xmp tags.

I think it is not just "their own special xmp tags" because MWG tags are defined as a standard, https://web.archive.org/web/20180919181934/http://www.metadataworkinggroup.org/pdf/mwg_guidance.pdf

... and some big manufacturers like Apple use this standard. iPhones alone account for thousands of users who will import millions of pictures into Darktable databases. And there is probably a reason why Exiftool supports it as well.

Regions in pictures are not only used to mark faces. Regions are used to store detected objects too. MWG tags have a broader use case in times of machine learning ("AI").

Speaking for myself: I can cope with these tags not being removed because I am aware of it. (I still have to use an extra tool, which I would rather avoid.) BUT: most users will not be aware of it, will trust Darktable, and will not expect to give away sensitive data to strangers and data-collecting companies when they deliberately choose to remove all metadata from exported pictures. At least I was surprised to see it still there.

Anyway. Cheers.

oj-random commented 3 weeks ago

How can this be labeled as feature request?

oj-random commented 3 weeks ago

then I dont see why not merging it.

I would implement it BUT I never coded C or C++ :-(

I would even go further with the idea...

I would be keen to contribute the face detection stuff with a Python script that can be started in the background.

Is anybody willing to participate in this effort, contribute to the C/C++ side, UI and stuff like this? Or, does anybody know somebody who could be asked?

I am aware of the existing face detection Lua add-on. I am thinking of functionality like that of https://codeberg.org/ojrandom/ddc integrated directly into Darktable. Besides better functionality and integration into the UI, there are better detectors (e.g. yolov8) and models (e.g. Facenet512) available. The program https://codeberg.org/ojrandom/ddc writes "lr:hierarchicalSubject" into the xmp files. Darktable is able to read and filter on this kind of tag.

wpferguson commented 3 weeks ago

Face detection is part of DigiKam and I personally don't think it should be added to dt.

Here's the discussion about adding face recognition to darktable just as a script, https://github.com/darktable-org/lua-scripts/pull/100. I'm not sure adding it to core will be accepted.

darktable is a raw processor. I did a quick search of the major camera brands and none of them seem to support or generate these tags. darktable uses an external project, exiv2, to gather metadata from the raw files. If exiv2 doesn't supply it, then darktable probably doesn't support it.

If you want to open a feature request, then I would close this issue and open a new one. That way you can arrange the information to support the feature instead of making someone read through all the conversation to dig it out.

One thought in favor of it is that it would make a nice input to the censorize module and allow automatic censoring of faces.

oj-random commented 3 weeks ago

Thanks for providing the link to the discussion https://github.com/darktable-org/lua-scripts/pull/100 Interesting to know.

I did a quick search of the major camera brands and none of them seem to support or generate these tags

Attached is a screenshot (taken in EOG, Eye of Gnome) showing the regions contained in a picture that was taken with an iPhone and that survived a Darktable export. I am attaching it because I am not certain we are talking about the same thing: there are region tags in the picture, written by the iPhone. True, I would not use Darktable to "process" an iPhone image; I only process RAW images with Darktable. For iPhone pictures I use Darktable only to search and filter them, write some tags and descriptions into them, view them on the map and so on. Looking at it from the perspective that Darktable is mainly a RAW processor: yes, agreed, the iPhone is not a major camera brand.

If you want to open a feature request, then I would close this issue and open a new one. That way you can arrange the information to support the feature instead of making someone read through all the conversation to dig it out.

One thought in favor of it is that it would make a nice input to the censorize module and allow automatic censoring of faces.

This is interesting. I was not aware of this module. Indeed, the MWG tags could be used to help anonymize pictures for example. Do you have something like this in mind for a new feature request?

If you have this in mind, yes I could open a feature request. State of the art face detectors like retinaface or yolov8 do a very good job in finding faces.

Back to the main topic... Still there are the region tags that survive the export. What about this? Issue, feature request? Close the issue because at least some members of the project know about it? I do not know. Please decide! (I personally have no strong feeling.)

iphone_picture_containing_faces

wpferguson commented 3 weeks ago

Still there are the region tags that survive the export. What about this? Issue, feature request? Close the issue because at least some members of the project know about it? I do not know. Please decide! (I personally have no strong feeling.)

The bigger issue here is how should darktable treat existing metadata in "processed images" (i.e. jpg). Do we strip it on import? Right now we take what we want and ignore the rest (I think). When we export we read the existing image, apply the edits, attach/don't attach the metadata that darktable knows about, then export the completed image.
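One way to picture the difference between "remove what we know about" and "keep only what we want" is the toy Python model below. It is purely illustrative and does not reflect darktable's actual export code. The function names are invented, and the split into "known" and "unknown" tags is an assumption made only to mirror the behaviour reported in this issue. The keys are written in the style exiv2 uses.

```python
def export_with_denylist(tags, known_prefixes, keep_known):
    """Drop recognised tags unless the user keeps them; unknown tags pass through."""
    prefixes = tuple(known_prefixes)
    return {
        name: value
        for name, value in tags.items()
        if keep_known or not name.startswith(prefixes)
    }


def export_with_allowlist(tags, allowed_prefixes):
    """Keep only tags that were explicitly allowed; everything else is dropped."""
    prefixes = tuple(allowed_prefixes)
    return {name: value for name, value in tags.items() if name.startswith(prefixes)}


source_tags = {
    "Exif.Image.Make": "Apple",
    "Xmp.lr.hierarchicalSubject": "people|Anna",
    "Xmp.mwg-rs.Regions": "(region list)",
}
# Hypothetical set of tag families the exporter recognises.
known = ["Exif.", "Xmp.lr."]

leaky = export_with_denylist(source_tags, known, keep_known=False)
clean = export_with_allowlist(source_tags, [])
```

In this model the denylist export still contains `Xmp.mwg-rs.Regions` with every box unchecked, while the allowlist export with nothing allowed is empty. Which of the two darktable should aim for is exactly the open question here.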

I don't have an answer for this. Perhaps leave it open for a while and see if others comment.