matiasdelellis / facerecognition

Nextcloud app that implements a basic facial recognition system.
GNU Affero General Public License v3.0
508 stars 46 forks

Read face regions from XMP information #321

Open escoand opened 4 years ago

escoand commented 4 years ago

I created a proof of concept that reads the XMP face regions. I'm not sure where or how best to integrate it into the background job. I tried to produce output similar to IModel->detectFaces; in my case it is:

$ php xmp.php /path/to/my/*.jpg
# /path/to/my/image.jpg
Array
(
    [0] => Array
        (
            [left] => 1063
            [top] => 752
            [right] => 1538
            [bottom] => 1270
            [name] => NameXYZ
        )
)

I created the XMP information with digiKam and saved it directly into the file. Reading it from a separate *.xmp file should also be no problem. The XMP XML document looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 4.4.0-Exiv2">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description xmlns:GCamera="http://ns.google.com/photos/1.0/camera/" xmlns:MP="http://ns.microsoft.com/photo/1.2/" xmlns:MPRI="http://ns.microsoft.com/photo/1.2/t/RegionInfo#" xmlns:MPReg="http://ns.microsoft.com/photo/1.2/t/Region#" xmlns:MicrosoftPhoto="http://ns.microsoft.com/photo/1.0/" xmlns:acdsee="http://ns.acdsee.com/iptc/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:digiKam="http://www.digikam.org/ns/1.0/" xmlns:lr="http://ns.adobe.com/lightroom/1.0/" xmlns:mediapro="http://ns.iview-multimedia.com/mediapro/1.0/" xmlns:mwg-rs="http://www.metadataworkinggroup.com/schemas/regions/" xmlns:stArea="http://ns.adobe.com/xmp/sType/Area#" rdf:about="" GCamera:SpecialTypeID="BestShotType" GCamera:BurstID="17e0c273-8c6d-4392-84b9-e9d24f85c02d" GCamera:BurstPrimary="1" acdsee:categories="&lt;Categories&gt;&lt;Category Assigned=&quot;0&quot;&gt;Personen&lt;Category Assigned=&quot;1&quot;&gt;NameXYZ&lt;/Category&gt;&lt;/Category&gt;&lt;/Categories&gt;">
      <mwg-rs:Regions rdf:parseType="Resource">
        <mwg-rs:RegionList>
          <rdf:Bag>
            <rdf:li>
              <rdf:Description mwg-rs:Name="NameXYZ" mwg-rs:Type="Face">
                <mwg-rs:Area stArea:x="0.501736" stArea:y="0.520062" stArea:w="0.183256" stArea:h="0.266461" stArea:unit="normalized" />
              </rdf:Description>
            </rdf:li>
          </rdf:Bag>
        </mwg-rs:RegionList>
      </mwg-rs:Regions>
      <MP:RegionInfo rdf:parseType="Resource">
        <MPRI:Regions>
          <rdf:Bag>
            <rdf:li MPReg:Rectangle="0.410108, 0.386831, 0.183256, 0.266461" MPReg:PersonDisplayName="NameXYZ" />
          </rdf:Bag>
        </MPRI:Regions>
      </MP:RegionInfo>
      <digiKam:TagsList>
        <rdf:Seq>
          <rdf:li>Personen/NameXYZ</rdf:li>
        </rdf:Seq>
      </digiKam:TagsList>
      <MicrosoftPhoto:LastKeywordXMP>
        <rdf:Bag>
          <rdf:li>Personen/NameXYZ</rdf:li>
        </rdf:Bag>
      </MicrosoftPhoto:LastKeywordXMP>
      <lr:hierarchicalSubject>
        <rdf:Bag>
          <rdf:li>Personen|NameXYZ</rdf:li>
        </rdf:Bag>
      </lr:hierarchicalSubject>
      <mediapro:CatalogSets>
        <rdf:Bag>
          <rdf:li>Personen|NameXYZ</rdf:li>
        </rdf:Bag>
      </mediapro:CatalogSets>
      <dc:subject>
        <rdf:Bag>
          <rdf:li>NameXYZ</rdf:li>
        </rdf:Bag>
      </dc:subject>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
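
For reference, the two region formats in this document encode the same box differently: the mwg-rs area stores the normalized center point plus width and height, while the MPReg rectangle stores the normalized left and top edges plus width and height. A minimal sketch of the conversion to pixel edges follows. The function names are hypothetical, and the 2592x1944 image size is an assumption inferred from the sample output above, not something stated in the file.

```php
<?php

// Hypothetical helper: convert a normalized MWG area (center x/y, width, height)
// into pixel edges.
function mwgAreaToPixelBox(float $x, float $y, float $w, float $h, int $imgWidth, int $imgHeight): array {
    $left = (int) round(($x - $w / 2) * $imgWidth);
    $top = (int) round(($y - $h / 2) * $imgHeight);
    return [
        'left' => $left,
        'top' => $top,
        'right' => $left + (int) round($w * $imgWidth),
        'bottom' => $top + (int) round($h * $imgHeight),
    ];
}

// Hypothetical helper: convert a Microsoft Photo rectangle string
// "left, top, width, height" (all normalized) into pixel edges.
function mpRectangleToPixelBox(string $rectangle, int $imgWidth, int $imgHeight): array {
    [$l, $t, $w, $h] = array_map('floatval', preg_split('/,\s*/', trim($rectangle)));
    $left = (int) round($l * $imgWidth);
    $top = (int) round($t * $imgHeight);
    return [
        'left' => $left,
        'top' => $top,
        'right' => $left + (int) round($w * $imgWidth),
        'bottom' => $top + (int) round($h * $imgHeight),
    ];
}

// Assumed image size (inferred from the sample output, not given in the XMP).
$imgWidth = 2592;
$imgHeight = 1944;

$mwg = mwgAreaToPixelBox(0.501736, 0.520062, 0.183256, 0.266461, $imgWidth, $imgHeight);
$mp = mpRectangleToPixelBox('0.410108, 0.386831, 0.183256, 0.266461', $imgWidth, $imgHeight);

print_r($mwg);           // left 1063, top 752, right 1538, bottom 1270
var_dump($mwg === $mp);  // bool(true): both formats describe the same box
```

Under that size assumption both formats yield the same rectangle, which is why a script reading both needs to drop duplicates.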

My proof of concept script is this:

<?php

for($i = 1; $i < $argc; $i++) {
  print("# " . $argv[$i] . "\n");
  $faces = detectFaces($argv[$i]);
  print_r($faces);
}

function detectFaces(string $filename): array {
  $detectedFaces = [];

  $xmp = getXmpData($filename, 50000);
  if($xmp === NULL) return $detectedFaces;

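  $xml = simplexml_load_string($xmp);
  // malformed XMP packet: report no faces instead of failing on a non-object
  if($xml === false) return $detectedFaces;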
  $xml->registerXPathNamespace("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#");
  $xml->registerXPathNamespace("mwg", "http://www.metadataworkinggroup.com/schemas/regions/");
  $xml->registerXPathNamespace("mp", "http://ns.microsoft.com/photo/1.2/");
  $xml->registerXPathNamespace("mpri", "http://ns.microsoft.com/photo/1.2/t/RegionInfo#");

  // read MetaDataWorkingGroup size
  $dims = $xml->xpath("rdf:RDF/rdf:Description/mwg:Regions/mwg:AppliedToDimensions");
  if(!empty($dims)) {
    $attrs = $dims[0]->attributes("http://ns.adobe.com/xap/1.0/sType/Dimensions#");
    $width = intval($attrs["w"]);
    $height = intval($attrs["h"]);
  } else {
    // no AppliedToDimensions present: fall back to the raw image size
    $info = getimagesize($filename);
    $width = $info[0];
    $height = $info[1];
  }

  // read MetaDataWorkingGroup info
  $nodes = $xml->xpath("rdf:RDF/rdf:Description/mwg:Regions/mwg:RegionList/rdf:Bag/rdf:li/rdf:Description[@mwg:Type='Face']/mwg:Area");
  foreach($nodes as $node) {
    $attrs = $node->attributes("http://ns.adobe.com/xmp/sType/Area#");
    $x1 = round((floatval($attrs["x"]) - floatval($attrs["w"]) / 2) * $width);
    $y1 = round((floatval($attrs["y"]) - floatval($attrs["h"]) / 2) * $height);
    $x2 = $x1 + round(floatval($attrs["w"]) * $width);
    $y2 = $y1 + round(floatval($attrs["h"]) * $height);
    $name = strval($node->xpath("parent::*")[0]->attributes("http://www.metadataworkinggroup.com/schemas/regions/")["Name"]);
    $detectedFaces[] = array(
      "left" => $x1,
      "top" => $y1,
      "right" => $x2,
      "bottom" => $y2,
      "name" => $name
    );
  }

  // read Microsoft Photo info
  $nodes = $xml->xpath("rdf:RDF/rdf:Description/mp:RegionInfo/mpri:Regions/rdf:Bag/rdf:li");
  foreach($nodes as $node) {
    $attrs = $node->attributes("http://ns.microsoft.com/photo/1.2/t/Region#");
    $dims = preg_split("/,\s*/", strval($attrs["Rectangle"]));
    $x1 = round(floatval($dims[0]) * $width);
    $y1 = round(floatval($dims[1]) * $height);
    $x2 = $x1 + round(floatval($dims[2]) * $width);
    $y2 = $y1 + round(floatval($dims[3]) * $height);
    $name = strval($attrs["PersonDisplayName"]);
    $detectedFaces[] = array(
      "left" => $x1,
      "top" => $y1,
      "right" => $x2,
      "bottom" => $y2,
      "name" => $name
    );
  }

  return array_map("unserialize", array_unique(array_map("serialize", $detectedFaces)));
}

function getXmpData(string $filename, int $chunkSize): ?string {
    if (!is_int($chunkSize)) {
        throw new RuntimeException('Expected integer value for argument #2 (chunkSize)');
    }

    if ($chunkSize < 12) {
        throw new RuntimeException('Chunk size cannot be less than 12 argument #2 (chunkSize)');
    }

    if (($file_pointer = fopen($filename, 'r')) === FALSE) {
        throw new RuntimeException('Could not open file for reading');
    }

    $startTag = '<x:xmpmeta';
    $endTag = '</x:xmpmeta>';
    $buffer = NULL;
    $hasXmp = FALSE;

    while (($chunk = fread($file_pointer, $chunkSize)) !== FALSE) {

        if ($chunk === "") {
            break;
        }

        $buffer .= $chunk;
        $startPosition = strpos($buffer, $startTag);
        $endPosition = strpos($buffer, $endTag);

        if ($startPosition !== FALSE && $endPosition !== FALSE) {
            $buffer = substr($buffer, $startPosition, $endPosition - $startPosition + strlen($endTag));
            $hasXmp = TRUE;
            break;
        } elseif ($startPosition !== FALSE) {
            $buffer = substr($buffer, $startPosition);
            $hasXmp = TRUE;
        } elseif (strlen($buffer) > (strlen($startTag) * 2)) {
            // no start tag yet: keep only the tail, in case the tag spans two chunks
            $buffer = substr($buffer, -strlen($startTag));
        }
    }

    fclose($file_pointer);
    return ($hasXmp) ? $buffer : NULL;
}

escoand commented 4 years ago

BTW: This is a result of the discussion in #319.

matiasdelellis commented 4 years ago

Hi @escoand

Thank you for the initiative; this can help with a future implementation. However, I must warn you that I cannot accept any patches for this anytime soon.

We can use the regions as suggested rectangles for the models, but we still have to find the face with the model in use (WE CANNOT USE THE REGION DIRECTLY. See https://github.com/davisking/dlib/issues/2093), find the landmarks, and compute the descriptor to compare. This also requires changes in pdlib.

Thanks again,

escoand commented 4 years ago

OK, and what do you think of the other way around? Saving the XMP data to the files?

matiasdelellis commented 4 years ago

Hi @escoand

> OK, and what do you think of the other way around? Saving the XMP data to the files?

As I told you in another issue, I think this is the best solution, but I still have to think it through carefully. My only requirement is that it must be enabled by the user. (Any file modification must be approved by the users.)

Regards, Matias.

cliffalbert commented 4 years ago

Concerning XMP data reads.

Would it be an idea, when a face region detected by facerecognition matches (or falls within) the coordinates in the Region tag, to use the RegionName to name the cluster? I don't know how much work this would be; it's just an idea.
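
One way to sketch this idea is an intersection-over-union (IoU) test between the rectangle found by the detector and each XMP region. The function names, the 0.5 threshold, and the detector rectangle below are all assumptions for illustration; none of them are part of facerecognition. The XMP region uses the left/top/right/bottom/name layout from the proof-of-concept output earlier in this thread.

```php
<?php

// Overlapping area of two boxes given as left/top/right/bottom arrays.
function intersectionArea(array $a, array $b): float {
    $w = min($a['right'], $b['right']) - max($a['left'], $b['left']);
    $h = min($a['bottom'], $b['bottom']) - max($a['top'], $b['top']);
    return ($w > 0 && $h > 0) ? $w * $h : 0.0;
}

// Area of a single box; degenerate boxes count as zero.
function boxArea(array $r): float {
    return max(0, $r['right'] - $r['left']) * max(0, $r['bottom'] - $r['top']);
}

// Hypothetical helper: return the name of the XMP region that best overlaps
// the detected face, or null when no region reaches the IoU threshold.
function suggestNameForFace(array $detected, array $xmpRegions, float $minIou = 0.5): ?string {
    $bestName = null;
    $bestIou = 0.0;
    foreach ($xmpRegions as $region) {
        $inter = intersectionArea($detected, $region);
        $union = boxArea($detected) + boxArea($region) - $inter;
        $iou = $union > 0 ? $inter / $union : 0.0;
        if ($iou >= $minIou && $iou > $bestIou) {
            $bestIou = $iou;
            $bestName = $region['name'];
        }
    }
    return $bestName;
}

// Region as read from XMP (values from the sample output in this thread).
$xmpRegions = [
    ['left' => 1063, 'top' => 752, 'right' => 1538, 'bottom' => 1270, 'name' => 'NameXYZ'],
];

// Made-up detector results: one inside the region, one far away from it.
$insideFace = ['left' => 1100, 'top' => 800, 'right' => 1500, 'bottom' => 1250];
$farFace = ['left' => 0, 'top' => 0, 'right' => 100, 'bottom' => 100];

var_dump(suggestNameForFace($insideFace, $xmpRegions)); // string(7) "NameXYZ"
var_dump(suggestNameForFace($farFace, $xmpRegions));    // NULL
```

IoU penalizes detections that are much smaller or larger than the tagged region. If "falls within" should be enough, dividing the intersection by the detected box's own area would be a looser alternative.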