xperseguers / t3ext-extractor

TYPO3 Extension extractor
https://extensions.typo3.org/extension/extractor
GNU General Public License v2.0

exiftool "ColorMode" and "ColorSpaceData" are passed through un-normalized, despite being invalid values #83

Open LeoniePhiline opened 3 months ago

exiftool -j may extract "ColorMode" and "ColorSpaceData" values that cannot be stored in sys_file_metadata.color_space unaltered.

Example:

exiftool -j "example-sRGB-black-and-white-file.jpg" \
    | jq '[.[] | {ColorMode, ColorSpaceData, ColorSpace}]'
[
  {
    "ColorMode": "Grayscale",
    "ColorSpaceData": "GRAY",
    "ColorSpace": "sRGB"
  }
]

extractor defines color_space mapping as follows:

jq '.[] | select(.FAL == "color_space")' "extractor/Configuration/Services/ExifTool/default.json"
{
  "FAL": "color_space",
  "DATA": [
    "ColorMode",
    "ColorSpaceData",
    "ColorSpace->Causal\\Extractor\\Utility\\ColorSpace::normalize"
  ]
}

Since \Causal\Extractor\Service\Extraction\AbstractExtractionService::remapServiceOutput stops iterating over the DATA keys at the first non-null $value, the exiftool value "ColorMode" = "Grayscale" is taken as-is and the "ColorSpace" entry, the only one with a normalize post-processor, is never reached. The raw value is passed back to the TYPO3 metadata extraction service, where it is used as a parameter for an INSERT INTO sys_file_metadata.

However, the sys_file_metadata.color_space field is a VARCHAR(4), and "Grayscale" (9 characters) does not fit, causing an error in strict mode.
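
To illustrate the failure, here is a minimal sketch of a "first non-null value wins" lookup fed with the exiftool output and the DATA keys from above. The function name firstNonNullValue is hypothetical and the body is my own simplification, not the extension's actual remapServiceOutput code; it only assumes the break-on-non-null behaviour described in this issue.

```php
<?php
// Hypothetical stand-in for the lookup described in this issue.
// This is NOT the extension's actual remapServiceOutput implementation.
function firstNonNullValue(array $dataKeys, array $exifOutput): ?string
{
    $value = null;
    foreach ($dataKeys as $dataKey) {
        // Split "Key->Class::method" into the key and an optional post-processor.
        [$key, $processor] = array_pad(explode('->', $dataKey, 2), 2, null);
        $value = $exifOutput[$key] ?? null;
        if ($value !== null && $processor !== null && is_callable($processor)) {
            $value = call_user_func($processor, $value);
        }
        if ($value !== null) {
            break; // first non-null value wins; later keys are never consulted
        }
    }
    return $value;
}

$exifOutput = [
    'ColorMode' => 'Grayscale',
    'ColorSpaceData' => 'GRAY',
    'ColorSpace' => 'sRGB',
];
$dataKeys = [
    'ColorMode',
    'ColorSpaceData',
    'ColorSpace->Causal\\Extractor\\Utility\\ColorSpace::normalize',
];

$colorSpace = firstNonNullValue($dataKeys, $exifOutput);
echo $colorSpace, ' (', strlen($colorSpace), " characters)\n";
```

With this input the loop stops at "ColorMode", so the un-normalized "Grayscale" is what would end up in the VARCHAR(4) column.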

Furthermore, "Grayscale" is an invalid value. According to SYSEXT:filemetadata/Configuration/TCA/Overrides/sys_file_metadata.php, the correct grayscale color_space value would be "grey".

Thus, ColorMode = "Grayscale" and ColorSpaceData = "GRAY" must be normalized to the value "grey".

To my mind, this should be handled by the ColorSpace utility, using a configuration like ...:

{
  "FAL": "color_space",
  "DATA": [
    "ColorMode->Causal\\Extractor\\Utility\\ColorSpace::normalize",
    "ColorSpaceData->Causal\\Extractor\\Utility\\ColorSpace::normalize",
    "ColorSpace->Causal\\Extractor\\Utility\\ColorSpace::normalize"
  ]
}

... and adjusting Causal\Extractor\Utility\ColorSpace::normalize so that it matches values which, lowercased, start with "gray" or "grey" and replaces them with the canonical "grey" value.
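
A rough sketch of what I mean, assuming a string-in, string-out signature. The class name ColorSpaceSketch is hypothetical, and I have not reproduced whatever Causal\Extractor\Utility\ColorSpace::normalize already does for other values; here they are simply returned unchanged.

```php
<?php
// Hypothetical sketch of the proposed prefix match. The real
// Causal\Extractor\Utility\ColorSpace::normalize may differ and
// handles other inputs that are not shown here.
final class ColorSpaceSketch
{
    public static function normalize(string $value): string
    {
        $lower = strtolower(trim($value));
        // "Grayscale", "GRAY", "grey", ... all collapse to the canonical "grey".
        if (strncmp($lower, 'gray', 4) === 0 || strncmp($lower, 'grey', 4) === 0) {
            return 'grey';
        }
        // Everything else is returned unchanged in this sketch.
        return $value;
    }
}

foreach (['Grayscale', 'GRAY', 'grey', 'sRGB'] as $input) {
    echo $input, ' => ', ColorSpaceSketch::normalize($input), "\n";
}
```

The prefix match means both exiftool variants from the example ("Grayscale" and "GRAY") end up as a 4-character value that fits the column.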