xperseguers / t3ext-extractor

TYPO3 Extension extractor
https://extensions.typo3.org/extension/extractor
GNU General Public License v2.0

exception: Data too long for column 'color_space' #89

Open sypets opened 1 month ago

sypets commented 1 month ago

I am not sure if this is a core issue or an extractor issue. However, the problem occurs in ExifToolMetadataExtraction::extractMetaData.

The problem is that the property "color_space" is evaluated to "Normal", although the EXIF data contains "sRGB", and the database field sys_file_metadata.color_space is only 4 characters long. When the properties are written, an exception is thrown. That the metadata is not written is not so severe, but the height and width are not set either, which has several ugly side effects.
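One way to keep a single oversized value from discarding the whole record would be to check each value against the column width before writing. The following is only a sketch, written in Python for brevity rather than in the extension's PHP. `COLUMN_WIDTHS` and `drop_oversized` are hypothetical names, the width and height values are made-up sample data, and only the 4-character width of `color_space` comes from this report.

```python
# Hypothetical column widths; only color_space = 4 is taken from this report.
COLUMN_WIDTHS = {"color_space": 4}


def drop_oversized(metadata, widths=COLUMN_WIDTHS):
    """Split metadata into (writable, rejected) based on column widths."""
    writable, rejected = {}, {}
    for field, value in metadata.items():
        limit = widths.get(field)
        if limit is not None and isinstance(value, str) and len(value) > limit:
            rejected[field] = value  # too long for its column
        else:
            writable[field] = value
    return writable, rejected


# Sample record: width and height are illustrative values, not from the report.
writable, rejected = drop_oversized(
    {"color_space": "Normal", "width": 4000, "height": 3000}
)
print(writable)  # width and height remain writable
print(rejected)  # color_space is held back instead of aborting the write
```

With such a guard, the dimensions would still reach the database even when one string field does not fit. Whether silently dropping a value is acceptable is a separate design question.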

We have been seeing this since the update to TYPO3 v12 and did not see it before (though I can't say that it did not exist).

Impact

image

image

Debugging

public function extractMetaData(File $file, array $previousExtractedData = [])
{
    $metadata = [];

    $extractedMetadata = $this->getExifToolService()->extractMetadata($file);
    if (!empty($extractedMetadata)) {
        $dataMapping = $this->getDataMapping($file);
        $metadata = $this->remapServiceOutput($extractedMetadata, $dataMapping);
        $this->processCategories($file, $metadata);
    }

'ColorSpace' => 'sRGB'

color_space='Normal'

file

exiftool Verkauf_UN_770_-_001.JPG  | grep -i "color space"
Color Space                     : sRGB

versions

sypets commented 1 month ago

workaround

mysql> ALTER TABLE sys_file_metadata MODIFY COLUMN color_space VARCHAR(10);

chesio commented 1 month ago

I think this is very similar to #85.

There is a core issue for #85 already, but there does not seem to be a conclusion yet whether this is more of a core issue or something that should be taken care of on extension level or even on a per project basis: https://forge.typo3.org/issues/104872

sypets commented 1 month ago

Thanks @chesio, I will take a look. I took care of it for now by changing the DB schema, but I assume others might be affected as well. It seems to be a more general problem: besides #85 and the core issue you mentioned, there is also #74, and there are other issues where the metadata has the wrong data type.

I commented in the core issue.

Also, I don't know where the string "Normal" comes from; sRGB seems to get replaced by the mapping.

sypets commented 1 month ago

I am using the extractor out of the box. I did not override any mapping (as far as I am aware).

I debugged some more. In AbstractExtractionService::remapServiceOutput, the following are used to map the color_space (in this order):

  1. ColorMode
  2. ColorSpaceData
  3. ColorSpace->Causal\Extractor\Utility\ColorSpace::normalize

With my test file, the value of ColorMode is "Normal", and this is what gets written to sys_file_metadata.color_space (which is 4 characters wide in the DB).

exiftool color_space.jpg | grep -E "(Color Mode|Color Space)"
Color Mode                      : Normal
Color Space                     : sRGB
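To compare the two tags programmatically, the text output can be split on the first colon of each line. This is a minimal sketch in Python (not the extension's PHP). `parse_exiftool_lines` is a hypothetical helper, and the sample text is copied from the output above so that no exiftool call is needed.

```python
def parse_exiftool_lines(text):
    """Turn 'Tag Name : value' lines into a dict keyed by tag name."""
    tags = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing ':' stay intact.
        name, sep, value = line.partition(":")
        if sep:
            tags[name.strip()] = value.strip()
    return tags


# Sample copied from the exiftool output quoted in this comment.
sample = """\
Color Mode                      : Normal
Color Space                     : sRGB
"""

tags = parse_exiftool_lines(sample)
for name in ("Color Mode", "Color Space"):
    print(name, "=", tags[name], "| length", len(tags[name]))
```

For this file, "Normal" has 6 characters and "sRGB" has 4, so only the latter fits the 4-character column.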

sypets commented 1 month ago

mapping:

{
  "FAL": "color_space",
  "DATA": [
    "ColorMode",
    "ColorSpaceData",
    "ColorSpace->Causal\\Extractor\\Utility\\ColorSpace::normalize"
  ]
},
```

https://github.com/xperseguers/t3ext-extractor/blob/master/Configuration/Services/ExifTool/default.json#L10
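The observed behavior is consistent with the "DATA" entries being tried in order until one yields a non-empty value. The sketch below illustrates that assumption in Python (not the extension's PHP); it is not the actual code of AbstractExtractionService::remapServiceOutput. `resolve` is a hypothetical name, and `normalize_color_space` is only a stand-in, since the behavior of the real ColorSpace::normalize is not shown in this thread.

```python
def normalize_color_space(value):
    # Hypothetical stand-in for Causal\Extractor\Utility\ColorSpace::normalize;
    # the real method's behavior is not shown in this thread.
    return value.strip()


def resolve(data_keys, extracted):
    """Return the value for the first entry whose key is non-empty."""
    for entry in data_keys:
        # An entry is either "Key" or "Key->Some::processor".
        key, _, processor = entry.partition("->")
        value = extracted.get(key)
        if value in (None, ""):
            continue
        return normalize_color_space(value) if processor else value
    return None


DATA = [
    "ColorMode",
    "ColorSpaceData",
    "ColorSpace->Causal\\Extractor\\Utility\\ColorSpace::normalize",
]

# Values reported for the test file in this issue.
extracted = {"ColorMode": "Normal", "ColorSpace": "sRGB"}
print(resolve(DATA, extracted))  # Normal
```

Under that assumption, ColorSpace is never consulted for this file because ColorMode already provides a value.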

sypets commented 1 month ago

My suggestions:

xperseguers commented 1 week ago
  • check if it really makes sense to use ColorMode for color_space

Based on https://www.pantone.com/articles/color-fundamentals/color-models-explained, the color mode is actually the "color model", i.e. the system used to describe a color (RGB, CMYK, ...), whereas the color space is the mapping of real colors to the color model's particular values. For example, sRGB and AdobeRGB are color spaces that use RGB as their color model/coding system.

=> Most probably it makes little sense to use ColorMode in the mapping and it should logically get dropped.
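The effect of dropping ColorMode can be checked against the values reported for the test file. This is a rough sketch in Python (not the extension's PHP), assuming the first non-empty key wins and leaving out the normalize step. `first_non_empty` and the two key lists are illustrative names, not part of the extension.

```python
def first_non_empty(keys, extracted):
    """Return the first non-empty value found among keys, else None."""
    for key in keys:
        value = extracted.get(key)
        if value:
            return value
    return None


# Values reported for the test file in this issue.
extracted = {"ColorMode": "Normal", "ColorSpace": "sRGB"}

current = ["ColorMode", "ColorSpaceData", "ColorSpace"]
proposed = ["ColorSpaceData", "ColorSpace"]  # ColorMode dropped

for label, keys in (("current", current), ("proposed", proposed)):
    value = first_non_empty(keys, extracted)
    verdict = "fits" if len(value) <= 4 else "too long"
    print(label, value, verdict)
```

For this file, the current order yields "Normal" (too long for the 4-character column), while the proposed order yields "sRGB", which fits.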