xperseguers / t3ext-extractor

TYPO3 Extension extractor
https://extensions.typo3.org/extension/extractor
GNU General Public License v2.0

Replacing a file does not re-extract the metadata from the new file #88

Open baschny opened 3 months ago

baschny commented 3 months ago

Problem

Replaced files do not get their metadata re-extracted from the new file.

How to reproduce

  1. Upload a file with metadata (e.g. copyright information). This data is extracted and stored in sys_file_metadata.copyright.
  2. The editor discovers that the copyright information is wrong and fixes the original file locally (by editing the Exif data).
  3. The editor then uses the "Replace File" functionality in the TYPO3 backend filelist module.
  4. The file is replaced, but the metadata is not extracted again and the old copyright information is still used.

Debugging

Debugging shows that this is caused by a workaround introduced in https://github.com/xperseguers/t3ext-extractor/commit/001a773e29a2a23d95745523b3d78dc8d153d85c to cope with "moving" files, which was added because of this problem: https://forge.typo3.org/issues/91168

Reasoning

In our project, for strict asset-regulation reasons, we have made all metadata read-only in the backend and rely solely on the metadata from the files themselves, so we need it to always be up to date.

So overwriting metadata that might have been edited manually is not a problem for us; we actually want it to happen. Maybe introduce a switch to make this workaround optional, or handle the "Replace" functionality differently from a "Move file" situation.

baschny commented 3 months ago

Another idea would be to introduce a new extractor_hash field recording a hash of the extracted information (e.g. one hash per field), so that on a later move or replace the extractor can decide whether a field was changed by an editor in the backend, and only replace values that still match what was originally extracted.
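
To make the extractor_hash idea more concrete, here is a rough, language-agnostic sketch in Python (not PHP, and not tied to any TYPO3 or extractor API). The names `field_hash` and `merge_extracted`, the choice of SHA-256, and the dictionary layout are all assumptions for illustration only. A field is overwritten only if it is empty or its current value still matches the hash recorded at the last extraction.

```python
import hashlib


def field_hash(value):
    """Hash a single metadata value (hypothetical helper; SHA-256 is an assumption)."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()


def merge_extracted(stored, stored_hashes, extracted):
    """Merge freshly extracted metadata into stored metadata.

    stored        -- current field values (e.g. a sys_file_metadata row)
    stored_hashes -- per-field hashes recorded at the last extraction
    extracted     -- field values extracted from the replaced file

    Returns (merged_values, updated_hashes).
    """
    merged = dict(stored)
    hashes = dict(stored_hashes)
    for field, new_value in extracted.items():
        current = stored.get(field)
        recorded = stored_hashes.get(field)
        # Untouched: the field is empty, or its value still matches the
        # hash of what was extracted last time. A non-empty field without
        # a recorded hash is conservatively treated as manually edited.
        untouched = current in (None, "") or (
            recorded is not None and field_hash(current) == recorded
        )
        if untouched:
            merged[field] = new_value
            hashes[field] = field_hash(new_value)
    return merged, hashes


# Example: "copyright" was never touched by an editor, "title" was.
stored = {"copyright": "Old Corp", "title": "Edited by hand"}
hashes = {
    "copyright": field_hash("Old Corp"),
    "title": field_hash("Original title"),
}
extracted = {"copyright": "New Corp", "title": "New title"}
merged, new_hashes = merge_extracted(stored, hashes, extracted)
```

In this sketch the manually edited title survives the replace, while the untouched copyright value is refreshed from the new file. For our read-only setup every field would always count as untouched, so a replace would always re-extract everything.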