xperseguers / t3ext-extractor

TYPO3 Extension extractor
https://extensions.typo3.org/extension/extractor
GNU General Public License v2.0

PhpExtractor returns various fields (e.g. title) with binary data for encrypted PDFs, error on UPDATE (incorrect string value) #76

Open sypets opened 6 months ago

sypets commented 6 months ago

For example, with an encrypted PDF file:


An exception occurred while executing 'UPDATE `sys_file_metadata` SET `pid` = ?, `tstamp` = ?, `crdate` = ?, `cruser_id` = ?, `sys_language_uid` = ?, `l10n_parent` = ?, `l10n_diffsource` = ?, `t3ver_oid` = ?, `t3ver_wsid` = ?, `t3ver_state` = ?, `t3ver_stage` = ?, `t3ver_count` = ?, `t3ver_tstamp` = ?, `t3ver_move_id` = ?, `t3_origuid` = ?, `file` = ?, `title` = ?, `width` = ?, `height` = ?, `description` = ?, `alternative` = ?, `categories` = ?, `visible` = ?, `status` = ?, `keywords` = ?, `caption` = ?, `creator_tool` = ?, `download_name` = ?, `creator` = ?, `publisher` = ?, `source` = ?, `location_country` = ?, `location_region` = ?, `location_city` = ?, `latitude` = ?, `longitude` = ?, `ranking` = ?, `content_creation_date` = ?, `content_modification_date` = ?, `note` = ?, `unit` = ?, `duration` = ?, `color_space` = ?, `pages` = ?, `language` = ?, `fe_groups` = ?, `copyright` = ?, `l10n_state` = ?, `camera_make` = ?, `camera_model` = ?, `camera_lens` = ?, `shutter_speed` = ?, `focal_length` = ?, `exposure_bias` = ?, `white_balance_mode` = ?, `iso_speed` = ?, `aperture` = ?, `flash` = ?, `altitude` = ? WHERE `uid` = ?' with params [0, 1702099431, 1473755451, 434, 0, 0, "", 0, 0, 0, 0, 0, 0, 0, 0, 169738, "\xbc\xcc\xba\x49\xb5\x1f\x7d\x65\x70\x2e\x9f\x35\x7a\x58\x0f\x6a\xfd\xf1\xa8\x39\x0c\x50\x76\xa2\x9c\x5d\x58\x83\x08\xcd\x67", 0, 0, null, null, 0, 1, "", "", "", "\xb0\xc6\xab\x54\xb8\x00\x66\x23\x40\x67\xbb\x2e\x61\x50\x43\x22\xaf\x99\xd1\x37\x6f\x2d\x63\xb0\x84\x3a\x09\xc3\x08\xcd\x73\x43\xd8", "", "\xba\xc0\xb7\x54", "", "", "", "", "", "0.00000000000000", "0.00000000000000", 0, 0, 0, "", "", 0, "", 0, "", "", null, null, "", "", "", "", 0, "", "", 0, 0, 0, 0, 183924]:

Incorrect string value: '\xBC\xCC\xBAI\xB5\x1F...' for column `typo3`.`sys_file_metadata`.`title` at row 1

Configuration

image

I guess I can work around this on my end by deactivating the PHP extractor, but ideally the extractor would not try to extract anything in this case.
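To illustrate the kind of guard meant here, the following is a minimal sketch, written in Python for illustration only (the extension itself is PHP, where `mb_check_encoding($value, 'UTF-8')` could serve as the equivalent check). The helper names `is_valid_utf8` and `drop_non_utf8_fields` are hypothetical and not part of the extension. The `title` bytes are the first bytes of the value in the error message above; the other two fields are made-up sample values.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def drop_non_utf8_fields(metadata: dict) -> dict:
    """Hypothetical guard: skip byte-string fields that are not valid UTF-8.

    Non-bytes values (e.g. integers) are passed through unchanged.
    """
    cleaned = {}
    for key, value in metadata.items():
        if isinstance(value, bytes):
            if not is_valid_utf8(value):
                continue  # leave the field out instead of writing garbage
            value = value.decode("utf-8")
        cleaned[key] = value
    return cleaned


# First 16 bytes of the `title` parameter from the failing UPDATE.
title = bytes.fromhex("bcccba49b51f7d65702e9f357a580f6a")

# `pages` and `creator_tool` are invented sample values for this sketch.
metadata = {"title": title, "pages": 3, "creator_tool": b"Example Writer"}

print(drop_non_utf8_fields(metadata))
```

With a check like this, the binary `title` would simply be left out of the record rather than reaching the database and triggering the "Incorrect string value" error.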

xperseguers commented 6 months ago

By encrypted PDF, do you mean one that requires a password to modify it (but is free to read), or one that requires a password to read it?

sypets commented 6 months ago

It is not possible to read the PDF at all without a password.
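One way an extractor might notice such files up front is to look for the `/Encrypt` entry that encrypted PDFs carry in their trailer dictionary. The sketch below is a rough heuristic in Python, for illustration only; `looks_encrypted` is a hypothetical helper, and both byte strings are hand-written fragments, not real PDF files.

```python
def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Heuristic: True if an /Encrypt key appears anywhere in the file.

    This is a plain byte search, not a PDF parser, so it can report
    false positives if the same bytes occur elsewhere in the file.
    """
    return b"/Encrypt" in pdf_bytes


# Hand-written trailer fragments used only to exercise the check.
plain = b"%PDF-1.4\ntrailer\n<< /Size 4 /Root 1 0 R >>\n%%EOF\n"
encrypted = b"%PDF-1.4\ntrailer\n<< /Size 5 /Root 1 0 R /Encrypt 4 0 R >>\n%%EOF\n"

print(looks_encrypted(plain))
print(looks_encrypted(encrypted))
```

Note that this does not distinguish the two cases asked about above: a PDF that only restricts modification also carries an `/Encrypt` entry, so a real check would need a proper PDF library to tell whether a password is required to read the file.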