uniuuu / zotprime

Fully packaged on-premise Zotero platform
https://www.zotero.org
GNU General Public License v3.0

ZoteroObjectUploadError: highlight annotation must be a PDF attachment #45

Open Amniotic3 opened 2 weeks ago

Amniotic3 commented 2 weeks ago

I checked the local SQLite database for this PDF document: in the itemAttachments table, its contentType is application/pdf. I also read the relevant code and found no obvious logic error, so I am reporting this as a bug. It affects batch reading of PDFs.

https://github.com/uniuuu/dataserver/blob/172584aa38a17a7e6ecca51e2bbc25b00450fad9/model/Item.inc.php#L1477

    // Note, highlight, and underline supported for PDFs, EPUBs, and snapshots
    if (in_array($this->annotationType, ["note", "highlight", "underline"])) {
        if (!in_array($parentItem->attachmentContentType, ['application/pdf', 'application/epub+zip', 'text/html'])) {
            throw new Exception(
                // TEMP
                //"Parent item $parentItem->libraryKey of $this->annotationType annotation must be a PDF, EPUB, or HTML attachment",
                "Parent item $parentItem->libraryKey of $this->annotationType annotation must be a PDF attachment",
                Z_ERROR_INVALID_INPUT
            );
        }
    }

uniuuu commented 2 weeks ago

Hi @Amniotic3, good day. I suggest posting this upstream. However, they prefer everything to be posted in the forum: https://forums.zotero.org/discussions

You can see the original repository has the same code: https://github.com/zotero/dataserver/blob/master/model/Item.inc.php#L1477

https://github.com/uniuuu/dataserver/blob/172584aa38a17a7e6ecca51e2bbc25b00450fad9/model/Item.inc.php#L1477 The link above points to the fork, so its code is identical to upstream.

Custom changes are made via the files in https://github.com/uniuuu/zotprime/tree/development/stack/dataserver/config

Could you please share how to reproduce this error?

Amniotic3 commented 1 week ago

@uniuuu Good afternoon, I've been busy getting continuing education credits lately, so I apologize for not getting back in a timely manner.

I'm using a client built from the ZotPrime 2.8.2-rc/production branch. To reproduce: log in to the admin account, click the sync button, drag and drop a PDF copy of the journal into the library, open it in Zotero's native reader, add a highlight annotation, and click sync again. The error then appears.

Amniotic3 commented 1 week ago

I fixed this bug.

Look at the log:

[Sat Nov 23 08:00:46.779931 2024] [php:notice] [pid 2516:tid 2516] [client 10.5.5.1:33522] [DEBUG]Executing SQL: SELECT mimeType FROM itemAttachments WHERE itemID=? with itemID=3847
[Sat Nov 23 08:00:46.779959 2024] [php:notice] [pid 2516:tid 2516] [client 10.5.5.1:33522] [DEBUG]Query Result mimeType: application/pdf
[Sat Nov 23 08:00:46.779994 2024] [php:warn] [pid 2516:tid 2516] [client 10.5.5.1:33522] PHP Warning:  iconv(): Wrong encoding, conversion from "UTF-8" to "ASCII//IGNORE" is not allowed in /var/www/zotero/model/Item.inc.php on line 3177
[Sat Nov 23 08:00:46.780139 2024] [php:notice] [pid 2516:tid 2516] [client 10.5.5.1:33522] [ERROR] Invalid mimeType format: . Setting to empty.
[Sat Nov 23 08:00:46.783755 2024] [php:notice] [pid 2516:tid 2516] [client 10.5.5.1:33522] [DEBUG]Parent item details: Zotero_Item Object ...
[Sat Nov 23 08:00:46.783799 2024] [php:notice] [pid 2516:tid 2516] [client 10.5.5.1:33522] [DEBUG]Parent item attachmentContentType:
[Sat Nov 23 08:00:46.783809 2024] [php:notice] [pid 2516:tid 2516] [client 10.5.5.1:33522] [DEBUG]Invalid attachment type:  for annotationType: highlight
[Sat Nov 23 08:00:46.784476 2024] [php:notice] [pid 2516:tid 2516] [client 10.5.5.1:33522] Parent item 1/TZD7L6UZ of highlight annotation must be a PDF attachment new in /var/www/zotero/model/Item.inc.php:1490 (POST /users/1/items) (390762127e)

The problem is here:

[Sat Nov 23 08:00:46.779994 2024] [php:warn] [pid 2516:tid 2516] [client 10.5.5.1:33522] PHP Warning:  iconv(): Wrong encoding, conversion from "UTF-8" to "ASCII//IGNORE" is not allowed in 

https://github.com/uniuuu/dataserver/blob/6ed5455e6b2c7a23d0c4c547c72e2486a596d469/model/Item.inc.php#L3136

    /**
     * Get the MIME type of an attachment (e.g. 'text/plain')
     */
    private function getAttachmentMIMEType() {
        if (!$this->isAttachment()) {
            trigger_error("attachmentMIMEType can only be retrieved for attachment items", E_USER_ERROR);
        }

        if ($this->attachmentData['mimeType'] !== null) {
            return $this->attachmentData['mimeType'];
        }

        if (!$this->id) {
            return '';
        }

        $sql = "SELECT mimeType FROM itemAttachments WHERE itemID=?";
        $stmt = Zotero_DB::getStatement($sql, true, Zotero_Shards::getByLibraryID($this->libraryID));
        $mimeType = Zotero_DB::valueQueryFromStatement($stmt, $this->id);
        if (!$mimeType) {
            $mimeType = '';
        }

        // TEMP: Strip some invalid characters
        $mimeType = iconv("UTF-8", "ASCII//IGNORE", $mimeType);
        $mimeType = preg_replace('/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u', '', $mimeType);

        $this->attachmentData['mimeType'] = $mimeType;
        return $mimeType;
    }
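
The effect of the failing conversion can be reproduced in isolation. The following is a minimal sketch, not the dataserver code: `stripInvalidChars` and its `$convert` parameter are hypothetical names, and `$convert` stands in for `iconv()` so that a failed conversion can be simulated on any system. It relies on two documented behaviors: `iconv()` returns `false` on failure, and `preg_replace()` coerces a `false` subject to an empty string.

```php
<?php
// Hypothetical stand-in for the "TEMP: Strip some invalid characters" step.
// $convert replaces iconv() so that a failing conversion can be simulated.
function stripInvalidChars(string $mimeType, callable $convert) {
    // iconv() returns false when it cannot perform the conversion
    $converted = $convert($mimeType);

    // A false subject is coerced to '', so a failed conversion silently
    // turns a valid value into an empty string
    return preg_replace(
        '/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u',
        '',
        $converted
    );
}

// Conversion succeeds: the value passes through unchanged
$ok = stripInvalidChars('application/pdf', fn($s) => $s);

// Conversion fails, as the iconv() warning in the log indicates
$broken = stripInvalidChars('application/pdf', fn($s) => false);

var_dump($ok);     // string(15) "application/pdf"
var_dump($broken); // string(0) ""
```

This would explain why the log shows the query returning application/pdf while the parent item's attachmentContentType is empty a moment later.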

I rewrote it as follows:

    private function getAttachmentMIMEType() {
        if (!$this->isAttachment()) {
            trigger_error("attachmentMIMEType can only be retrieved for attachment items", E_USER_ERROR);
        }

        if ($this->attachmentData['mimeType'] !== null) {
            return $this->attachmentData['mimeType'];
        }

        if (!$this->id) {
            return '';
        }

        $sql = "SELECT mimeType FROM itemAttachments WHERE itemID=?";
        $stmt = Zotero_DB::getStatement($sql, true, Zotero_Shards::getByLibraryID($this->libraryID));
        $mimeType = Zotero_DB::valueQueryFromStatement($stmt, $this->id);

        if (!$mimeType) {
            $mimeType = '';
        } else {
            try {
                // Fix: use mb_convert_encoding() instead of iconv() with "ASCII//IGNORE",
                // which fails on this server (see the warning in the log)
                $mimeType = mb_convert_encoding($mimeType, "ASCII", "UTF-8");
            } catch (Throwable $e) {
                // Throwable rather than Exception, so that Error subclasses are caught too
                error_log("[ERROR] mb_convert_encoding failed for mimeType: $mimeType. Error: " . $e->getMessage());
            }

            // Strip some invalid characters
            $mimeType = preg_replace('/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u', '', $mimeType);

            // Accept only "type/subtype"; "+" and "_" are allowed so that
            // values such as application/epub+zip are not rejected
            if (!preg_match('/^[a-zA-Z0-9\-\.\+_]+\/[a-zA-Z0-9\-\.\+_]+$/', $mimeType)) {
                error_log("[ERROR] Invalid mimeType format: $mimeType. Setting to empty.");
                $mimeType = '';
            }
        }

        $this->attachmentData['mimeType'] = $mimeType;
        return $mimeType;
    }
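
The same cleanup can also be written without any encoding conversion. The sketch below is one possible approach, not the upstream implementation: `normalizeMimeType` is a hypothetical helper, and it assumes stored values are bare `type/subtype` strings without parameters such as `; charset=utf-8` (a value with parameters would be rejected). The accepted characters follow the restricted-name rule of RFC 6838, which includes `+`, so `application/epub+zip` passes.

```php
<?php
// Hypothetical helper, not part of the dataserver code base.
function normalizeMimeType(?string $raw): string {
    if ($raw === null || $raw === '') {
        return '';
    }

    // Drop every byte outside printable ASCII; no iconv flags involved
    $clean = preg_replace('/[^\x20-\x7E]/', '', $raw);
    $clean = strtolower(trim((string) $clean));

    // type "/" subtype, using RFC 6838 restricted-name characters
    $name = '[a-z0-9][a-z0-9!#$&^_.+-]*';
    if (!preg_match('~^' . $name . '/' . $name . '$~', $clean)) {
        return '';
    }
    return $clean;
}

var_dump(normalizeMimeType('application/pdf'));         // string(15) "application/pdf"
var_dump(normalizeMimeType('application/epub+zip'));    // string(20) "application/epub+zip"
var_dump(normalizeMimeType("application/pdf\xC2\xA0")); // string(15) "application/pdf"
var_dump(normalizeMimeType('not a mime type'));         // string(0) ""
```

Lowercasing is a design choice here: media type names are case-insensitive, while the `in_array()` comparison in the annotation check is not.
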
Amniotic3 commented 1 week ago

Checking the server-side database, I can see that mimeType holds the expected value (application/pdf),

but the "// TEMP: Strip some invalid characters" step replaces that value with an empty one.

When an annotation is then uploaded during sync, the annotation check raises the error,

because by the time in_array($parentItem->attachmentContentType, ['application/pdf', 'application/epub+zip', 'text/html']) runs, attachmentContentType is already empty.

So the server reports that the parent item is not a PDF attachment, regardless of whether the annotated document actually is a PDF.

https://github.com/uniuuu/dataserver/blob/6ed5455e6b2c7a23d0c4c547c72e2486a596d469/model/Item.inc.php#L1476
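
To make the last step concrete, here is a simplified stand-in for the annotation check. It is a sketch only: `annotationParentAllowed` is a hypothetical function that mirrors the two `in_array()` calls quoted earlier in this thread, and annotation types other than note, highlight, and underline are not modelled.

```php
<?php
// Hypothetical, simplified version of the parent check in Item.inc.php.
function annotationParentAllowed(string $annotationType, string $parentContentType): bool {
    $allowed = ['application/pdf', 'application/epub+zip', 'text/html'];

    if (in_array($annotationType, ['note', 'highlight', 'underline'], true)) {
        return in_array($parentContentType, $allowed, true);
    }

    // Other annotation types are outside the scope of this sketch
    return true;
}

// Content type read correctly: the highlight is accepted
var_dump(annotationParentAllowed('highlight', 'application/pdf')); // bool(true)

// Content type emptied by the failed conversion: the same highlight is
// rejected, although the database row still says application/pdf
var_dump(annotationParentAllowed('highlight', '')); // bool(false)
```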