barbushin / php-imap

Manage mailboxes, filter/get/delete emails in PHP (supports IMAP/POP3/NNTP)
MIT License

Embedded images missing #694

Open oioix opened 1 year ago

oioix commented 1 year ago

PHP IMAP version: 5.0.1
PHP Version: 8.1
Type of execution: CLI or Web Server

I found three bugs affecting embedded (inline) image attachments. Below are the descriptions and a proposed fix.

Bug Nr. 1) Embedded images are broken regardless of whether $mailbox->setAttachmentsIgnore() is set to true or false. Created email: Screenshot 2022-12-28 | 00 25 29

Output of received email with php-imap: Screenshot 2022-12-28 | 00 26 33

The reason is that the function embedImageAttachments() in IncomingMail.php (line 208), which was written to display embedded images automatically, is never called.

I found a fix for that. But then I found ...

Bug Nr. 2) If embedImageAttachments() is called, the images appear, but they break again as soon as $mailbox->setAttachmentsIgnore() is set to true.

I could fix it, but then I found ...

Bug Nr. 3) Instead of three different images, the first image is displayed three times: the sources are not assigned to the correct img tags. I found the reason. Surprisingly, $attachments = $this->getAttachments(); in IncomingMail.php (around line 220) always returns only one attachment (the very first). This is strange, since it should be an array containing all attachments. I could not find out why, but I verified it carefully: the array contains only one key. Output will therefore fail whenever there are multiple attachments. The rest of the code was not right either, so I had to change a bit more here.

Fixed result: Screenshot 2022-12-28 | 02 16 12

Here is the solution:

CHANGE A) In IncomingMail.php, replace the whole function embedImageAttachments() (at the very end of the file) with this code:

    /**
     * Embed inline image attachments as base64 to allow for
     * email html to display inline images automatically.
     */
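    // Default is an empty array: the previous default of [false] made the loop below read properties on a boolean.
    public function embedImageAttachments(bool $ignoreAttachments = false, array $attachments = []): void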
    {
        $fetchedHtml = $this->__get('textHtml');

        // strip out embedded images ( detected by src="cid: in img tag) if ignore attachments is set
        if ($ignoreAttachments) $this->textHtml = preg_replace('/(<IMG(.+?)src=\"cid\:(.+?)\>)/mi', '', $this->textHtml);

        // attachments not ignored
        if (!$ignoreAttachments) {

            \preg_match_all("/\bcid:[^'\"\s]{1,256}/mi", $fetchedHtml, $matches);

            if (isset($matches[0]) && \is_array($matches[0]) && \count($matches[0])) {
                /** @var list<string> */
                $matches = array_unique($matches[0]); // <- remove duplicates

                $cidAry = [];
                $cid    = '';

                if (is_array($matches) && $matches)
                foreach ($matches as $match) {
                   $cidAry[] = \str_replace('cid:', '', $match);
                }

                if (is_array($attachments) && $attachments)
                foreach ($attachments as $attachment) {
                    /**
                     * Inline images can contain a "Content-Disposition: inline", but only a "Content-ID" is also enough.
                     * See https://github.com/barbushin/php-imap/issues/569.
                     */
                    if (in_array($attachment->contentId, $cidAry)) {
                        $cid         = $attachment->contentId;
                        $contents    = $attachment->getContents();
                        $contentType = $attachment->getFileInfo(FILEINFO_MIME_TYPE);

                        if (!\strstr($contentType, 'image')) {
                            continue;
                        } 
                        elseif (!\is_string($attachment->id)) {
                            throw new InvalidArgumentException('Argument 1 passed to '.__METHOD__.'() does not have an id specified!');
                        }

                        $base64encoded = \base64_encode($contents);
                        $replacement = 'data:'.$contentType.';base64, '.$base64encoded;

                        $this->textHtml = \str_replace('src="cid:'.$cid.'"', 'src="'.$replacement.'"', $this->textHtml);

                        /**
                         * TODO (enhancement): make the attachment removal on the next line a configurable
                         * option, because removed inline images are no longer available as regular
                         * attachments like the others.
                         *
                         * The call is therefore commented out, although the removal itself works.
                         * Uncomment it to drop embedded images from the attachment list once they are inlined.
                         */
                        //$this->removeAttachment($attachment->id);
                    }
                }
            }
        }
    }
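
To make the idea behind CHANGE A easier to test without a mailbox, here is a minimal stand-alone sketch of the same two steps: strip cid images when attachments are ignored, otherwise replace each cid reference with the data URI of the attachment that has the matching Content-ID. The function name embedCidImages() and the shape of the $images array are assumptions made for this illustration; they are not part of php-imap. Unlike the code above, the sketch writes the data URI without a space after "base64,".

```php
<?php
/**
 * Hypothetical stand-alone helper (NOT part of php-imap).
 *
 * @param string $html   HTML body that may contain <img src="cid:...">
 * @param array  $images raw image data keyed by Content-ID,
 *                       e.g. ['a@x' => ['mime' => 'image/png', 'data' => '...']]
 * @param bool   $ignoreAttachments when true, cid images are stripped instead of embedded
 */
function embedCidImages(string $html, array $images, bool $ignoreAttachments = false): string
{
    if ($ignoreAttachments) {
        // No attachment data is available, so drop the tags that would render as broken images.
        return preg_replace('/<img\b[^>]*\bsrc=(["\'])cid:[^"\']*\1[^>]*>/i', '', $html) ?? $html;
    }

    // Look up every cid reference individually, so each img tag gets its own image.
    return preg_replace_callback(
        '/\bsrc=(["\'])cid:([^"\'\s]{1,256})\1/i',
        static function (array $m) use ($images): string {
            $cid = $m[2];
            if (!isset($images[$cid]) || strpos($images[$cid]['mime'], 'image/') !== 0) {
                return $m[0]; // unknown or non-image Content-ID: leave the reference untouched
            }
            $uri = 'data:' . $images[$cid]['mime'] . ';base64,' . base64_encode($images[$cid]['data']);

            return 'src=' . $m[1] . $uri . $m[1];
        },
        $html
    ) ?? $html;
}

// Usage with two different inline images (the data strings stand in for real image bytes):
$images = [
    'a@x' => ['mime' => 'image/png', 'data' => 'AAA'],
    'b@x' => ['mime' => 'image/png', 'data' => 'BBB'],
];
echo embedCidImages('<img src="cid:a@x"><img src="cid:b@x">', $images), PHP_EOL;
```

Because the lookup happens per match, two different Content-IDs can never end up with the same source, which is the symptom described in Bug Nr. 3.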

CHANGE B) Call the function and pass it the ignore setting together with the correct attachments array. In Mailbox.php, in the function getMail(), around lines 1331/1332:

Change end of function from

        return $mail;
}

to:

        if($mail->hasAttachments() == true) $mail->embedImageAttachments($this->getAttachmentsIgnore(), $mail->getAttachments());

        return $mail;
}

oioix commented 1 year ago

Regarding CHANGE A in my previous message:

I improved the function embedImageAttachments() once more, since I realized that embedded images cannot be saved under their real name with a right click. The reason is the base64 encoding of the inline image data: it is used as the img src but, unlike an ordinary file, does not provide a filename.

Now a simple click (instead of a right click) on the image is enough: a prompt appears to save the image under its real (but sanitized) name. You can still right-click the image and select "Save image as..."; you will then notice the difference.

    /**
     * Embed inline image attachments as base64 to allow for
     * email html to display inline images automatically.
     */
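    // Default is an empty array: the previous default of [false] made the loop below read properties on a boolean.
    public function embedImageAttachments(bool $ignoreAttachments = false, array $attachments = []): void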
    {
        $fetchedHtml = $this->__get('textHtml');

        // strip out embedded images ( detected by src="cid: in img tag) if ignore attachments is set
        if ($ignoreAttachments) $this->textHtml = \preg_replace('/<IMG(.+?)(src=\"cid:(.+?)\")(.+?)\>/mi', '', $this->textHtml);

        // attachments not ignored
        if (!$ignoreAttachments) {

            \preg_match_all("/\bcid:[^'\"\s]{1,256}/mi", $fetchedHtml, $matches);

            if (isset($matches[0]) && \is_array($matches[0]) && \count($matches[0])) {

                $matches = \array_unique($matches[0]); // <- remove duplicates

                $cidAry = [];
                $cid    = '';

                if (\is_array($matches) && $matches)
                foreach ($matches as $match) {
                   $cidAry[] = \str_replace('cid:', '', $match);
                }

                $imgMatchesAry = $imgTagsAry = [];

                // start preparation for "Enable embedded img download 'on click'"
                \preg_match_all('/<IMG(.+?)src=\"cid:((.+?))\"(.+?)\>/si', $this->textHtml, $imgMatchesAry);

                if ($imgMatchesAry) {
                    $i = 0;
                    foreach ($imgMatchesAry[0] as $imgMatch) {
                        $imgTagsAry[$i]['imgTag'] = $imgMatch;
                        $i++;
                    }
                    $i = 0;
                    foreach ($imgMatchesAry[2] as $imgMatch) {
                        $imgTagsAry[$i]['imgCID'] = $imgMatch;
                        $i++;
                    }
                } // end preparation

                if (\is_array($attachments) && $attachments)
                foreach ($attachments as $attachment) {
                    /**
                     * Inline images can contain a "Content-Disposition: inline", but only a "Content-ID" is also enough.
                     * See https://github.com/barbushin/php-imap/issues/569.
                     */
                    if (in_array($attachment->contentId, $cidAry)) {
                        $cid         = $attachment->contentId;
                        $contents   = $attachment->getContents();
                        $contentType = $attachment->getFileInfo(FILEINFO_MIME_TYPE);

                        if (!\strstr($contentType, 'image')) {
                            continue;
                        } 
                        elseif (!\is_string($attachment->id)) {
                            throw new InvalidArgumentException('Argument 1 passed to '.__METHOD__.'() does not have an id specified!');
                        }

                        $base64encoded = \base64_encode($contents);
                        $b64replacement = 'data:'.$contentType.';base64, '.$base64encoded;

                        /**
                         * Enable embedded img download 'on click'.
                         * This wraps the image in an a-tag, so the image can be downloaded with a single click
                         * and saved under its real (sanitized) name.
                         * This is helpful because the img src is base64 encoded, so a right-click download
                         * always saves such images under the same name 'index.extension'.
                         */
                        $enableOnClickDl = true;
                        if($enableOnClickDl) {

                            $attachmentName = \basename($attachment->name);

                            $attachmentName = trim(\preg_replace(
                                '/[^\p{L}\s\d\-_,:\|\[\]\(\).\']|([\"\'\`]+)+|(\.[[:alnum:]]{3,4})(\.(.+?)+|\.|(.+?)\.)|'.
                                '^[\.\' ]+|([\.\/]+\s+)+|(\.\.)+/iux','_', $attachmentName),'./ ');

                            foreach ($imgTagsAry as $k=>$imgTag) {
                                if($cid == $imgTag['imgCID']) {

                                    $prepend = '<a href="'.$b64replacement.'" download="'.$attachmentName.'">';
                                    $append = '</a>';

                                    $this->textHtml = \str_replace($imgTag['imgTag'], $prepend.$imgTag['imgTag'].$append, $this->textHtml);
                                }
                            }
                        }

                        // replace src with base64 img
                        $this->textHtml = \str_replace('src="cid:'.$cid.'"', 'src="'.$b64replacement.'"', $this->textHtml);

                        /**
                         * TODO (enhancement): make the attachment removal on the next line a configurable
                         * option, because removed inline images are no longer available as regular
                         * attachments like the others.
                         *
                         * The call is therefore commented out, although the removal itself works.
                         * Uncomment it to drop embedded images from the attachment list once they are inlined.
                         */
                        //$this->removeAttachment($attachment->id);
                    }
                }
            }
        }
    }
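
The download-on-click part can also be tried in isolation. The following is only a sketch under my own assumptions: safeDownloadName() and linkAndEmbedImage() are hypothetical helpers (not php-imap functions), and the sanitizer is deliberately stricter than the pattern used above. It matches each img tag by its cid, swaps in the data URI and wraps the tag in one pass, which avoids wrapping the same tag twice when a Content-ID occurs more than once.

```php
<?php
/**
 * Hypothetical helper (NOT part of php-imap): reduce an attachment name to a safe file name.
 */
function safeDownloadName(string $name): string
{
    $name = basename(str_replace('\\', '/', $name));                 // drop any directory part
    $name = preg_replace('/[^\p{L}\p{N} ._-]+/u', '_', $name) ?? ''; // keep letters, digits, space, dot, underscore, hyphen
    $name = trim($name, '. ');

    return $name !== '' ? $name : 'image';                           // fallback name is an assumption
}

/**
 * Hypothetical helper: replace src="cid:..." with the data URI and wrap the img tag
 * in a link carrying a download attribute.
 */
function linkAndEmbedImage(string $html, string $cid, string $dataUri, string $fileName): string
{
    $cidSrc = 'src="cid:' . $cid . '"';
    $open = '<a href="' . htmlspecialchars($dataUri, ENT_QUOTES) . '" download="'
        . htmlspecialchars(safeDownloadName($fileName), ENT_QUOTES) . '">';
    $pattern = '/<img\b[^>]*\b' . preg_quote($cidSrc, '/') . '[^>]*>/i';

    return preg_replace_callback(
        $pattern,
        static function (array $m) use ($cidSrc, $open, $dataUri): string {
            // Embed the image first, then wrap the finished tag.
            $img = str_ireplace($cidSrc, 'src="' . $dataUri . '"', $m[0]);

            return $open . $img . '</a>';
        },
        $html
    ) ?? $html;
}

// Usage: the short data URI stands in for a real base64-encoded image.
echo linkAndEmbedImage(
    '<p><img src="cid:a@x" alt="logo"></p>',
    'a@x',
    'data:image/png;base64,QUFB',
    '../My "logo".png'
), PHP_EOL;
```

Matching on the cid rather than on the data URI keeps the regular expression short even for large images.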

Click on the image (this solution, improved behavior): Screenshot 2022-12-29 | 20 58 15

Right-click on the image and select "Save image as..." (standard behavior): Screenshot 2022-12-29 | 21 23 46