eneam / mboxviewer

A small but powerful app for viewing MBOX files

Feature request: Print to PDF with all attachments #50

Closed by mikedepetris 9 months ago

mikedepetris commented 11 months ago

Hi, I wanted to print selected e-mails and export all their attachments. Ideally there would be an option to produce a PDF with clickable relative links to an "attachments" folder that contains all the files, or even to embed the attachments in the PDF for some formats, such as images or PDFs. To get a similar result, I exported all attachments to a folder, used "Print to HTML", and then edited all file:/// absolute links into relative links. This gives me an HTML file with a folder of relatively linked attachments, but it is still not a PDF. Did I explain it well enough?
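
The manual step described above, turning file:/// links into relative ones, could be scripted. Here is a minimal Python sketch, assuming the exported HTML uses double-quoted href attributes and that every attachment has been copied into a single folder next to the HTML file. The function name `relativize_links` and the folder name `attachments` are illustrative only and are not part of MBox Viewer.

```python
import re

# Matches href="file:///<any path>/<file name>" and captures the file name.
# Assumption: attributes are double-quoted, as in typical generated HTML.
FILE_HREF = re.compile(r'href="file:///(?:[^"]*/)?([^"/]+)"')


def relativize_links(html: str, attachments_dir: str = "attachments") -> str:
    """Rewrite absolute file:/// hrefs so they point into attachments_dir."""
    # The captured file name is kept exactly as written (still percent-encoded).
    return FILE_HREF.sub(
        lambda m: f'href="{attachments_dir}/{m.group(1)}"', html
    )


if __name__ == "__main__":
    sample = '<a href="file:///C:/mbox/AttachmentCache/report%20v1.pdf">report</a>'
    print(relativize_links(sample))
```

Links that do not start with file:/// are left untouched, so the function can be run over the whole exported file.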

zigm commented 11 months ago

Not sure what version you are running. Both v1.0.3.39 and v1.0.3.40 should support clickable links to attachments in the header.

However, different browsers handle links in PDF differently. Firefox doesn't seem to support links to local files. Chrome and Edge support links to local files in PDF. However, they behave differently.

In Chrome you can click on the link and it will open the link in the same tab. You need to click Back to return to the PDF file. You can also right click on the link and select "open link in new tab".

In Edge you need to right click on the link and select "open link in new tab".

In v1.0.3.40 links should be relative by default. You can change links to absolute by unchecking the "File -> General Options Config -> Relative Inline Attachments Path" option.

Let me know if that helps and/or if you need a different option.

mikedepetris commented 11 months ago

Thanks, I'm running 1.0.3.40 32bit Unicode

With "Relative Inline Attachments Path" checked I get the relative path for the links when I do "Print to HTML", but when I use "Print to PDF" I always get an absolute path.

Relative: "M_04.2a_EDU_modulo_proposta_attivita_e_preiscrizione_vrs1.pdf"

It may be better not to start the path with "..\"; otherwise the file must stay in a subfolder.

It would also be helpful to have an option to avoid the target="_blank"

In the PDF it's always absolute like: /URI (file:///C:/mbox/UMBoxViewer/C/mbox/berenicegiordano.qualitas@gmail.com%20Tutti%20i%20messaggi%20compresi%20Spam%20e%20Cestino-mbox/AttachmentCache/20211022-203135-0000691%20M%2004.2a%20EDU%20modulo%20proposta%20attivita%20e%20preiscrizione_vrs1.pdf)>>

zigm commented 11 months ago

I need to look at the code, but I would expect relative paths in the PDF too. The PDF file is generated from HTML. When I print to PDF and select the open folder option, I can view the HTML file that Edge or Chrome used to print to PDF. The HTML files have relative links. Not sure how they end up as absolute in the PDF file, but I will double check. Looks incorrect.

Currently there is no export option that would bundle all files into a single email-specific folder, for example to pass on to other users. You need to zip the Attachment folder together with the Print folder.

Yes, the path to attachments is relative to the folder containing the PDF file. You need to open the PDF from that folder.

I did not try any option other than target="_blank". Not sure what omitting it will do.

Can I assume that links work for you as I described?

I will provide an update on absolute links in PDF file. How did you verify links are absolute?

mikedepetris commented 11 months ago

Yes, it is all correct. To check the links in the PDF file, I just opened the file with a basic viewer or editor, for example Notepad++, where you can easily see the links as /URI entries.
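
The same check can be done without a text editor. This is a rough Python sketch that scans the raw bytes of a PDF for `/URI (...)` entries. It assumes the link annotations are stored uncompressed as literal strings, as in the example quoted earlier; it is not a real PDF parser and will miss links stored in compressed object streams or as hex strings. The names `find_uri_links` and `is_absolute_file_link` are made up for this illustration.

```python
import re

# /URI followed by a PDF literal string in parentheses.
# Backslash escapes such as \) are allowed inside the string.
URI_ENTRY = re.compile(rb"/URI\s*\(((?:\\.|[^\\)])*)\)")


def find_uri_links(pdf_bytes: bytes) -> list:
    """Return the raw targets of all uncompressed /URI entries."""
    return [m.group(1).decode("latin-1") for m in URI_ENTRY.finditer(pdf_bytes)]


def is_absolute_file_link(uri: str) -> bool:
    """True for links such as file:///C:/..., False for ../AttachmentCache/..."""
    return uri.lower().startswith("file:///")


if __name__ == "__main__":
    sample = b"<</S /URI /URI (file:///C:/mbox/AttachmentCache/a.pdf)>>"
    for link in find_uri_links(sample):
        kind = "absolute" if is_absolute_file_link(link) else "relative"
        print(kind, link)
```

For a real file, the bytes could be read with `pathlib.Path("email.pdf").read_bytes()`, where `email.pdf` is a placeholder name.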

zigm commented 11 months ago

I am afraid I don't have good news regarding support for relative links in PDF. MBox Viewer relies on Chrome and/or Edge to convert HTML to PDF, using the headless mode supported by Chrome and Edge as follows:

chrome --headless --disable-gpu --print-to-pdf=pdfFolderPath/email.pdf htmlFolderPath/email.html
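
For anyone who wants to reproduce the conversion outside MBox Viewer, the command above can be wrapped in a short Python sketch. It only restates the flags shown in the command line. The browser executable name (`chrome`) is an assumption and varies by platform and installation, and the helper names `build_print_command` and `print_to_pdf` are hypothetical; this is not how MBox Viewer itself invokes the browser.

```python
import subprocess


def build_print_command(browser: str, html_path: str, pdf_path: str) -> list:
    """Build the headless print-to-PDF argument list shown above."""
    return [
        browser,
        "--headless",
        "--disable-gpu",
        f"--print-to-pdf={pdf_path}",
        html_path,
    ]


def print_to_pdf(browser: str, html_path: str, pdf_path: str) -> None:
    """Run the browser; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_print_command(browser, html_path, pdf_path), check=True)


if __name__ == "__main__":
    # Only prints the command; call print_to_pdf(...) to actually run it.
    print(build_print_command("chrome", "htmlFolderPath/email.html",
                              "pdfFolderPath/email.pdf"))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.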

Unfortunately, relative links in the HTML file are automatically converted to absolute paths. From searching the Internet, it appears to be a known issue. URI links in PDF don't seem to support links relative to the current/document directory.

URL links relative to the current/document directory are supported in HTML documents.

It is possible to specify a default base URL for all relative links in an HTML document, for example:

"<base href=\"file://C:/Export/AttachmentCache/\" />";

But that does not help resolve the issue.

I experimented with /URI in the PDF file. I updated /URI as follows:

/URI (../AttachmentCache/20090923-030753-0193507%20P1010082.JPG)>>

and was able to view attachments using Adobe Acrobat DC but not Chrome or Edge.

Unfortunately there seems to be no solution to support relative links in PDF as they work in HTML.

PDF supports internal links to parts of the same PDF document and external links to files residing outside it.

Support for internal links would work, but unfortunately all documents would have to be embedded in the PDF document. This can be done using non-free Adobe tools but not with Chrome or Edge.

The current v1.0.3.40 supports appending only images to the end of the email, via "File -> Attachments Config". There is support in HTML to accomplish that.

mikedepetris commented 11 months ago

Thank you zigm for the complete report and research on this issue. These conclusions are really helpful for understanding what can and can't be done, and for quickly deciding what to do when you need to extract a selection of mails to send to people who do not use tools like MBox Viewer and who need access to the attachments too. At the moment the easiest solution is "Print to HTML", with relative links pointing to a folder that holds all the attached files. It MAY be useful to have a menu option that does this in a single click, or at least without changing configuration options, along with some explanatory text about what will be produced. Thanks again.

zigm commented 10 months ago

A couple of comments.

  1. Looking at your initial email, I retested the absolute/relative links configuration for HTML files. It looks like links are set to relative by default in HTML files, so there should be no need to update the links. If you print to HTML, attachments should also be created in the AttachmentCache folder. There seems to be an issue when you explicitly open "File->General Options Config" and select OK while "Relative Inline Attachments File Path" is set. This updates the registry, and when you restart MBox Viewer, links in the HTML file will be absolute instead of relative. The workaround is to set "Relative Inline Attachments File Path" to true again after the restart. This will be fixed in the next release.

  2. Assuming MBox Viewer generates relative links, you need to copy the parent folder of the PrintCache folder, including both parentFolder/AttachmentCache and parentFolder/PrintCache, if you plan to move this data to another location. However, this will copy all files under these folders unless you delete the folders before you print to HTML.

  3. Do you think it would be useful to have an Export option that generates HTML and attachment files for selected mails only, plus an HTML index file for browsing the selected mails, presented as an HTML table with columns such as Date, Subject, From, To?

  4. Regarding an option to avoid the target="_blank": I will add this option. The user will need to hit the Back button to return to the original page.
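
To make the index file idea in item 3 concrete, here is a small Python sketch of the kind of HTML table meant there. It is only an illustration of the proposal, not the implementation in MBox Viewer. The input format (a list of dicts with a `file` key holding the relative path of each exported mail) and the function name `build_index` are assumptions.

```python
from html import escape

COLUMNS = ("Date", "Subject", "From", "To")


def build_index(mails: list) -> str:
    """Return an HTML table with one row per mail; Subject links to the mail."""
    header = "".join(f"<th>{col}</th>" for col in COLUMNS)
    rows = []
    for mail in mails:
        cells = []
        for col in COLUMNS:
            text = escape(mail[col])
            if col == "Subject":
                # Relative link, so the export folder can be moved as a whole.
                href = escape(mail["file"], quote=True)
                text = f'<a href="{href}">{text}</a>'
            cells.append(f"<td>{text}</td>")
        rows.append("<tr>" + "".join(cells) + "</tr>")
    return "<table>\n<tr>" + header + "</tr>\n" + "\n".join(rows) + "\n</table>"


if __name__ == "__main__":
    print(build_index([{
        "Date": "2021-10-22", "Subject": "A & B",
        "From": "a@example.org", "To": "b@example.org",
        "file": "mail_1.html",
    }]))
```

Omitting target="_blank" from the generated links, as in this sketch, makes them open in the same tab, which matches the behavior described in item 4.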

mikedepetris commented 10 months ago

Thank you zigm for going deeper.

  1. This explains the trouble I had understanding what was happening in my first attempts. If a restart is needed, a warning should be given when closing the dialog.
  2. I had to work out the behavior myself; maybe some explanatory text could be added somewhere within the functions and in the help file.
  3. It is exactly what I would like to have.
  4. It would be helpful, even though one could easily edit the HTML files, which is much easier than editing a PDF.

zigm commented 9 months ago

Released new v1.0.3.41 to address the issues you raised. Added an option to export selected mails, available when you right click on single or multiple selected mails. You can also configure how linked documents are opened.

mikedepetris commented 9 months ago

Thanks, I quickly tested it and it works well. I noticed that I get the alert -Directory "xxx\ExportCache" exists already... Select No to copy directory to another location-

If I select "No" and point to another location I find myself back at the same alert and nothing seems to have happened.

zigm commented 9 months ago

Thanks for testing and providing feedback. If you select No, you need to copy/move the directory to another location yourself. And yes, you will be back at the same alert, and now you need to select Yes. I suppose that after No, MBox Viewer could skip the additional check and overwrite the directory. This could be made clearer in the Help text.

mikedepetris commented 9 months ago

Ah, OK, now I understand. Thank you.