Open Mr-Kanister opened 2 years ago
@Mr-Kanister @laurent22 Can I work on it?
Sorry for the late answer. Yes, you can work on it, I can't commit since I do not know this language yet :)
Hi @Mr-Kanister, I am not able to reproduce the bug. I am using Okular (PDF viewer) on Ubuntu, though, and it is working fine. Which PDF viewer are you using? cc @laurent22
I am able to successfully reproduce this on Ubuntu 20.04. Copying the link I can see that the PDF is there, but the viewer is not able to open it (offline/online). I'll try to dig deeper into it.
@Mr-Kanister It'd be really helpful if you could mention the PDF Viewer you are using.
@Mr-Kanister @laurent22 Looks like an issue with how Electron implements printing to PDF really, since `contents.printToPDF()` is called under the hood and everything is handled there, so it depends on the `pdfData` returned in the `InterOpServiceHelper`. Assuming the returned `pdfData` is not modified afterwards, this really seems to be an Electron issue more than a Joplin issue. I'll try to find out more about the issue and report back soon, maybe with a possible fix (tweaking the options may help?).
Hi @kshitij86, can you specify the PDF viewer you are using?
I really don't think there is any issue with `contents.printToPDF()`. I might be wrong, but it behaves differently in different PDF viewers, e.g. in Okular it works fine. I guess it depends on whether the PDF viewer is capable of handling an embedded PDF or not.
Okay, Hi!
I have now tested it again with three pdf viewers and got partly different results: on Windows 10 with Sumatra PDF v3.3.3 I cannot open the pdf by clicking, but I can right-click and copy a link, enter it in Firefox and so read the file. Exactly the same happens in Okular v20.12.3 under Debian Bullseye. Again on Windows 10, I can click the link directly in the PDF Annotator and am redirected to Firefox. However, when I open the parent pdf directly in Firefox, there is no link to click or copy.
This pdf link to copy is virtually the plain pdf file and in my individual case is about three million characters long. If I want to paste this somewhere, my PC first has to calculate for a very long time and Firefox becomes altogether very jerky. In Kate, I can't even highlight any of this string without my PC having to calculate for ten seconds.
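The link length reported above is consistent with the attachment being inlined as a base64 `data:` URI: base64 emits 4 characters for every 3 bytes. A minimal sketch of that arithmetic follows; the function names are hypothetical and this is not Joplin code.

```typescript
// Hypothetical helpers, for illustration only (not part of Joplin).

// Length of a `data:` URI that embeds `byteLength` bytes as base64.
// Base64 emits 4 output characters per 3 input bytes, rounded up.
function dataUriLength(byteLength: number, mimeType: string): number {
  const prefix = `data:${mimeType};base64,`;
  const base64Chars = 4 * Math.ceil(byteLength / 3);
  return prefix.length + base64Chars;
}

// Inverse estimate: approximate embedded file size (in bytes)
// from the length of the copied link. Padding is ignored.
function embeddedBytesFromLinkLength(linkLength: number, mimeType: string): number {
  const prefix = `data:${mimeType};base64,`;
  return Math.floor(((linkLength - prefix.length) * 3) / 4);
}
```

Under these assumptions, a link of three million characters with an `application/pdf` prefix corresponds to an embedded file of roughly 2.25 MB.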
I think it's very likely that this is not the fault of Joplin but of Electron, but this behaviour is still annoying, to say the least. How should we proceed? Is this bug report out of place here?
Greetings!
@Mr-Kanister While I think @laurent22 might be able to better guide us on this issue, the fact that it works in some viewers suggests it may also come down to how the PDF viewers handle opening the file.
any updates on this issue ?
Can i work on this project to find and remove some bugs.
You can of course but try to understand and replicate the issue first
I was able to reproduce this issue. IMO the most reasonable solution is to make the links appear as hyperlinks, but not clickable, and of course not include the linked media (files, PDFs, images, etc.) in the PDF package. What do you think?
> IMO the most reasonable solution is to make the links appear as hyperlinks, but not clickable
The PDFs are embedded in the document so ideally if you click on the link it would open that embedded PDF. But I don't know if that's even possible. Maybe the task is to investigate first if it can be done at all. If it cannot, then we shouldn't embed the PDFs to begin with, and indeed disable the links.
> Maybe the task is to investigate first if it can be done at all
I'll be working on investigating this, and when I'm done, I'll do a PR, thanks for your time!
Here's what I came up with:
The issue doesn't originate from Electron's `contents.printToPDF()` function. This function simply prints the content of a web page. The real problem arises during the conversion of a note to HTML, which is then used by `printToPDF()`. This conversion takes a note object and transforms it into HTML, including any attached files as raw data.
Example of Embedded Link in a Note:
[fileName.txt](:/xxx)
converted to:
<a data-from-md="" title="_resources/xxx.txt" href="data:text/plain;base64,/*some data*/" download="xxx.txt">fileName.txt</a>
The `href` attribute contains the file's data, so opening it in a new browser tab opens the original file. This behavior causes issues when exporting to HTML or PDF with large attached files, resulting in very large output files.
One approach to address this is to remove the `href` attribute if it starts with `data:`. This won't affect images since they are rendered using the `<img>` tag.
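The approach described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `stripDataHrefs` is a hypothetical helper, not the code from the PR, and it uses regular expressions for brevity, so it assumes quoted attribute values that contain no `>` character. A real implementation would more likely walk the rendered HTML with a proper parser.

```typescript
// Hypothetical sketch, not Joplin's implementation.

// Matches an opening <a ...> tag (assumes no ">" inside attribute values).
const ANCHOR_OPEN_TAG = /<a\b[^>]*>/gi;

// Matches an href attribute whose quoted value starts with "data:".
const DATA_HREF_ATTR = /\s+href\s*=\s*(?:"data:[^"]*"|'data:[^']*')/i;

// Removes `href="data:..."` from anchors so the link text stays visible
// but is no longer clickable. <img> tags and ordinary links are untouched.
function stripDataHrefs(html: string): string {
  return html.replace(ANCHOR_OPEN_TAG, (tag) => tag.replace(DATA_HREF_ATTR, ""));
}

// Example input modelled on the anchor shown above, with a short payload.
const exampleInput =
  '<a data-from-md="" title="_resources/xxx.txt" ' +
  'href="data:text/plain;base64,SGVsbG8=" download="xxx.txt">fileName.txt</a>';
const exampleOutput = stripDataHrefs(exampleInput);
```

Note that this drops the embedded payload entirely, which is the trade-off discussed in the replies that follow.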
But this approach then completely removes every attachment (excluding images). I expect PDFs to be included instead! This is supported by PDFs (https://community.adobe.com/t5/acrobat-discussions/embedding-pdf-files-documents-inside-a-adobe-acrobat-pdf/m-p/4674928).
It doesn't! Here is a PDF result of my approach. I included a normal link, a link to a file, a PDF, and an image:
https://drive.google.com/file/d/1eZjRWzpFKmWsoACxVM3yP-_RdN7-2PWp/view?usp=sharing
The original note contained this:
[abdalah_elhdad_resume_go.pdf](:/b7f40e4e8ce646ceb2bc1d12fb3d2a88)
[summary of lasttime debugging.txt](:/29bfdbf0640c41f28893e2e9952b9777)
fsdfsdfsdffff
![mermaid-1710357604361.png](:/37d15b6759b34fa8802a63afa6a7cf96)
[normalLink](https://www.youtube.com/)
large file
[userguide.pdf](:/613a3e6e370047acbe41875a0da91b24)
...yes and everything but the image and the "normal link" got removed. As a user, I'd want to have the rest included, too.
There must be a way to do so, as it is supported by the pdf format.
> It doesn't
But we already know that it doesn't - that's the point of this issue. Now, what can we do about it? What did you try to make embedded PDFs work?
If Adobe Acrobat can do it, maybe there's a way to format the HTML or setup Electron to make it work. Or maybe not, but from your comments it sounds like you tried the existing feature, saw that it doesn't work and didn't try much else.
> from your comments it sounds like you tried the existing feature, saw that it doesn't work and didn't try much else.
If you mean me, then yes, I haven't tried anything else. If there really is no alternative, then of course it's also a bug fix to remove the feature.
> If you mean me, then yes, I haven't tried anything else
I was actually answering 7adidaz since he's interested in working on this issue.
> If Adobe Acrobat can do it, maybe there's a way to format the HTML or setup Electron to make it work. Or maybe not, but from your comments it sounds like you tried the existing feature, saw that it doesn't work and didn't try much else.
I have researched whether this is possible, i.e. a single PDF file with a hyperlink that, when clicked, opens another embedded PDF file. IMO, and based on the research I did, it's not possible outside an environment like Adobe Acrobat.
I have seen the attached Adobe guide on this. When the region or the link is clicked inside Adobe Acrobat, it opens the attachment; outside it, it doesn't, as I show in the demo.
What do you think? Should we go with the safe route and just disable attachment links as I did in the PR?
Just curious: Does it work in Firefox?
With the PDF Annotator (that's paid software, I'm happy to test things and report them, so you don't have to buy it...) I can add attachments to pdfs. Those aren't clickable links, but attachments like E-Mail attachments. In Firefox those get displayed in the sidebar:
In Dolphin a pop up appears:
But in SumatraPDF they aren't viewable and Edge isn't displaying them either: https://answers.microsoft.com/en-us/microsoftedge/forum/all/edge-and-pdfs-with-attachements/0d9f4536-6dd7-400c-83f8-1d2066648930
This is the file: Test.pdf
The files currently output by Joplin don't expose the attached media as processable entities in Adobe, Firefox, or Evince. Here is an example file: export_w_media.pdf. Try to extract the attached data!
But the files output by Adobe do show up as attachments in both Firefox and Evince. Here is a test file: output_from_adobe.pdf
`Evince` is the default document viewer that comes with Ubuntu. The attached `Test.pdf` shows the attachment in Firefox, Adobe, and Evince.
@Mr-Kanister I hate mentions, but I updated my comment.. sorry I misunderstood you! :)
So quick summary of the current situation:
The first allows positioning an area which, when clicked, may (depending on the viewer) lead to this attachment, while the second only displays attachments in an "attachment window" without a positional reference. Neither is supported by all viewers (I expected this, as not all viewers display comments/annotations either).
I was researching this, and I found a lib that can be used to attach files to a PDF. What is the policy here about using another package?
I think ideally we should not simply attach files to pdf but make links to those files from within the document work.
Doing so may require rewriting the whole pdf export logic.
Can you explain what you mean by `links to those files`? Like the case of Adobe, where the linked PDF opens when a region is clicked?
In Joplin you create a link in your note, and clicking on the link opens the document. From a quick glance at the lib that you linked above, it seems to attach a file to the PDF without creating a link (most likely there is a way to do that as well; that lib seems pretty good).
I get you... yep, this will require more work on extracting the PDFs for sure :)
I will be applying to GSOC this year, I was interested in the "PDF annotations" so I'll include this in my research, and if I get accepted and I have time at the end of the season, I'll work on it!
Environment
Joplin version: 2.6.10
Platform: Windows 10
OS specifics: 21H1
Steps to reproduce
Describe what you expected to happen
The linked PDF file can't be opened. I expected that the linked PDF would open. If you look at the file size of the exported PDF, the linked PDF is there! I also read that the PDF gets created from the HTML, and when I export to HTML instead, I am able to open the linked PDF.