Open DaAwesomeP opened 2 years ago
I have tried to reproduce this issue. Not every JPEG of those dimensions or larger produces an oversized PDF, but certain JPEGs do. I tried two JPEGs (4000x6000) of 6 MB, and the exported PDF was 82 MB. I also tried a different JPEG (4400x6216) of 5.88 MB, and the exported PDF was 8.89 MB. Can I work on this issue?
sure
Can you tell me where I can discuss the issue? On Discord, the forum, or somewhere else?
Here, on the forum or on Discord.
Not sure what there is to discuss though.
Since the issue has been reported here, maybe try to keep the discussion contained here? That way the debugging and solution steps remain well documented and searchable, and eventually all that information will be linkable to a pull request. Personally, I am not in the Discord so you will only be able to get a hold of me here.
I'm happy to hear it is reproducible! A good place to start with debugging this would be to see how the PDF gets exported, which tools/libraries are used to do it, and what options/flags that tool might have available. I have yet to look into it at all myself.
@DaAwesomeP can you confirm whether the JPEGs you were using came from a DSLR or a professional camera?
I have created a topic for this issue on the Joplin forum: https://discourse.joplinapp.org/t/pdf-with-jpg-selected-exports-oversized-pdf-github-7314/28419
The images came from a Google Pixel 4a 5G. I'm not certain which settings were enabled on the phone.
Might be related: https://bugs.chromium.org/p/chromium/issues/detail?id=801430 @DaAwesomeP do you have any custom CSS?
@roman-r-m No, my Joplin is unmodified and installed via AppImage. That issue seems to suggest that EXIF and JFIF JPEGs may produce different results if that bug still exists.
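For anyone who wants to check which kind of JPEG they have before exporting, here is a minimal sketch that walks the JPEG header segments and reports whether a JFIF (APP0) or Exif (APP1) block is present. The names `inspectJpegMetadata`, `JpegMetadataInfo`, and `hasAsciiAt` are hypothetical and not part of Joplin; the sketch assumes a well-formed baseline file and stops at the first start-of-scan marker.

```typescript
// Hypothetical helper, not part of Joplin: reports which metadata
// containers appear in a JPEG's header segments.
interface JpegMetadataInfo {
  hasJfif: boolean;
  hasExif: boolean;
}

// Returns true if the ASCII string `text` occurs at `offset` in `bytes`.
function hasAsciiAt(bytes: Uint8Array, offset: number, text: string): boolean {
  if (offset + text.length > bytes.length) return false;
  for (let i = 0; i < text.length; i++) {
    if (bytes[offset + i] !== text.charCodeAt(i)) return false;
  }
  return true;
}

function inspectJpegMetadata(bytes: Uint8Array): JpegMetadataInfo {
  // Every JPEG starts with the SOI marker 0xFFD8.
  if (bytes.length < 2 || bytes[0] !== 0xff || bytes[1] !== 0xd8) {
    throw new Error("Not a JPEG: missing SOI marker");
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const info: JpegMetadataInfo = { hasJfif: false, hasExif: false };
  let pos = 2;
  while (pos + 4 <= bytes.length && bytes[pos] === 0xff) {
    const marker = bytes[pos + 1];
    // Stop at SOS (compressed image data follows) or EOI.
    if (marker === 0xda || marker === 0xd9) break;
    // Big-endian segment length; it counts its own two bytes.
    const length = view.getUint16(pos + 2);
    if (length < 2) break; // malformed segment, give up
    if (marker === 0xe0 && hasAsciiAt(bytes, pos + 4, "JFIF\0")) info.hasJfif = true;
    if (marker === 0xe1 && hasAsciiAt(bytes, pos + 4, "Exif\0\0")) info.hasExif = true;
    pos += 2 + length;
  }
  return info;
}
```

In Node, the bytes can come from `fs.readFileSync(path)`, since a `Buffer` is a `Uint8Array`.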
I haven't been able to replicate it so far, so can only guess. Any chance you could share one of those huge pdfs?
The issue does not appear with every JPEG, only with certain ones.
@roman-r-m OK, Gist with photos (in Gist instead of attaching here to avoid compression/modification) and exported PDFs here: https://gist.github.com/DaAwesomeP/1e2359f73334471184d670f59ec21abc
I can confirm this is an EXIF issue. If I run `exiftool -EXIF= original.jpg` on the image first, then the issue goes away and the PDF is the expected size.
In the Gist, the 11.4 MB file `export_original.pdf` is an export of `original.jpg`. The 2.7 MB file `export_stripped.pdf` is an export of `stripped.jpg`. You can see that the export of the file without EXIF data is effectively the same size as the original image, as expected. Note that in this example the PDF did not balloon to 60+ MB as this is a very simple, mostly white background photo that I took for this issue. More complicated photos definitely get much, much larger.
Please excuse my phone not properly rotating/applying metadata to rotate the image.
In this case I'm not sure what can possibly be done on the Joplin side as it relies on Electron/Chrome for creating PDFs.
There was an idea to replace Chrome's built-in PDF converter with a third-party library, but I doubt it's going to be done anytime soon, if at all.
As a temporary workaround, there may be a simple way to remove the EXIF data before exporting. I will test more closely and try to figure out exactly which EXIF fields are causing this issue and propose a lightweight solution (obviously don't want to include something as large as ImageMagick). I think it's fine to remove some EXIF data from exported PDFs, as extracting images from PDFs and expecting the same EXIF data is somewhat niche. Chrome may already remove some of the data in the export process.
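As a rough illustration of how lightweight such a workaround could be, here is a minimal sketch that drops the Exif (APP1) segments from a JPEG byte array without any image library. It is similar in spirit to the `exiftool -EXIF=` test above but is not the same tool, and whether it avoids the oversized export has not been verified here. The function names `stripExifSegments` and `matchesAscii` are hypothetical, nothing like this exists in Joplin today, and the sketch assumes a well-formed file. Note that dropping EXIF also drops the orientation tag, so rotated photos may display differently.

```typescript
// Returns true if the ASCII string `text` occurs at `offset` in `bytes`.
function matchesAscii(bytes: Uint8Array, offset: number, text: string): boolean {
  if (offset + text.length > bytes.length) return false;
  for (let i = 0; i < text.length; i++) {
    if (bytes[offset + i] !== text.charCodeAt(i)) return false;
  }
  return true;
}

// Hypothetical helper, not part of Joplin: returns a copy of the JPEG
// with every APP1 "Exif" header segment removed. All other bytes are
// copied through unchanged, so the image data is not re-encoded.
function stripExifSegments(bytes: Uint8Array): Uint8Array {
  if (bytes.length < 2 || bytes[0] !== 0xff || bytes[1] !== 0xd8) {
    throw new Error("Not a JPEG: missing SOI marker");
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const kept: Uint8Array[] = [bytes.subarray(0, 2)]; // keep the SOI marker
  let pos = 2;
  while (pos + 4 <= bytes.length && bytes[pos] === 0xff) {
    const marker = bytes[pos + 1];
    // Stop at SOS (compressed image data follows) or EOI.
    if (marker === 0xda || marker === 0xd9) break;
    // Big-endian segment length; it counts its own two bytes.
    const length = view.getUint16(pos + 2);
    const end = pos + 2 + length;
    if (length < 2 || end > bytes.length) break; // malformed, copy the rest as-is
    const isExif = marker === 0xe1 && matchesAscii(bytes, pos + 4, "Exif\0\0");
    if (!isExif) kept.push(bytes.subarray(pos, end));
    pos = end;
  }
  kept.push(bytes.subarray(pos)); // scan data and anything after it

  // Concatenate the kept pieces into one new array.
  const total = kept.reduce((sum, part) => sum + part.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of kept) {
    out.set(part, offset);
    offset += part.length;
  }
  return out;
}
```

Because only header segments are skipped, the output should be roughly the size of the input minus the EXIF block.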
I can potentially look into replacing Chrome's built-in PDF converter with a third-party library too, but this is obviously a much bigger task.
Perhaps something to report to the Electron repo? We use `webContents.printToPDF()` to export to PDF.
@laurent22 I began to submit an issue just now, but it seems that Electron v19 is EOL. Maybe updating (if possible) would help to resolve the issue?
I am trying to export a PDF of a note that has two images in it. The images are JPEGs at very high resolution (3024x4032), but each is less than 2.4 MB in file size. These photos were taken on a Google Pixel 4a 5G.
I have verified that Joplin copies them with this small file size by right-clicking the images and clicking "reveal in file folder." When I export this note as a PDF, the export is 66 MB. If I try to print to a file using the print menu and the file printer in CUPS, Joplin crashes.
Environment
Joplin version: 2.8.8
Platform: AppImage
OS specifics: openSUSE LEAP 15.3 x64
Steps to reproduce
Describe what you expected to happen
PDF should maintain good JPEG compression of source images.
Logfile
The logs show loading notes and syncing but nothing about the export except for this line:
I am hesitant to share the log since it contains notes and keys, but I can go through it and redact if necessary.