bimusiek closed this issue 7 years ago
Hi, pls help me understand: what is qr?
You can always invert colors with the pixmap method invertIRect(irect).
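To illustrate what such an inversion does to the pixel data, here is a minimal sketch in plain Python. The helper name invert_samples and its parameters are hypothetical and not part of PyMuPDF. It assumes 8-bit samples packed with n bytes per pixel, with an optional trailing alpha byte that is left untouched.

```python
def invert_samples(samples: bytes, n: int, alpha: bool = False) -> bytes:
    """Invert the color components of packed 8-bit pixel data.

    samples: packed pixel bytes, n bytes per pixel
    n:       bytes per pixel, including the alpha byte if present
    alpha:   True if the last byte of each pixel is an alpha value
    """
    if n <= 0 or len(samples) % n != 0:
        raise ValueError("length of samples must be a multiple of n")
    out = bytearray(samples)
    colors = n - 1 if alpha else n  # leave the alpha byte untouched
    for start in range(0, len(out), n):
        for i in range(start, start + colors):
            out[i] = 255 - out[i]  # black (0) becomes white (255) and vice versa
    return bytes(out)


# Hypothetical data: one black and one white grayscale pixel.
inverted = invert_samples(bytes([0, 255]), n=1)
```

With a real pixmap, the library method mentioned above should be preferred; this sketch only shows the arithmetic involved.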
Other than that, users have had similar issues when their MuPDF was not built exclusively and completely from the third-party software included in the MuPDF package, i.e. when the build mixed libraries from their system with those of MuPDF.
If you want, you can send me your PDF so I can help investigate.
Please also be explicit with your OS / Python / PyMuPDF version.
Any news on this?
Hey @JorjMcKie, can you share your email? The PDF is an old passbook, but it contains our client's details, so I cannot share it publicly.
I have taken a look at the PDF in the meantime: MuPDF (mutool extract) and hence PyMuPDF indeed do not reproduce this image (again a barcode) correctly. The outcome is instead a pure black image, because the mask that is also stored in the PDF is not applied. In contrast, Nitro PDF, for example, does this correctly.
I will submit an issue to MuPDF and continue looking in the library for a way out.
Thanks a lot for investigating 👍
Hi again,
I found out that mutool extract does indeed find the barcode image and its masking image, which it treats as an independent pixmap.
I have experimented a little and found that the following script re-creates the original barcode image with correct coloring.
In what follows, test.pdf is your anonymized PDF, containing the barcode image as PDF object number 6. Object number 7 is the corresponding SMask image.
What this script does is create an RGBA samples area, where the RGB values are taken from the pixmap of image 6 and the alpha values from the samples of the pixmap of object 7.
When the pixmap created from these new samples (a bytearray called ba) is saved as test.png, it shows the original.
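The sample-merging step described here can be sketched in plain Python. This is only an illustration of the interleaving, not the script from the thread. It assumes the image pixmap holds RGB data without alpha (3 bytes per pixel), the mask pixmap holds one byte per pixel, and both have the same dimensions. The function name merge_rgb_alpha is hypothetical.

```python
def merge_rgb_alpha(rgb: bytes, alpha: bytes) -> bytearray:
    """Interleave RGB samples with alpha samples into RGBA samples.

    rgb:   3 bytes per pixel (assumed: samples of the image pixmap)
    alpha: 1 byte per pixel  (assumed: samples of the SMask pixmap)
    """
    if len(rgb) != 3 * len(alpha):
        raise ValueError("rgb must hold exactly 3 bytes per alpha byte")
    ba = bytearray(4 * len(alpha))
    ba[0::4] = rgb[0::3]  # red components
    ba[1::4] = rgb[1::3]  # green components
    ba[2::4] = rgb[2::3]  # blue components
    ba[3::4] = alpha      # alpha values taken from the mask
    return ba


# Hypothetical data: two pixels and their two mask values.
ba = merge_rgb_alpha(bytes([1, 2, 3, 4, 5, 6]), bytes([9, 8]))
```

The resulting bytearray would then still have to be turned into a pixmap and saved with PyMuPDF, which is not shown here.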
Thanks to your inquiry, I now have a few things to extend PyMuPDF with ... :-)
@bimusiek I have added functionality to solve your problem - hopefully in a fairly elegant way:
Page.getImageList() and Document.getPageImageList() now provide the xref of the soft mask image (the /SMask parameter) as the second item of each image entry (it was useless before anyway). If this number is positive, then the image itself should be specially treated for creating PNGs.
If [xref, smask,..., "Im2"] is an entry of getPageImageList() and smask is positive, then the following statements should create the original Im2 png:
pix1 = fitz.Pixmap(doc, xref) # pixmap without alpha channel
pix2 = fitz.Pixmap(doc, smask) # this contains the alpha values in its samples
pix3 = fitz.Pixmap(pix1) # copy of pix1 with alpha channel added (new constructor)
pix3.setAlpha(pix2.samples) # fill its alpha values with pix2 samples (new method)
pix3.writePNG("Im2.png") # this should look right now!
Pixmap pix3 should now reflect the original image - and be what you were missing ... let me know your experiences.
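As a rough sketch of how the entries could be screened before applying the statements above, the following plain-Python helper picks out the images that carry a soft mask. It assumes only what is stated in this thread: each entry starts with xref followed by smask, and a positive smask means a mask is present. The function name masked_images and the sample entries are hypothetical.

```python
from typing import Iterable, Iterator, Sequence, Tuple


def masked_images(entries: Iterable[Sequence]) -> Iterator[Tuple[int, int]]:
    """Yield (xref, smask) for every image entry that has a soft mask."""
    for entry in entries:
        xref, smask = entry[0], entry[1]
        if smask > 0:  # positive value: xref of the /SMask image
            yield xref, smask


# Hypothetical, shortened entries; real entries contain more items.
sample_entries = [[6, 7, "Im2"], [9, 0, "Im3"]]
needs_mask = list(masked_images(sample_entries))
```

Each pair returned this way would be a candidate for the pix1 / pix2 / pix3 treatment, while images with smask equal to zero can be written out directly.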
Hey, I have a PDF that contains a QR code. However, when reading it, black and white are reversed (thus I cannot read the QR code).
Any idea how I can know when to invert the colors?