sillsdev / ptx2pdf

XeTeX based macro package for typesetting USFM formatted (Paratext output) scripture files

Extraneous lines are printing on figures #244

Closed KimB2017 closed 3 years ago

KimB2017 commented 4 years ago

In 1.2.9 I noticed today that my illustrations all have two types of unwanted "borders" as printed in the PDF: (1) a light grey border on all four edges, and (2) a heavier black border on either top and/or bottom or one or both sides. This heavier border is never on more than two opposing sides.

Last month I successfully printed a rush job and later discovered that my ptxprint installation was incomplete (remember? due to having manually installed early on and then installing the .deb on top). At that time I saw the black border on 1-2 sides appear when I invoked "smaller" images from the project's figures folder. But when I used the full-size images there were no borders. So is this a regression?

FYI--here's a God thing! If I had had the current version of PTX Print that week I would not have succeeded in meeting a print shop deadline for John's Gospel to be taken to a remote community in PNG before the expat advisor went home for furlough! After I purged all the old files and installed ptxprint again the following week I was unable to recreate that JHN book until just today... and today it still has these ugly border lines.

[Screenshots: Selection_007, Selection_008]

davidg-sil commented 4 years ago

Can you check the images in a picture viewer/editor that will show you every pixel? We have had issues in the past with some "stock" pictures having black lines on the first/last pixel in the image file, which some viewers "automatically cleaned", but which XeTeX just reproduces faithfully.

If the smaller pictures were generated by ptxprint, there's obviously a bug...

mhosken commented 3 years ago

We have seen that a number of pictures actually do have the border lines in the picture itself. It can be hard to see this in a viewer with a black background. If you can, try viewing the images in a viewer that allows you to change the background colour.
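
One way to check for such lines without relying on a viewer is to compare each outermost row or column of pixels with the one just inside it. The following is only a minimal sketch of that idea, not anything from PTXprint: the function name `edge_report`, the `threshold` value, and the assumption that the image has already been loaded as a grid of 0-255 brightness values (for example with an imaging library) are all hypothetical.

```python
def edge_report(gray, threshold=60):
    """Flag edges that are much darker than the row/column just inside them.

    gray: list of rows, each a list of 0-255 brightness values.
    threshold: hypothetical cut-off for "much darker" (an assumption).
    Returns a dict mapping 'top', 'bottom', 'left', 'right' to True/False.
    """
    height, width = len(gray), len(gray[0])
    if height < 2 or width < 2:
        raise ValueError("image must be at least 2x2 pixels")

    def mean(values):
        return sum(values) / len(values)

    def column(x):
        return [row[x] for row in gray]

    # Each entry pairs the outermost line with its inner neighbour.
    pairs = {
        "top": (gray[0], gray[1]),
        "bottom": (gray[-1], gray[-2]),
        "left": (column(0), column(1)),
        "right": (column(-1), column(-2)),
    }
    return {
        name: mean(inner) - mean(outer) > threshold
        for name, (outer, inner) in pairs.items()
    }


# A white 6x6 image with a black line down the left-hand side.
sample = [[0] + [255] * 5 for _ in range(6)]
report = edge_report(sample)
```

An edge flagged `True` here suggests the line is in the image file itself rather than being added at typesetting time.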

mhosken commented 3 years ago

Do we want to provide some kind of crop capability for pictures, to remove the lines?

markpenny commented 3 years ago

Yes, I think we'll need an (advanced?) feature to crop (approx.) 3 pixels from the border of (all?) images. I'm guessing we can do this in the Python code (in the same place where we convert tif to jpg).
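
As a rough illustration of what a fixed-margin crop might look like, here is a sketch that works on a plain grid of pixels. It is not the PTXprint code: `crop_border`, `margin` and `enabled` are hypothetical names, and in practice the same thing would more likely be done with an imaging library's crop call during the tif to jpg conversion.

```python
def crop_border(pixels, margin=3, enabled=True):
    """Return a copy of the pixel grid with `margin` pixels removed per side.

    pixels: list of rows, each a list of pixel values.
    enabled: hypothetical on/off switch standing in for a user option.
    Images too small to crop safely are returned unchanged.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    too_small = height <= 2 * margin or width <= 2 * margin
    if not enabled or margin <= 0 or too_small:
        return [list(row) for row in pixels]
    return [
        list(row[margin:width - margin])
        for row in pixels[margin:height - margin]
    ]


# A 10-row by 8-column grid whose values record their own (row, column).
grid = [[(y, x) for x in range(8)] for y in range(10)]
cropped = crop_border(grid)
```

Because the crop is blind, it removes real picture content as readily as a stray line, which is why an on/off switch is included in the sketch.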

KimB2017 commented 3 years ago

I inspected HK00097B.TIF (1996 file date so it's not something recently introduced). I could see no line on any side in GIMP–zoomed way in, or in Wasta's default Image Viewer–with the background set to white. But the PNG file shows a fairly heavy line on the pitchfork-side of the image (it's flipped here but the line is present when it's not flipped as well). Can you see a line in the image file using another viewer?

I can't attach a TIF file here in github; I used GIMP to export it as PDF and as JPG and I don't see a line in those either.

HK00097B.PDF

[Image: hi-res fig in pdf]

davidg-sil commented 3 years ago

I'm also seeing a light line on the top, bottom and 'tree-side' of the image. I expect that either the original PNG file was converted using a program with a bug (smoothing with pixels off the edge of the image is a fairly common mistake) or someone wanted to put a frame on the picture. Neither of these issues is the fault of PTXprint / XeTeX, and I don't know how reasonable it would be for PTXprint to provide a solution.

MP: regarding 'crop 3 pixels', depending on the image, that might break something. Some images I've seen have details right to the edge of the image. To my mind, this is a one-time operation best done in an external program, on a case-by-case basis. Alternatively, it could be done by PTXprint as a special one-off operation, but again, it would have to be file-specific. Possibly it would be part of an 'image tweaking' dialogue that would allow making lower resolution images from a high resolution master, or converting colour to greyscale by picking from several grey-scale masks (generic ones may not be good, if the image contrasts in hue but not brightness). I'd highly recommend that the feature-set be kept very limited, and that tooltips suggest gimp / other free tools for more complex operations, like colour -> line art transformations.

markpenny commented 3 years ago

The primary users that I have in mind for PTXprint have enough trouble locating images on their system to include in scripture (i.e. they have very basic skills). They wouldn't have the skills (or time) to download and install a graphics editing system, and tweak and re-save the images without the offending border. So if we are going to help them, then it needs to just happen for them automatically (or at worst, only after the checking of an option in PTXprint). Given that this is primarily happening in low-resolution (badly converted) .jpg files, I don't think we need to worry too much about cropping 1, 2 or 3 pixels off the edges of images, especially if that option can be disabled. But I would need a hand (from @mhosken) to do so in our Python code. Presumably it would happen at the same time that we pad the sides of images so that they fit wider spaces. The sad thing is that these lines never used to appear in PT7, but at one point the XeTeX code which ships with Paratext got updated (just before PT8 was released, IIRC) and then they started to appear. So the "XeTeX improvement" ended up being a regression as far as picture handling goes. Part of me would like to figure out which underlying library got changed and why, and go back to how it was, but that's way beyond what I'm capable of.

davidg-sil commented 3 years ago

I didn't think the padding was needed these days, not since scale got introduced? As for it changing... I seem to remember reading that this was a bug in the picture-handling of SOME image types, depending on colour space, so that some got cropped and others didn't, i.e. it was just a fluke. I can't remember what language I did it in, but I once added some auto-trim code to something... IIRC it looked at two corners, and if they were similar (+/- 2 on r/g/b), it then checked to see if the whole edge was that colour. If so, remove the edge. But I'm sure there are existing bits of Python code to autocrop images "out there" on the net.
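
The auto-trim idea described above could be sketched roughly like this. It is a reconstruction from that description only, not the original code and not what PTXprint does: the names `autotrim`, `is_uniform` and `similar`, the `max_trim` cap, and the assumption that pixels arrive as a grid of (r, g, b) tuples are all hypothetical.

```python
def similar(a, b, tol=2):
    """True if two (r, g, b) pixels differ by at most tol per channel."""
    return all(abs(x - y) <= tol for x, y in zip(a, b))


def is_uniform(edge, tol=2):
    """Compare the two corners first, then the whole edge against one corner."""
    if not similar(edge[0], edge[-1], tol):
        return False
    return all(similar(pixel, edge[0], tol) for pixel in edge)


def autotrim(pixels, tol=2, max_trim=3):
    """Strip single-colour edges from a grid of (r, g, b) tuples.

    max_trim caps the number of passes (an assumption), so a picture with
    a plain background is not eaten away entirely.
    """
    rows = [list(row) for row in pixels]
    for _ in range(max_trim):
        trimmed = False
        if len(rows) > 2 and is_uniform(rows[0], tol):
            rows, trimmed = rows[1:], True
        if len(rows) > 2 and is_uniform(rows[-1], tol):
            rows, trimmed = rows[:-1], True
        if len(rows[0]) > 2 and is_uniform([r[0] for r in rows], tol):
            rows, trimmed = [r[1:] for r in rows], True
        if len(rows[0]) > 2 and is_uniform([r[-1] for r in rows], tol):
            rows, trimmed = [r[:-1] for r in rows], True
        if not trimmed:
            break
    return rows


# 4 rows by 5 columns: a black line down the left, varied colour elsewhere.
image = [
    [(0, 0, 0)] + [(50 * x + 10 * y, 100, 200) for x in range(1, 5)]
    for y in range(4)
]
result = autotrim(image)
```

Note that this approach cannot tell an unwanted line from a legitimately plain edge such as a white margin, so the cap on passes (or a per-file opt-out) matters.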

mhosken commented 3 years ago

Border cropping is now an option. It will analyse the image and guess where the border lines are and crop the image to remove them.