PhilterPaper / Perl-PDF-Builder

Extended version of the popular PDF::API2 Perl-based PDF library for creating, reading, and modifying PDF documents
https://www.catskilltech.com/FreeSW/product/PDF%2DBuilder/title/PDF%3A%3ABuilder/freeSW_full

Sizing of images and XObjects #201

Closed PhilterPaper closed 1 year ago

PhilterPaper commented 1 year ago

@sciurius said on 2023-09-21 (in PDF::API2 68):

It is nice that images and other objects can be placed with a single call:

$gfx->object( $thing, $x, $y, ... )

When I have a $thing that is 100x100 and needs to be scaled 50%, then apparently I still do need to distinguish images (that need $w=50 and $h=50) from other objects (that need $scale_x=0.5 and $scale_y=0.5).

Am I missing something?

I agree that it's annoyingly inconsistent to treat image objects and other XObjects in different ways. The code says that for

$content->object($xo, $x, $y, $scale_x|width, $scale_y|height)

if $xo is an image object, it sees the final two arguments as width and height (in points, I think). Otherwise (it's an XObject), treat the final two arguments as scaling factors against the "natural size" of the object. This does preclude you from scaling an image, or giving an absolute size (in points) of an XObject.

Would it be an improvement to allow an option to tell object() that the final two arguments are a size or a scaling factor?

sizes => 'points'   treat as absolute size in points (as currently done for images)
sizes => 'scale'    treat as scaling factors, default 1.0

Or would something else be better? Note that if 'sizes' is used, I think you could still have optional arguments. Of course, I have no control over what goes into PDF::API2...
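To make the proposal concrete, here is a minimal sketch of how a `sizes` option could be resolved into the two trailing values that object() expects. It relies only on the behaviour described in this thread (images take a width and height, other XObjects take scale factors). The helper name `trailing_args` and its parameter list are hypothetical; nothing like it exists in PDF::Builder or PDF::API2.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of PDF::Builder or PDF::API2.
# $kind            'image' or 'form'
# $nat_w, $nat_h   natural size of the object, in points (must be non-zero)
# $val_x, $val_y   the caller's two values, interpreted according to $sizes
# $sizes           'points' (absolute size) or 'scale' (factors)
# Returns the two trailing values in the form object() wants for $kind.
sub trailing_args {
    my ($kind, $nat_w, $nat_h, $val_x, $val_y, $sizes) = @_;
    # With no option given, keep the current behaviour for each kind
    $sizes //= ($kind eq 'image') ? 'points' : 'scale';

    my ($w, $h) = $sizes eq 'scale'
        ? ($nat_w * $val_x, $nat_h * $val_y)   # factors -> points
        : ($val_x, $val_y);                    # already in points

    return ($w, $h) if $kind eq 'image';       # images: width, height
    return ($w / $nat_w, $h / $nat_h);         # other XObjects: factors
}

# A 100x100 object shown at half size, once as an image and once as a form
my @img  = trailing_args('image', 100, 100, 0.5, 0.5, 'scale');
my @form = trailing_args('form',  100, 100, 50,  50,  'points');
print "image: @img\n";    # image: 50 50
print "form:  @form\n";   # form:  0.5 0.5
```

With something along these lines, the caller states the intent once and no longer needs to know which kind of object it is holding.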

I presume this comes about from your SVG-to-PDF converter, where the original source may be an SVG image, but you will end up treating it as an XObject? Conversely, an SVG bar code might technically be an image, but a user probably wouldn't think of it in that manner (same for MathJax codes, but it could go either way for GNUplot graphs).

sciurius commented 1 year ago

Actually, the question comes from ChordPro, as a side-effect from the SVG-to-PDF conversion. Earlier I only had images to deal with, now an 'image' can also be an XObject and I want to treat them alike.

I decided to add a sub to take away the troubles. Also, I find it more intuitive to pass the mandatory arguments ( object, x, y ) as parameters, and the arguments with decent defaults in an %options hash. This also means that leaving out width and height for an image will use the natural image dimensions instead of 72.

add_object.pl.txt

PhilterPaper commented 1 year ago

Yeah, keeping the size|scale as options probably would have been better, but to change that now would break compatibility, which is why I didn't suggest doing it that way. Maybe I'll just add a new method to PDF::Builder similar to your add_object() code. I'll have to think about it before doing anything, and I have lots on my plate right now. With an option, one thing to consider is someone trying to specify an absolute size (points) for x, and scale for y (or vice versa). I don't know if that makes sense to do, or how it should be handled. Perhaps the option should be an array of sizes or scale factors, rather than two separate options times two types.

PhilterPaper commented 1 year ago

Posted by sciurius:

Also confusing is that images have width and height methods, and other objects do not. OTOH XOForms have a bbox method, which images do not have.

Sigh. Yet more complications in trying to get a consistent interface. Perhaps it would be best to leave images and xobjects alone (with their separate interfaces), and just have a common display method as discussed above. I'm not sure there would be any point in overloading the (now separate) image and xobject interfaces just to get one consistent interface -- that ship has either sailed, or sunk in dock. Images and xobjects have long been treated as different things, so at this point I'm not convinced there's any real need to converge their interfaces. (I'm not sure if that was what JV was after, here, or if he was just making an observation about the two interfaces.)

PhilterPaper commented 1 year ago

@sciurius would it be useful to add width and height to XOforms (as an alternative to bbox), and add bbox to images (as an alternative to width and height)? Would that converged interface serve any purpose? I'm assuming that it would be as simple as something along the lines of bbox=(0,0, w,h) and (w,h)=bbox[2..3]. However, if these are not options but required parameters, that could get a little messy.

sciurius commented 1 year ago

Adding a width and height method to XObject::Forms would be nice. Beware that (w,h) is not bbox[2..3]. The bounding box holds the coordinates of two diagonal corners of the box (lower left and upper right), so the size is the difference between them.

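package PDF::Builder::XObject::Form {
  # bbox() returns (LLx, LLy, URx, URy)

  # width: distance between the left and right edges
  sub width {
    my $self = shift;
    my @bb = $self->bbox();
    return $bb[2] - $bb[0];
  }

  # height: distance between the bottom and top edges
  sub height {
    my $self = shift;
    my @bb = $self->bbox();
    return $bb[3] - $bb[1];
  }
}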

For symmetry, images could get a bbox method. That would return (0,0,w,h).
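A short worked example of why the last two bbox entries are not a size, using a bounding box whose lower-left corner is away from the origin. The free-standing helper `bbox_size` is a hypothetical name used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-alone helper; takes (LLx, LLy, URx, URy)
sub bbox_size {
    my ($llx, $lly, $urx, $ury) = @_;
    return ($urx - $llx, $ury - $lly);
}

my @bb = (10, 20, 110, 70);        # lower-left corner is not at the origin
my ($w, $h) = bbox_size(@bb);
print "width=$w height=$h\n";      # width=100 height=50
print "bbox[2..3] = @bb[2,3]\n";   # 110 70: corner coordinates, not a size
```

Only when the lower-left corner is (0,0) do the two coincide, which is the image case.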

PhilterPaper commented 1 year ago

Ask, and ye shall receive! This code is not yet in GitHub, but is ready to go...

added to Form: width

$form = $form->width($w)  Set, with LLx = 0 and URx = $w

$form = $form->width($w, 'LL')  Set, with flag to change LLx, preserve URx

$form = $form->width($w, 'UR')  Set, with flag to change URx, preserve LLx

$w = $form->width()  Get

Set or Get the width of the object, as an alternative to the bbox method. The optional setting flag says if either corner's x is to be preserved. The y values are unchanged. If setting, the form object is returned, to permit chaining.

height

$form = $form->height($h)  Set, with LLy = 0 and URy = $h

$form = $form->height($h, 'LL')  Set, with flag to change LLy, preserve URy

$form = $form->height($h, 'UR')  Set, with flag to change URy, preserve LLy

$h = $form->height()  Get

Set or Get the height of the object, as an alternative to the bbox method. The optional setting flag says if either corner's y is to be preserved. The x values are unchanged. If setting, the form object is returned, to permit chaining.
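The three "Set" variants are easier to compare side by side. The sketch below applies the rules as listed above for width() to a plain bbox array; it is not the code referred to in this comment, and `set_bbox_width` is a hypothetical name. The abs() call reflects the statement further down that negative values are forced positive.

```perl
use strict;
use warnings;

# Hypothetical stand-alone version of the width() "Set" rules listed above.
# Takes the current bbox (array ref), the new width, and an optional flag;
# returns a new bbox array ref. The y values are left unchanged.
sub set_bbox_width {
    my ($bbox, $w, $flag) = @_;
    my ($llx, $lly, $urx, $ury) = @$bbox;
    $w = abs($w);                     # negative widths are forced positive

    if (!defined $flag) {             # no flag: LLx = 0, URx = $w
        ($llx, $urx) = (0, $w);
    } elsif ($flag eq 'LL') {         # change LLx, preserve URx
        $llx = $urx - $w;
    } elsif ($flag eq 'UR') {         # change URx, preserve LLx
        $urx = $llx + $w;
    } else {
        die "unknown flag '$flag'";
    }
    return [$llx, $lly, $urx, $ury];
}

my $bb = [10, 20, 110, 70];
print join(',', @{ set_bbox_width($bb, 50) }), "\n";         # 0,20,50,70
print join(',', @{ set_bbox_width($bb, 50, 'LL') }), "\n";   # 60,20,110,70
print join(',', @{ set_bbox_width($bb, 50, 'UR') }), "\n";   # 10,20,60,70
```

The height() rules are the same with the y entries in place of the x entries.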

added to Image: bbox

$image = $image->bbox($LLx,$LLy, $URx,$URy)  Set

$image = $image->bbox(0,0, $w,$h)  Set

(0,0, $w,$h) = $image->bbox()   Get

Set or Get the image dimensions similarly to a form's bounding box. Note that the LL x and y will always be 0 on Get, and on Set will be handled as (0,0, $URx-$LLx,$URy-$LLy). You may wish to do this in order to work with the corners of the image on the page (more or less its bounding box), rather than the zero-origin dimensions, but only the width and height will be preserved! This method is offered as an alternative to the width and height methods. If setting, the image object is returned, to permit chaining.

I wanted to run a few issues by you:

  1. Be aware that the form width() and height() allow you to specify which corner(s) are set, rather than forcing both.
  2. Is a negative width or height meaningful? In the width() and height() form methods, I force them to positive, but I wanted to check if there might be legitimate uses for negative form width or height. I don't check the bbox entries for a resulting negative width or height, but that could be added. Likewise for image.
  3. Likewise, is a 0 width or height ever meaningful? These would have to be errors that just quit the method (possibly with a message). Likewise for image.
  4. Be aware that when setting a width/height/bbox, it returns the calling object so that it can be chained. Would it be better to return the value(s) just set?
  5. I did not implement bbox($w, $h) for Image, as that's not really a bounding box (implied 0,0 LL corner). Would it still be useful to add this?

PhilterPaper commented 1 year ago

Well, that's interesting... I just tried doing some more examples on my code, to test it, and it fails. I only tested images so far -- do you have a good example of resizing a form?

Anyway, using the width() or height() methods to resize an image appears to mangle it -- nothing recognizable. Since images are normally resized in the image() call, is there any point to resizing through width() or height()? bbox() (just added to image) faithfully tracks size changes from width() and height(), and vice-versa, so reading image dimensions seems to work fine. It's just that setting an image's size (by any method) seems to have problems. Can you confirm that resizing an image (via width() or height()) does or does not work?

sciurius commented 1 year ago

On Fri, 29 Sep 2023 09:09:33 -0700, Phil Perry wrote:

Ask, and ye shall receive! This code is not yet in GitHub, but is ready to go...

I think you are making things far too complex. Get is useful, setting form width and height is not meaningful. Actually you prove that by having to add all the variants.

Images are created with e.g.

$object = $pdf->image( $file, %options );

The created object has content and, therefore, dimensions. Setting the width or height is pointless, unless you are performing an actual resize of the image. Not really useful; you can specify the modified dimensions when you place the image. Also you may need additional libraries for image manipulation.

  1. Be aware that the form width() and height() allow you to specify which corner(s) are set, rather than forcing both.

As stated this is too complex to be useful. Setting the bbox is easier and better defined.

  2. Is a negative width or height meaningful?

I'm inclined to say no.

  3. Likewise, is a 0 width or height ever meaningful?

Yet another reason to not use width()/height() to set dimension.

  4. Be aware that when setting a width/height/bbox, it returns the calling object so that it can be chained. Would it be better to return the value(s) just set?

Why return a value that you just set? Chaining is better. Returning the old value is also meaningful.

  5. I did not implement bbox($w, $h) for Image, as that's not really a bounding box (implied 0,0 LL corner). Would it still be useful to add this?

No.

Basically the only meaningful extensions are width() and height() accessors (not mutators) for form objects.

-- Johan

sciurius commented 1 year ago

do you have a good example of resizing a form?

Anyway, using the width() or height() methods to resize an image

As I pointed out, accessors are useful, mutators are not.

PhilterPaper commented 1 year ago

OK, the code (Image width() and height()) has long claimed (both in the POD and code) that it can get or set such values (both accessor and mutator). I simply added the ability to use bbox() to do the same thing. It sounds like I should remove the ability of width() and height() (as well as bbox()) to set dimensions, and only get (access) dimensions. Does this sound good? I was wondering if it made any sense to change the dimensions of an image object.

As for form objects, it sounds like you think that they should also be only get (access) values. Analogous to Image, I added width() and height() as an alternative to bbox(). Again, should I remove setting ability?

sciurius commented 1 year ago

The longer I think about it, the more I regret bringing up the subject. I think that the trigger (the $gfx->image(...) deprecation) is wrong and misleading. Images and XObjects are two different beasts and I think they should be dealt with differently.

An Image is created with content and, hence, dimensions. The width() and height() methods reflect the dimensions. When placed on the page it stretches from the origin of placement (the $x and $y arguments of the object() function) up and to the right. You can pass a width and height to the object function and the image will be scaled (by the PDF engine). And yes, you can pass negative values for width and height so the image will be mirrored.

An Object is a user coordinate space that can contain objects. Its bounding box determines what part of the coordinate space will be visible. Changing the bounding box changes the visible part of the coordinate space but does not scale contents. When placed on the page it stretches from the bottom left bb point to the upper right bb point, and either point can be arbitrary w.r.t. the origin of placement.

So it is misleading to associate a pseudo bounding box with an image, saying it is ( 0, 0, width, height ), because changing this pseudo bounding box would require content resizing, which is against the concept of bounding box.

On the other hand, there is nothing wrong with a width() and height() method for objects but it has limited advantages. You cannot do much with an object width and height, for fitting and placement you also need the object origin, so you need the bounding box anyway.

So my current mindset is

Therefore I withdraw the requests in https://github.com/PhilterPaper/Perl-PDF-Builder/issues/201#issuecomment-1739367117 and apologize for the amount of energy you may have already spent on it.

PhilterPaper commented 1 year ago

Therefore I withdraw the requests in https://github.com/PhilterPaper/Perl-PDF-Builder/issues/201#issuecomment-1739367117 and apologize for the amount of energy you may have already spent on it.

No problem... we all go off on wild goose chases now and then! At least I learned a little bit more about the code I maintain.

Since the image width() and height() mutators ("Set") appear to break the image, would there be any problem if I force those two to be only accessors ("Get")? It's easy enough to do. Evidently no one has ever tried using them to change the image dimensions. I also tried them with formimage(), and it still scrambled the images.

It sounds like you think the new image bbox() (accessor) is not useful and there's no point in adding it. No problem taking it out. Agree?

Regarding XOform objects, I'm not familiar enough with their usage to say whether or not a bbox() mutator ("Set") works OK. I have used formimage, which seems to be handled much like a regular image. If I understand you, you seem to think that the existing bbox() works OK and both the accessor and mutator should be left alone, but that width() and height() accessors and mutators aren't useful and should not be added -- correct?

Overall, it sounds like the only change in the end will be to remove the image mutators because they don't work, and not to change or add anything else.

I added the object() method to keep compatibility with PDF::API2, but prefer to use image() to output images. As I said before, I'm really not experienced with handling objects (XO), so I can't speak to that.

sciurius commented 1 year ago

Since the image width() and height() mutators ("Set") appear to break the image, would there be any problem if I force those two to be only accessors ("Get")?

Actually I never realised that you could use these to set width/height. And they only change the image width/height internally, not the image. Since the image is a long (width*height) stream of pixels changing either will cause interesting effects. For example, try

$img->width($img->width-1);

But apparently it has always been this way (and people may use this for artistic purposes) so I would advise a clear notice in the docs.
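The "interesting effects" can be reproduced with a toy model: a flat stream of pixels that is cut into rows using whatever width is declared. This is only an illustration of the principle, assuming row-by-row storage as described above; it does not use PDF::Builder, and the `rows` helper is hypothetical.

```perl
use strict;
use warnings;

# Cut a flat pixel stream into rows of the declared width (toy illustration)
sub rows {
    my ($width, @pixels) = @_;
    my @rows;
    while (@pixels) {
        push @rows, join '', splice(@pixels, 0, $width);
    }
    return @rows;
}

# A 4x3 "image" with a vertical bar in the second column, stored row by row
my @stream = split //, '.#..' x 3;

print "$_\n" for rows(4, @stream);   # declared width matches the data
print "\n";
print "$_\n" for rows(3, @stream);   # declared width reduced by one
```

With the correct width the bar stays vertical. With the width reduced by one, each row starts one pixel early relative to the data, so the bar drifts sideways from row to row, which is the kind of scrambling reported earlier in the thread.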

Regarding XOform objects,

For XOForm objects, setting the bbox is a natural and well-defined operation. Adding width and height methods is not really useful.

I added the object() method to keep compatibility with PDF::API2,

Keeping compatibility is good. Please. And I don't think the image() call will go away at any time so it is okay to use this for images.

Looking at the PDF::API2 code for image() and object() there are a few (undocumented) differences but I cannot estimate the impact.

PhilterPaper commented 1 year ago

OK, I will deprecate the image width() and height() "Set" mutators, and eventually remove them if no one comes up with a good reason to keep them. No other changes.

Closing... thanks for the useful input!