gettalong / hexapdf

Versatile PDF creation and manipulation for Ruby
https://hexapdf.gettalong.org

Dealing with page rotation #264

Closed brettwgreen closed 11 months ago

brettwgreen commented 11 months ago

I've got this PDF that I'm trying to draw a box on... all of my box coordinates are determined based on orientation of the page as viewed by the user just looking at the document.

However, I've got this one page that has [:Rotate] = 90 that's killing me mentally. I managed to translate and draw the box correctly myself by doing math on the coordinates but feel like there must be an easier way especially when I encounter other degrees of rotation.

Is there a way to work with the page as if it were oriented correctly, with (x, y) being the bottom left of the page as it's seen in a viewer? Because then the next problem I had was adding text on 'top' of the box which, of course, instead of rendering the way I expected... it read from top to bottom (like, tilt your head to the right to read it).

I tried variations of, for example:

canvas.rotate(-90) do
  box.draw(canvas, lower_left_x, lower_left_y)
end

The last page of the attached document has the rotated page. sample_doc.pdf

Update: A little bit better code example... Here I try and draw two rectangles using method kinda documented here: https://hexapdf.gettalong.org/documentation/api/HexaPDF/Content/Canvas.html#method-i-rotate

def draw_boxes
  file = "/Users/brett/Documents/sample_doc.pdf"
  out = "/Users/brett/Documents/sample_doc_boxes.pdf"
  doc = HexaPDF::Document.open(file)
  page = doc.pages[doc.pages.count-1]
  rectangle = [300, 400, 100, 50] 
  canvas = page.canvas(type: :overlay)
  canvas.stroke_color('red').rectangle(*rectangle).stroke
  canvas.rotate(90) do
    canvas.stroke_color('purple').rectangle(*rectangle).stroke
  end
  doc.write(out)
end

When I run this code I ONLY see the red rectangle... the purple disappears or is drawn outside the boundary of the page or something. In any case, I only see one rectangle.
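
A likely reason the purple rectangle vanishes: Canvas#rotate rotates the coordinate system around the origin (0, 0) unless a different origin is given, so with a positive (counterclockwise) angle of 90 degrees the rectangle swings to negative x coordinates, off the page. The arithmetic can be sketched in plain Ruby; rotate_about_origin is a hypothetical helper for illustration, not part of HexaPDF:

```ruby
# Rotate a point counterclockwise around (0, 0) by the given angle in degrees.
# Hypothetical helper, only used to illustrate the arithmetic.
def rotate_about_origin(x, y, degrees)
  rad = degrees * Math::PI / 180
  [(x * Math.cos(rad) - y * Math.sin(rad)).round(6),
   (x * Math.sin(rad) + y * Math.cos(rad)).round(6)]
end

# Corners of the rectangle [300, 400, 100, 50] (x, y, width, height).
corners = [[300, 400], [400, 400], [400, 450], [300, 450]].map do |x, y|
  rotate_about_origin(x, y, 90)
end

p corners.first                      # => [-400.0, 300.0]
p corners.all? { |x, _y| x < 0 }     # => true, every corner is left of the page
```

Passing the rectangle's own corner or centre as the rotation origin would keep it on the page, but it would not change how the viewer applies /Rotate afterwards.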

gettalong commented 11 months ago

The /Rotate key of a page is - more or less - only for viewing a PDF. When a PDF viewer sees that key, it draws the content as it normally would but rotates the result. So the content of the page is always drawn inside the crop box of the page as defined by the /CropBox key (or the /MediaBox key if the former doesn't exist), without the rotation, and only afterwards is the result rotated.

You can think of the /Rotate key as an easy way for a PDF processor to rotate a page without touching the content of the page itself.
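
To make the "draw first, rotate afterwards" model concrete, here is a minimal plain-Ruby sketch of the manual coordinate math it implies. It is an illustration under stated assumptions, not HexaPDF code: viewer_to_page and viewer_rect_to_page are hypothetical helpers, the page box is assumed to start at (0, 0), /Rotate is assumed to be a clockwise multiple of 90, and the 612 x 792 page size is an assumed example value.

```ruby
# Map a point given in viewer space (origin at the bottom left of the page as
# displayed) to the unrotated page space in which content is actually drawn.
# page_width/page_height are the dimensions of the unrotated page box.
def viewer_to_page(vx, vy, page_width, page_height, rotate)
  case rotate % 360
  when 0   then [vx, vy]
  when 90  then [page_width - vy, vx]
  when 180 then [page_width - vx, page_height - vy]
  when 270 then [vy, page_height - vx]
  else raise ArgumentError, "rotation must be a multiple of 90, got #{rotate}"
  end
end

# Map a viewer-space rectangle [x, y, width, height] to page space by mapping
# two opposite corners and normalizing the result.
def viewer_rect_to_page(rect, page_width, page_height, rotate)
  vx, vy, vw, vh = rect
  x1, y1 = viewer_to_page(vx, vy, page_width, page_height, rotate)
  x2, y2 = viewer_to_page(vx + vw, vy + vh, page_width, page_height, rotate)
  [[x1, x2].min, [y1, y2].min, (x2 - x1).abs, (y2 - y1).abs]
end

# A box the user sees at (300, 400), 100 wide and 50 high, on a page with /Rotate 90.
p viewer_rect_to_page([300, 400, 100, 50], 612, 792, 90)  # => [162, 300, 50, 100]
```

This only repositions shapes. Text drawn at the mapped position still appears rotated in the viewer.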

What you could do is incorporate the /Rotate key information into the content stream and then draw the boxes. This can be done with the Page#rotate method, i.e.

doc = HexaPDF::Document.open(file)
page = doc.pages[doc.pages.count - 1]
page.rotate(0, flatten: true)
canvas = ...

Since you don't want to add an additional rotation, you would choose 0 for the rotation angle, and with flatten: true you specify that the /Rotate key should be removed and its information incorporated into the page content.

Does this help?

brettwgreen commented 11 months ago

Just understanding it helps at the very least, so thank you.

Sounds like you're saying to rotate the page "for real" and remove the viewer-only rotation key by using 'flatten'. I can try that tomorrow as that will probably work for my needs and simplify things... does that actually then change the page dimensions and origin to essentially match the viewer perspective?

gettalong commented 11 months ago

Yes, that will adjust all page boxes (media, crop, art, ...), the page content stream as well as all annotations so that they all take the rotation into account.

brettwgreen commented 11 months ago

That worked perfectly for what I'm trying to do. Now just help me understand what page.rotate(0) means? Instinctually, it reads like it won't rotate at all... i.e. 'rotate zero degrees'?!? When in actuality it seems to rotate counterclockwise 90 degrees. That's a bit confusing.

Trying to generalize the code... should it always be page.rotate(0, flatten: false) regardless of Page[:Rotate] key? Or should it be like page.rotate(page[:Rotate]-90, flatten: true) if page[:Rotate].present?

Update: I think I understand... you're not actually rotating the page any further, just 'locking in' the viewer-only page rotation as an actual physical rotation and remapping all the coordinates to that space using flatten: true.

If that is correct, I would always use page.rotate(0, flatten: true) if a page has the :Rotate key

gettalong commented 11 months ago

Tl;dr: Always use page.rotate(0, flatten: true) for your purpose.

The Page#rotate method is used for rotating the page. However, the optional argument flatten allows one to also incorporate the rotation from the page's metadata into the page itself. So it actually serves two purposes.

When you call page.rotate(0) it won't do anything since you are telling HexaPDF that you want to rotate the page by 0 degrees. Please remember that this does take the /Rotate key into account. So if the page is already rotated by e.g. 90 degrees, page.rotate(0) means that you don't want to additionally rotate the page.

If you specify flatten: true, it means that the rotation, after applying the given angle, should be incorporated into the page itself. By leaving the angle as 0, you say that you don't want to apply an additional rotation and you just want the incorporation of the existing rotation into the page.

For example:
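
As a rough illustration of those two purposes, here is a toy model in plain Ruby. It is not HexaPDF's implementation and not the example originally posted here: model_rotate is a hypothetical function, and it assumes that /Rotate is measured clockwise while the angle argument of Page#rotate is counterclockwise.

```ruby
# Toy model of Page#rotate: returns what would end up in the /Rotate key and
# what would be incorporated into the page content (both in clockwise degrees).
def model_rotate(rotate_key, angle, flatten: false)
  raise ArgumentError, "angle must be a multiple of 90" unless (angle % 90).zero?

  clockwise = (rotate_key - angle) % 360
  if flatten
    { rotate_key: 0, in_content: clockwise }   # key removed, rotation made "real"
  else
    { rotate_key: clockwise, in_content: 0 }   # only the viewer hint changes
  end
end

p model_rotate(90, 0)                  # => {rotate_key: 90, in_content: 0}, nothing changes
p model_rotate(90, 0, flatten: true)   # => {rotate_key: 0, in_content: 90}, existing rotation locked in
p model_rotate(90, 90)                 # => {rotate_key: 0, in_content: 0}, rotation cancelled out
```

Older Ruby versions print the hashes with `:rotate_key=>` syntax instead; the values are the same.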

gettalong commented 11 months ago

ps. I have updated the API documentation for Page#rotate to contain some of the information from our conversation here.