Closed felle9900 closed 1 year ago
you should have a look at https://github.com/py-pdf/PyPDF2/issues/558#issuecomment-1138731441
Well, #558 just mentions the code I've already listed. It doesn't work in a loop.
Maybe this can explain it a bit better:
# x1, y1, x2, y2 of impositions in millimeters. The real list has 16 sets of coords.
all_coords = [[47.5, 42.5, 132.5, 97.5], [137.5, 42.5, 222.5, 97.5], [227.5, 42.5, 312.5, 97.5]]
# SRA3 sheet (450 x 320 millimeter)
reader_base = PdfReader("test_files/Blank_sheet_450x320.pdf")
page_base = reader_base.pages[0]
# businesscard to be placed many times on the big sheet.
reader = PdfReader("businesscard.pdf")
for coord in all_coords:
    page_box = reader.pages[0]
    x1 = int(points(coord[0]))  # temp set as int to not upset adobe acrobat
    y1 = int(points(coord[1]))
    x2 = int(points(coord[2]))
    y2 = int(points(coord[3]))
    page_box.add_transformation(Transformation().rotate(0).translate(tx=x1, ty=y1))
    page_base.merge_page(page_box)
writer = PdfWriter()
writer.add_page(page_base)
with open("Merge_test.pdf", "wb") as fp:
    writer.write(fp)
Ok I solved my problem.
My solution was to keep changing the translate(tx, ty) in each loop.
# First imposition
if i == 0:
    column = points(coord[0]) - media_trim_diff
    row = points(coord[1]) - media_trim_diff
# First imposition in a NEW row
elif i % COLUMNS == 0:
    column = -3 * (TRIM_WIDTH + GAP)
    row = TRIM_HEIGHT + GAP
# all the rest
else:
    column = TRIM_WIDTH + GAP
    row = points(0)
# page_box.add_transformation(Transformation().rotate(0).translate(tx=column, ty=row))
After that I move the trimbox because that's the only thing that does not get moved with the translate() automatically.
For some reason it breaks if i want to use different page numbers to impose.
just note that the add_transformation will modify page_box so each transformation needs to be relative to previous one
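As a rough illustration of the note above: because add_transformation modifies the page in place, a second call moves the content from wherever the first call left it. Below is a minimal sketch, in plain Python and without pypdf, of turning absolute target positions into the per-step offsets that would be passed to translate(tx, ty). The helper name relative_offsets is hypothetical and not part of any library.

```python
def relative_offsets(absolute_positions):
    """Convert absolute (x, y) targets into offsets relative to the previous target.

    Assumes the same page object is transformed cumulatively, starting at (0, 0).
    """
    offsets = []
    prev_x, prev_y = 0, 0
    for x, y in absolute_positions:
        # Each step only needs the difference to the position reached so far.
        offsets.append((x - prev_x, y - prev_y))
        prev_x, prev_y = x, y
    return offsets


# Three targets: two on the first row, then the start of the next row.
print(relative_offsets([(10, 20), (30, 20), (10, 50)]))
```

Note the negative tx on the third step: moving back to the first column means undoing the horizontal travel accumulated along the row, which matches the "-3 * (TRIM_WIDTH + GAP)" term in the workaround above.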
For some reason it breaks if i want to use different page numbers to impose.
can you please clarify
Yes, so I'm placing the same PDF (a business card) on a bigger PDF. Everything is fine as long as it's the same page of the business card I'm placing. The placement (translate) behaves as expected, and the trimbox also behaves correctly.
But if I mix the pages (not the same page of the pdf), the pages are then moved way off like there's something not resetting right.
from PyPDF2 import PdfReader, PdfWriter, Transformation
def mm(my_input):
    output = round(my_input / 72 * 25.4, 1)
    return int(output)
def points(my_input):
    output = my_input * 2.83464567
    return output
GAP = points(5)
COLUMNS = 4
ROWS = 4
TRIM_WIDTH = points(85)
TRIM_HEIGHT = points(55)
# x1, y1, x2, y2, scale, page_nr (index_nr)
all_coords = [
    [47.5, 42.5, 132.5, 97.5, 1, 0],
    [137.5, 42.5, 222.5, 97.5, 1, 0],
    [227.5, 42.5, 312.5, 97.5, 1, 0],
    [317.5, 42.5, 402.5, 97.5, 1, 0],
    [47.5, 102.5, 132.5, 157.5, 1, 0],
    [137.5, 102.5, 222.5, 157.5, 1, 0],
    [227.5, 102.5, 312.5, 157.5, 1, 0],
    [317.5, 102.5, 402.5, 157.5, 1, 0],
    [47.5, 162.5, 132.5, 217.5, 1, 0],
    [137.5, 162.5, 222.5, 217.5, 1, 0],
    [227.5, 162.5, 312.5, 217.5, 1, 0],
    [317.5, 162.5, 402.5, 217.5, 1, 0],
    [47.5, 222.5, 132.5, 277.5, 1, 0],
    [137.5, 222.5, 222.5, 277.5, 1, 0],
    [227.5, 222.5, 312.5, 277.5, 1, 0],
    [317.5, 222.5, 402.5, 277.5, 0, 0]
]
# big sheet
reader_base = PdfReader("test_files/Blank_sheet_450x320.pdf")
page_base = reader_base.pages[0]
# pdf to impose on the big sheet
reader = PdfReader("test_files/Mobildisko-visitkort.pdf")
# difference between the imposed mediabox and trimbox
media_trim_diff = float((reader.pages[0].mediabox.right - reader.pages[0].trimbox.right))
# trimbox needs to be expanded 2.5 mm on all 4 sides after being moved, so we can see the crop marks for cutting
trimbox_expanding = int(points(2.5))
for i, coord in enumerate(all_coords):
    page_box = reader.pages[0]
    x1 = points(coord[0])
    y1 = points(coord[1])
    x2 = points(coord[2])
    y2 = points(coord[3])
    # First imposition
    if i == 0:
        column = points(coord[0]) - media_trim_diff
        row = points(coord[1]) - media_trim_diff
    # First imposition in a NEW row
    elif i % COLUMNS == 0:
        column = -3 * (TRIM_WIDTH + GAP)
        row = TRIM_HEIGHT + GAP
    # all the rest
    else:
        column = TRIM_WIDTH + GAP
        row = points(0)
    # move the mediabox and most of the content it is placed correctly, but the viewbox needs to be moved (trimbox)
    page_box.add_transformation(Transformation().rotate(0).translate(tx=column, ty=row))
    # move the trimbox before the expanding
    if GAP == points(0):
        # This is currently not used/working atm
        print("GAP is 0")
        page_box.trimbox.left = x1  # - (media_trim_diff / 2)
        page_box.trimbox.bottom = y1  # - (media_trim_diff / 2)
        page_box.trimbox.right = x2  # - (media_trim_diff / 2)
        page_box.trimbox.top = y2  # - (media_trim_diff / 2)
    if GAP == points(5):
        # this is working
        # moving the trimbox
        print("GAP is 5 millimeter")
        page_box.trimbox.left = x1
        page_box.trimbox.bottom = y1
        page_box.trimbox.right = x2
        page_box.trimbox.top = y2
    # expanding the trimbox
    page_box.trimbox.left = float(page_box.trimbox.left - trimbox_expanding)
    page_box.trimbox.bottom = float(page_box.trimbox.bottom - trimbox_expanding)
    page_box.trimbox.right = float(page_box.trimbox.right + trimbox_expanding)
    page_box.trimbox.top = float(page_box.trimbox.top + trimbox_expanding)
    page_base.merge_page(page_box)
# Write the result back
writer = PdfWriter()
writer.add_page(page_base)
with open("Merged_translated_rotated.pdf", "wb") as fp:
    writer.write(fp)
Can you please provide your failing files, including the blank page?
Here are the PDF files I use: Blank_sheet_450x320.pdf Mobildisko-visitkort.pdf
The code I posted above should work and create a 4-column, 4-row PDF.
If you change the following code:
page_box = reader.pages[0]
to this code:
if i % 2 == 0:  # every 2nd loop
    page_box = reader.pages[0]
else:
    page_box = reader.pages[1]
It should now be messed up. But when you look in outline mode in Adobe Illustrator, you can see it places the correct PDF pages, but the placement (mediabox) is wrong.
@MartinThoma / @MasterOdin,
Looking at this use case, reintroducing mergeTransformedPage (renamed to merge_transformed_page) sounds like the best option. Your opinion?
Huh, interesting. I don't understand yet why the issue occurs. It sounds like a bug and thus it would be preferable to fix it. But re-introducing the old (working) functions as an intermediate solution would be OK to me.
We would need to document that issue for the new functions though
Any news on this problem ?
lost in the fifo... will come back on it this week-end
Still no update?
Could we please reintroduce the "mergeRotatedTranslatedPage" method and make it take normal coords and not tx, ty?
The current functionality breaks when I try to rotate or try to mix page numbers. Please, I'm stuck with the current methods. It used to work so well before.
It's hard for me to understand the issue as the information is scattered in this thread.
Could you maybe adjust the first comment in this ticket to contain all the information?
A great bug ticket follows this pattern:
1. What I did (as short as possible, but complete - including the full code necessary to reproduce, the PDF used as input, and the versions of all libraries being used)
2. What I wanted to achieve
3. What happened instead
4. For this issue: The latest version of PyPDF2 that worked as you expected with the same code as mentioned in (1)
5. Really awesome would be a test that fails for the new (broken) code and works with the old code
I'm open to a PR re-introducing mergeTransformedPage with the old way it worked for as long as this issue exists. But I need a way to check if it (still) exists so that we can deprecate it at some point.
@MartinThoma the PR is in progress and should come soon
@felle9900, you should be able to test the PR. Here is a code example:
import pypdf
r1=pypdf.PdfReader("resources/labeled-edges-center-image.pdf")
w = pypdf.PdfWriter()
r2=pypdf.PdfReader("resources/box.pdf")
w.append(r1) # to add the page
w.pages[0].merge_transformed_page(r2.pages[0],pypdf.Transformation().scale(2).rotate(45).translate(100,100),False,False)
w.pages[0].merge_transformed_page(r2.pages[0],pypdf.Transformation().scale(2).rotate(45).translate(200,200),False,False)
w.write("output.pdf")
still some clean-up (mypy) and testing to be done
I've just upgraded pypdf to version 3.3.0 to test that code.
It tells me: AttributeError: 'PageObject' object has no attribute 'merge_transformed_page'. Did you mean: 'mergeTransformedPage'?
Did I miss anything ? ( I used my own pdf files)
you have to copy the modified files from the PR
Hmm, I can't see an easy way to download the 9 files. I'm not going to go through it manually, so I think I'll just wait for it to be implemented. Thanks a lot for the work.
@felle9900 It is implemented in #1567 . We just need somebody to check if it worked as expected.
You can do it like this:
# Go into a clean directory
mkdir issue-1426
cd issue-1426
#... add your script in the directory
# Create and load a virtual environment:
python -m venv venv
source venv/bin/activate
# get the modified code
git clone https://github.com/pubpub-zz/PyPDF2.git
cd PyPDF2
git checkout -b pubpub-zz-merge_trsf_page main
git pull git@github.com:pubpub-zz/PyPDF2.git merge_trsf_page
# Install the modified version
pip install -e .
# Execute your script
I tried to follow along but the line:
git pull git@github.com:pubpub-zz/PyPDF2.git merge_trsf_page
made an error in my terminal ending with:
git@github.com: Permission denied (publickey). fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists.
Uh, right, you need the https URL instead of the git one
I did the venv to clone PyPDF2 into the directory (should it not be pypdf, btw?).
Now the code doesn't recognize "pypdf", so I changed it to "PyPDF2", but I then get an error that PyPDF2 does not have a method called "merge_transformed_page"
Ok, now I got it working. Why is the placed PDF pulled in cropped to the trimbox? Shouldn't it import the whole mediabox size?
Translate 0,0 does not seem to be respected: it places the PDF a bit further in. Maybe it uses the mediabox coords.
scaling, rotation and using different pages all works
Just opened in illustrator to watch the merged pdf in outline mode: the first pdf i place as translate(0,0) gets placed via the mediabox at 0,0, but because the pdf is cropped visually via the trim box, it looks like its not placed at 0,0.
translate(0,0) means no change, but be careful about the trimbox, as you've noticed. It can also be due to an offset in the origin. Can you provide an example? PS: I've fixed a few points. You should check out the latest commit
Ok, I just redid the steps and got the new one. It looks the same. I've got two screenshots for you. One is the merged PDF file opened in Illustrator. The other is in outline mode, where you can see that the full original PDF has been placed but is cropped. It is placed at x=0, y=0 on the PDF according to the mediabox (the biggest box).
It would be nice if the placed PDF used the whole mediabox or could take an extra arg for cropping, e.g. 7.41 points more than the trimbox. cropping=0 would behave as it does now, using the trimbox.
There is a problem when I'm placing several impositions (business card PDF) on my big sheet PDF. I'm looping over 20 coords I have in a list and calculating the tx and ty for each placement. The tx and ty are correct, but something is not updating correctly: only the first placement is correct, the rest are placed way off to the left of the sheet PDF, and the following placements on that row are placed on top of each other. It looks very much like the same error as before.
I dislike the idea to add extra parameters: The best for me is to adjust/modify the boxes in the source page before inserting.
I've successfully got this result (requires latest fix): test card: visitcard.pdf
the code
import pypdf
r = pypdf.PdfReader("visitcard.pdf")
w = pypdf.PdfWriter()
w.add_blank_page(pypdf.PaperSize.A6.width, pypdf.PaperSize.A6.height)
for x in range(4):
    for y in range(7):
        w.pages[0].merge_translated_page(
            r.pages[0],
            x * r.pages[0].trimbox[2],
            y * r.pages[0].trimbox[3],
            True,
            True,
        )
w.write("tt.pdf")
the output tt.pdf
Ok this is pretty cool, Could I use 450 x 320 mm instead of "A6" somehow ?
I managed to crop my pdf the way i like (trimbox+5mm) = It displays correctly
I also managed to test that it will place different page numbers from the "visitcard" - tested by using randint(0,1)
But there is a big bug I can't get past: The merge_translated_page() uses the mediabox for the translate, even if I change that before translating.
You can see this if you swap out your "visitcard.pdf" for my card: Mobildisko-visitkort.pdf tt.pdf
At the current state you can't translate less than the mediabox. Maybe that is hardcoded somewhere?
Ok this is pretty cool, Could I use 450 x 320 mm instead of "A6" somehow ?
Not really, as it all depends on the user_unit of the document. It's typically 1/72 inch, which is about 0.352778 mm.
That means the dimensions you need would be (in default user units): 450/0.352778 ~= 1276 and 320/0.352778 ~= 907
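The arithmetic above can be sketched as a small helper. This is a minimal example assuming the default user unit of 1/72 inch; the function name mm_to_points is hypothetical. The points() helper posted earlier in the thread does the same thing with the precomputed factor 2.83464567, which is 72/25.4.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72  # default PDF user unit: 1/72 inch


def mm_to_points(mm):
    """Convert millimeters to default PDF user units (points)."""
    return mm * POINTS_PER_INCH / MM_PER_INCH


# Sheet size from the thread: 450 x 320 mm
width_pt = mm_to_points(450)
height_pt = mm_to_points(320)
print(round(width_pt), round(height_pt))
```

The resulting values could then be passed as the width and height of a blank page in place of the A6 dimensions, as long as the document really uses the default user unit.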
Ok cool I got it, thanks.
What about the translate bug ?
Take a look at this code, there's some weird stuff going on. Only the first imposition is cropped correctly (bottom left). The rest are placed correctly but the cropping is off.
from pypdf import PdfReader, PdfWriter, Transformation, PaperSize
def mm(my_input):
    output = round(my_input / 72 * 25.4, 1)
    return int(output)
def points(my_input):
    output = my_input * 2.83464567
    return output
GAP = points(5)
COLUMNS = 4
ROWS = 5
TRIM_WIDTH = points(85)
TRIM_HEIGHT = points(55)
all_page_numbers = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
sheet = PdfReader("test_files/Blank_sheet_450x320.pdf")
imposition = PdfReader("test_files/Mobildisko-visitkort.pdf")
# create write object (sheet)
write_object = PdfWriter()
#write_object.append(sheet)
write_object.add_blank_page(PaperSize.A6.width, PaperSize.A6.height)
#write_object.add_blank_page(points(650), points(320))
# difference between the imposition mediabox and trimbox
media_trim_diff = float((imposition.pages[0].mediabox.right - imposition.pages[0].trimbox.right))
# trimbox needs to be expanded 2.5 mm on all 4 sides after being moved, so we can see the crop marks for cutting
trimbox_expanding = int(points(2.5))
imposition_index = 0
for x in range(COLUMNS):
    for y in range(ROWS):
        page_nr = all_page_numbers[imposition_index]
        print("imposition_index:", imposition_index, "page_nr", page_nr)
        # expanding the trimbox
        imp_page = imposition.pages[page_nr]
        imp_page.trimbox.left = float(imp_page.trimbox.left - trimbox_expanding)
        imp_page.trimbox.bottom = float(imp_page.trimbox.bottom - trimbox_expanding)
        imp_page.trimbox.right = float(imp_page.trimbox.right + trimbox_expanding)
        imp_page.trimbox.top = float(imp_page.trimbox.top + trimbox_expanding)
        write_object.pages[0].merge_translated_page(
            imp_page,
            x * TRIM_WIDTH + trimbox_expanding,  # x * imposition.pages[0].trimbox[2]
            y * TRIM_HEIGHT + trimbox_expanding,  # y * imposition.pages[0].trimbox[3]
            True,
            True,
        )
        imposition_index += 1
write_object.write("tt.pdf")
Ok this is pretty cool, Could I use 450 x 320 mm instead of "A6" somehow ?
Not really, as it all depends on the user_unit of the document. It's typically 1/72 inch, which is about 0.352778 mm. That means the dimensions you need would be (in default user units): 450/0.352778 ~= 1276 and 320/0.352778 ~= 907
In the test code I've produced, I've set expand to True, so the boxes are expanded. I used A6 to start with, but the final size is much bigger.
Take look on this code, there's some weird stuff going on. Only the first imposition is cropped correctly (bottom left) Rest is placed correctly but the cropping is off.
I'm confused about your code: why are you changing the trimbox on every cycle? You should modify it once, and then the box is applied.
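To illustrate the point about modifying the box only once: if the same page object is picked on several cycles, expanding its trimbox inside the loop grows the box again every time. Here is a minimal sketch in plain Python, with boxes modeled as (left, bottom, right, top) tuples rather than pypdf objects; the names expand_box and expand_each_page_once are hypothetical.

```python
def expand_box(box, margin):
    """Return a (left, bottom, right, top) box grown by `margin` on every side."""
    left, bottom, right, top = box
    return (left - margin, bottom - margin, right + margin, top + margin)


def expand_each_page_once(trimboxes, page_numbers, margin):
    """Expand the trim box of every page that gets placed, exactly once.

    trimboxes: dict mapping page number -> (left, bottom, right, top)
    page_numbers: page number chosen for each placement (may repeat)
    """
    expanded = dict(trimboxes)
    # set() removes repeats, so a page used on many cycles is expanded once.
    for page_nr in set(page_numbers):
        expanded[page_nr] = expand_box(trimboxes[page_nr], margin)
    return expanded


boxes = {0: (10, 10, 100, 60), 1: (10, 10, 100, 60)}
print(expand_each_page_once(boxes, [1, 0, 0, 1, 0], margin=5))
```

With the boxes prepared up front, the placement loop only has to pick a page and merge it.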
However, reviewing the code, I agree that there is something odd (even in the old code): the cropping is done based on the trimbox instead of the cropbox (which defines the clipping for display and printing). @MartinThoma, before committing the change, can you give me your opinion about it?
I'm sorry, I don't understand the question @pubpub-zz . What do you want to know?
At the current state you can't translate less than the mediabox. Maybe that is hardcoded somewhere?
@felle9900 The transformations are not applied to the boxes (mediabox / trimbox / cropbox). That means if you translate the content out of the mediabox, you will no longer see the content.
This behavior is often confusing for people, but I'm uncertain about the best way to improve it. Maybe adding a parameter transform_boxes: bool=False to add_transformation? But what would you expect if a translation is happening?
A method fit_boxes_to_content() might be desirable.
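A rough way to picture why translated content can disappear: the content moves, the boxes stay where they are. The following is a minimal sketch in plain Python with rectangles as (left, bottom, right, top) tuples; translate_rect and is_visible are hypothetical helpers, not pypdf functions.

```python
def translate_rect(rect, tx, ty):
    """Move a (left, bottom, right, top) rectangle by (tx, ty)."""
    left, bottom, right, top = rect
    return (left + tx, bottom + ty, right + tx, top + ty)


def is_visible(content, mediabox):
    """True if the content rectangle overlaps the (unchanged) mediabox at all."""
    c_left, c_bottom, c_right, c_top = content
    m_left, m_bottom, m_right, m_top = mediabox
    return (c_left < m_right and c_right > m_left
            and c_bottom < m_top and c_top > m_bottom)


mediabox = (0, 0, 100, 100)
content = (10, 10, 50, 50)

# A small shift keeps the content inside the page area.
print(is_visible(translate_rect(content, 20, 20), mediabox))
# A large shift moves the content out; the mediabox did not follow it.
print(is_visible(translate_rect(content, 200, 0), mediabox))
```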
I'm sorry, I don't understand the question @pubpub-zz . What do you want to know?
Currently merge_transformed_page crops the content to the trimbox, whereas the PDF reference states that the cropping should be done based on the cropbox. For me, merge_transformed_page is buggy. Do you confirm my analysis?
A method fit_boxes_to_content() might be desirable.
this may be very tough to implement...😕
The reason I'm adjusting the trimbox on each cycle is that it can only be done on a specific page of the PDF. That page can be any page number on every cycle.
But I was thinking of doing a separate loop that crops the business card pages so all of the boxes (mediabox/trimbox/cropbox) are removed. Then the script might work, because it just has to place those pages right next to each other. I'm going to test that out.
I did a test using a cropped file (cropbox = (trimbox + 5 mm)). The file was cropped in Acrobat.
Works like a charm, see the PDF. I remember I tried this a long time ago but ran into a problem because the old PyPDF2 would not respect the cropping of the PDF file it had made itself (weird).
I'm going to test that part now. tt.pdf
I'm trying to update an older PyPDF2 program I've made that can step and repeat several PDF files on a bigger PDF page.
But the current method seems to work in a rather primitive way:
It keeps translating 50 pt between every PDF I'm placing. That's not what I'm looking for :)
Is there any way of doing it the old way, with dedicated x1, y1, x2, y2 values instead? I used to use mergeRotatedTranslatedPage()
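The x1, y1, x2, y2 lists used in this thread follow a regular grid, so they can be generated rather than typed by hand. This is a minimal sketch in plain Python, assuming row-major order starting at the bottom-left placement. grid_coords and its parameters are hypothetical names, and the default numbers are taken from the 4 x 4 list posted in the thread (all in millimeters).

```python
def grid_coords(origin_x=47.5, origin_y=42.5, width=85, height=55,
                gap=5, columns=4, rows=4):
    """Return (x1, y1, x2, y2) for each cell of a step-and-repeat grid.

    Cells are listed row by row, left to right, starting at the bottom-left.
    """
    coords = []
    for row in range(rows):
        for col in range(columns):
            x1 = origin_x + col * (width + gap)
            y1 = origin_y + row * (height + gap)
            coords.append((x1, y1, x1 + width, y1 + height))
    return coords


for cell in grid_coords()[:4]:
    print(cell)
```

These are absolute positions; converting them to points and to per-placement translations is a separate step.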