pmaupin / pdfrw

pdfrw is a pure Python library that reads and writes PDFs

AddPage in with pdfrw #122

Open mnimkar opened 6 years ago

mnimkar commented 6 years ago

I want to write a new page to a PDF file. How can I create a new PdfDict with '/Type': '/Page'? I have written code that gives me a list of xObjects, and I want to create a single PDF file that contains only the xObjects in that list.

pmaupin commented 6 years ago

 from pdfrw import PdfDict, PdfName

 x = PdfDict()
 x.Type = PdfName.Page

mnimkar commented 6 years ago

Thanks for the reply. I have another question. I am creating a page to store an xObject. After writing x.Type = PdfName.Page, would the next line be x.Content = xObject_Instance? And if I want to create a page with text, would I write the following code? Please correct me if I am wrong:

 x = PdfDict()
 x.Type = PdfName.Page
 x.Contents = 'String or Text written in pdf file'
 pageList.append(x)
 writter.addpages(pageList)
 writter.write()

Please let me know which keys are required to create a new page. Thanks in advance.

pmaupin commented 6 years ago

a) It wouldn't be upper-case x.Contents -- that would create a /Contents key rather than a content stream.

b) It would be x.stream = 'your stream here'.

c) 'your stream here' cannot simply consist of text that goes on the page. That's not how PDFs work.

d) pdfrw is designed to manipulate PDFs, not create them from scratch. You might be better off starting with rst2pdf or weasyprint.

That's as much as I can help you at the moment -- I'm very busy, and digesting the Adobe PDF spec for you is not a task I am prepared to undertake.
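
To illustrate point (c): a page's content stream is a short program of PDF operators, not the plain text itself. The following is a minimal sketch of what such a stream looks like for a single line of text. The helper names are hypothetical, /F1 is an assumed font name that the page's resources would have to define, and the escaping only covers simple ASCII text.

```python
def escape_pdf_text(text):
    """Escape the characters that are special inside a PDF literal string."""
    return (text.replace("\\", "\\\\")
                .replace("(", "\\(")
                .replace(")", "\\)"))


def text_stream(text, x=72, y=720, font="/F1", size=12):
    """Build a content stream that draws one line of text at (x, y).

    font is an assumed resource name; the page must map it to a real font.
    """
    return "\n".join([
        "BT",                               # begin text object
        f"{font} {size} Tf",                # select font and size
        f"{x} {y} Td",                      # move to the start position
        f"({escape_pdf_text(text)}) Tj",    # show the string
        "ET",                               # end text object
    ])


stream = text_stream("Hello (world)")
print(stream)
```

A string like this is the kind of value that would go in the stream mentioned in (b). The page would still need a font resource for /F1 and a MediaBox before a viewer could render it, which is part of why generating pages from scratch is better left to a tool built for it.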

Michael-Pascale commented 5 years ago

Hello, I am having a similar issue. I want to add a couple of blank pages at the end of the PDF for better formatting. I can make a new PDF with the blank pages with no visible issues, but when I try to run it through something like the 4up example, it fails. Below is the traceback Python provides:

 Traceback (most recent call last):
   File "pdf_condense.py", line 71, in <module>
     writer.addpage(get4(pages[index:index + 4]))
   File "pdf_condense.py", line 32, in get4
     srcpages = PageMerge() + srcpages#add all of the pages onto one pagemerge and
   File "C:\Users\thebi\Documents\Backed Up\src\python\PDF_Scripts\pdf_condense\pdfrw\pagemerge.py", line 164, in __add__
     self.add(other)
   File "C:\Users\thebi\Documents\Backed Up\src\python\PDF_Scripts\pdf_condense\pdfrw\pagemerge.py", line 171, in add
     obj = RectXObj(obj)
   File "C:\Users\thebi\Documents\Backed Up\src\python\PDF_Scripts\pdf_condense\pdfrw\pagemerge.py", line 52, in __init__
     base = pagexobj(page, viewinfo)
   File "C:\Users\thebi\Documents\Backed Up\src\python\PDF_Scripts\pdf_condense\pdfrw\buildxobj.py", line 292, in pagexobj
     mbox, bbox = getrects(inheritable, viewinfo, rotation)
   File "C:\Users\thebi\Documents\Backed Up\src\python\PDF_Scripts\pdf_condense\pdfrw\buildxobj.py", line 141, in getrects
     mbox = tuple([float(x) for x in inheritable.MediaBox])
 TypeError: 'NoneType' object is not iterable

The code I used to make the new pages is what was provided here. Is there something obvious missing? Thanks!

Michael-Pascale commented 5 years ago

A quick update: I have added some code that fixes the problem, but I have doubts about its reliability. In pagexobj I added the following code just before the line that calls getrects():

 if not inheritable.MediaBox:
     inheritable.MediaBox = ['0', '0', '612', '792']

I am well aware this is not a GOOD fix for the long run, but for now at least it works. If there is a better way, please let me know!
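
One alternative that avoids editing pdfrw itself would be to set the MediaBox on each newly created page before passing it to PageMerge, since the traceback shows getrects failing on a missing inheritable.MediaBox. This is only a sketch: ensure_mediabox is a hypothetical helper, not part of pdfrw, and a SimpleNamespace stands in for the page so the example runs without pdfrw installed. It assumes a pdfrw page dict exposes /MediaBox as the attribute MediaBox, as the workaround above does.

```python
from types import SimpleNamespace

# US Letter in PDF points, the same values used in the workaround above.
LETTER = ['0', '0', '612', '792']


def ensure_mediabox(page, default=LETTER):
    """Give the page a MediaBox if it has none, then return the page."""
    if not getattr(page, "MediaBox", None):
        page.MediaBox = list(default)  # copy so pages do not share one list
    return page


# Stand-in objects; with pdfrw these would be the page dicts themselves.
blank = SimpleNamespace(Type="/Page")
sized = SimpleNamespace(Type="/Page", MediaBox=['0', '0', '595', '842'])

ensure_mediabox(blank)   # gets the default size
ensure_mediabox(sized)   # already has a size, left unchanged
print(blank.MediaBox, sized.MediaBox)
```

Calling this on the blank pages when they are created keeps the fix in the script rather than in the library, so it survives a pdfrw upgrade.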