ralsina / pdfrw

Automatically exported from code.google.com/p/pdfrw

Problems using table of contents (rl) with pdfrw #2

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I'm not sure this is actually a bug in pdfrw makerl or not, but when I try
to use table of contents together with a template with a pdfrw object in
it, it fails with:

  File "/usr/lib64/python2.6/site-packages/reportlab/pdfbase/pdfdoc.py", line 852, in format
    raise KeyError, "forward reference to %s not resolved upon final formatting" % repr(self.name)
KeyError: "forward reference to 'FormXob.pdfrw_3' not resolved upon final formatting"

I have attached a small test application that draws background.pdf before
anything else and outputs output.pdf.

Original issue reported on code.google.com by asan...@gmail.com on 13 Jan 2010 at 10:26

GoogleCodeExporter commented 9 years ago
For some probably excellent reason which is completely unknown to me, reportlab
treats FormXObjects very specially.  It wants to keep a shortened name around
and then prepend 'FormXob.' to it during various operations.  Also, it doesn't
use references -- it uses the shortened name.  That appears to be what is
happening here.

I think there is support in pdfrw for this, but it requires special handling by
the caller.  An example of code that I think handles this properly is at

http://code.google.com/p/rst2pdf/source/browse/trunk/rst2pdf/extensions/vectorpdf/__init__.py

makerl() will return a reference to 'FormXob.pdfrw_3' (because that is the real
reference name), when what you really want is not a reference, and not even the
string 'FormXob.pdfrw_3', but just the string 'pdfrw_3', so that something else
can add the 'FormXob.' prefix and look up the reference.  To handle just this
case, makerl() will save the 'pdfrw_3' name in the original (not the reportlab
copy) object.
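
The lookup described above can be illustrated with a toy model.  This is only a
sketch: ToyDocument, ToyCanvas, register_form and do_form are hypothetical
stand-ins invented for illustration, not reportlab or pdfrw code.  It shows why
passing the full 'FormXob.pdfrw_3' name fails when the callee adds the prefix
itself.

```python
# Toy model only: ToyDocument and ToyCanvas are hypothetical stand-ins,
# not reportlab classes.
FORM_PREFIX = "FormXob."


class ToyDocument:
    def __init__(self):
        self.objects = {}  # internal (prefixed) name -> object

    def register_form(self, short_name, obj):
        # Objects are stored under the prefixed name.
        self.objects[FORM_PREFIX + short_name] = obj


class ToyCanvas:
    def __init__(self, doc):
        self.doc = doc

    def do_form(self, short_name):
        # The caller passes the short name; the prefix is added here.
        internal = FORM_PREFIX + short_name
        if internal not in self.doc.objects:
            raise KeyError("forward reference to %r not resolved" % internal)
        return self.doc.objects[internal]


doc = ToyDocument()
doc.register_form("pdfrw_3", "<page xobject>")
canvas = ToyCanvas(doc)

canvas.do_form("pdfrw_3")  # works: looks up 'FormXob.pdfrw_3'
try:
    canvas.do_form("FormXob.pdfrw_3")  # looks up 'FormXob.FormXob.pdfrw_3'
except KeyError as exc:
    print("lookup failed:", exc)
```

In this model the short name is the only thing the caller should ever hold on
to, which is the role the saved name plays in the real code.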

For your code, I think that in beforeDrawPage, you can do this:

    makerl(canvas, self.page_template)  # NOTE: DISCARD THE RESULTS OF THIS CALL!

    ...

    canvas.doForm(self.page_template.rl_xobj_name)  # Use shortened name

I hope this helps.

BTW, it looks like you are writing some nice code.  Do you think you will have
something useful to add to my examples directory at some point :) ?

Regards,
Pat

Original comment by pmaupin on 13 Jan 2010 at 4:28

GoogleCodeExporter commented 9 years ago
Hm, VectorPdf actually has the same issue when used from a template together
with a table of contents, and the change did not work for me; I still get the
same error.

I wonder if those notify calls may make it reference it in a different manner.
It seems to me that when run through multiple passes, the references are no
longer in _doc.idToObjectNumberAndVersion on the second and later runs.  This
is probably because platypus/doctemplate creates a new canvas and document
between each pass.

Maybe makerl does not check the current document's cache and only checks
whether the rl_obj itself has cached its reference.  I'm not sure about this,
but if it were made to recreate its references when it detects it is being used
on a different document, could that work?

Oh, about the code: thanks, sure.  Is there anything in particular you want?
Or should I just generalize the attached code a bit and make a "how to have a
pdf as a background template example" out of it?

Original comment by asan...@gmail.com on 14 Jan 2010 at 2:52

GoogleCodeExporter commented 9 years ago
Hmm, if your original code worked when not doing a TOC, then reportlab must
have made things better than in the version I was using when I first wrote the
vectorpdf stuff.  Sorry to send you down that rabbit hole.

I've never used the reportlab multi-pass stuff, and didn't think about the fact
that that's what is happening when a TOC is generated.  I think Roberto Alsina
might have some experience with that, and in any case, the rst2pdf project has
automated testing (which pdfrw does not yet have), and that's where I work when
I'm developing or testing most features.  Since you can reproduce the failure
with rst2pdf, the best thing would be for you to create a simple failing
testcase that works in the rst2pdf/tests/input directory, using one of the
preexisting PDFs in that directory tree as a background.  Then we can check it
in and it can easily be run with the test runner.

In terms of sample code, I don't have any immediate needs or hard and fast
rules.  I would just like lots of little useful examples, so the "how to have a
pdf as a background template example" would be awesome.  If you're interested,
I can give you commit rights, and you can check that and/or other examples in
to your heart's content.  (Or if you want to do something bigger than an
example, you could make a 'tools' directory and put your fancy new 'watermark'
tool there.)  Really, it's about making the whole project enticing enough that
it gets used enough that the bugs disappear before they bite *me* :-)

Thanks,
Pat

P.S.  To create and use the rst2pdf test environment:

a) check out rst2pdf from the trunk
b) cd to the top directory and type "python bootstrap.py"
c) then type "bin/buildout"

Now you can cd rst2pdf/tests, and, for example:

./autotest.py input/test_vectorpdf.

Original comment by pmaupin on 14 Jan 2010 at 4:07

GoogleCodeExporter commented 9 years ago
Since the problem can be recreated with rst2pdf, and since I don't yet have an
automated testsuite here, I checked a failing testcase in to rst2pdf:
http://code.google.com/p/rst2pdf/issues/detail?id=263

Original comment by pmaupin on 23 Jan 2010 at 5:34

GoogleCodeExporter commented 9 years ago
Fixed in subversion revision 84.

It turns out that when reportlab does the second pass, it uses brand new canvas
and document objects.  All the PDF objects are kept track of in the document
object, so the new document object doesn't know about any objects the old one
did.
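
The failure mode and one way to avoid it can be sketched with a toy model.  The
classes and functions below (ToyDocument, SourcePage, convert_naive,
convert_per_document) are hypothetical and are not the pdfrw or reportlab
implementation, and this is not necessarily how revision 84 does it; the sketch
only assumes that a converted object caches its name and that each pass gets a
fresh document.

```python
# Toy model only: these classes are hypothetical and are not the pdfrw or
# reportlab implementation.
class ToyDocument:
    """Stands in for the per-pass document that tracks registered objects."""

    def __init__(self):
        self.objects = {}


class SourcePage:
    """Stands in for a page read from an existing PDF."""

    def __init__(self):
        self.cached_name = None
        self.cached_doc = None


def convert_naive(doc, page):
    # Bug: trusts the cache on the page even when `doc` is a new document.
    if page.cached_name is None:
        page.cached_name = "pdfrw_%d" % (len(doc.objects) + 1)
        doc.objects[page.cached_name] = page
    return page.cached_name


def convert_per_document(doc, page):
    # The cache is treated as valid only for the document it was built for.
    if page.cached_doc is not doc:
        page.cached_name = "pdfrw_%d" % (len(doc.objects) + 1)
        doc.objects[page.cached_name] = page
        page.cached_doc = doc
    return page.cached_name


def registered_on_second_pass(convert):
    # Simulate two passes, each with a brand new document object.
    page = SourcePage()
    first_pass, second_pass = ToyDocument(), ToyDocument()
    convert(first_pass, page)
    name = convert(second_pass, page)
    return name in second_pass.objects


print(registered_on_second_pass(convert_naive))         # False
print(registered_on_second_pass(convert_per_document))  # True
```

With the naive cache, the second document never hears about the object, which
matches the "forward reference ... not resolved" symptom reported above.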

Thanks for reporting this, and sorry it took so long for me to get a chance to
debug it.  Please mark the issue verified if it now works for you.

Original comment by pmaupin on 23 Jan 2010 at 5:10

GoogleCodeExporter commented 9 years ago
Thank you for pdfrw! :)

r84 fixes it for me.

I have attached one basic example, which is a cleaned-up version of the test
code attached earlier.

Original comment by asan...@gmail.com on 13 Feb 2010 at 12:19

GoogleCodeExporter commented 9 years ago
Verified by original issue author

Original comment by pmaupin on 14 Feb 2010 at 4:16

GoogleCodeExporter commented 9 years ago
Added your demo to the examples page.  Thanks!

Original comment by pmaupin on 14 Feb 2010 at 4:28