jung-kurt / gofpdf

A PDF document generator with high level support for text, drawing and images
http://godoc.org/github.com/jung-kurt/gofpdf
MIT License
4.34k stars 787 forks source link

Decode Bug #218

Closed d1ngd0 closed 5 years ago

d1ngd0 commented 5 years ago

I realized last night that I introduced a bug with the addition of Encoding and Decoding templates. When a template is created it is given an ID by the GenerateTemplateID method. This worked in the past because a template could not exist outside the context of it's original gofpdf package. However now that templates can be saved to files, or templates can be generated across a cluster of machines maintaining the original ID for a template will quickly create Template ID conflicts where many Templates claim to have an ID of 1. I'll start working on a PR to fix this, but figured I should bring it to your attention.

Thanks!

jung-kurt commented 5 years ago

Good catch. Can a unique ID be assigned when the stored template is loaded?

d1ngd0 commented 5 years ago

That was my plan, I am going to keep the solution limited to the Encode and Decode functions. Essentially strip away the template id when encoding, and add a new one back in when decoding. Then we won't have clashing anymore.

d1ngd0 commented 5 years ago

https://github.com/d1ngd0/gofpdf/commit/a94a239047daf957e399870de0ba56ef69dcc702

@jung-kurt and @marcus-downing (sorry to call on you again) hopefully you can give me a little insight here because from the standpoint of the Template this should work. I added the example go program to illustrate how this doesn't work, though obviously the PR won't have that in there...

What I am seeing right now is that images in the parent template will show, however images any deeper do not. If I change the Encode and Decode methods to maintain the templates ID the images in children templates appear again. I am starting to worry that the Template ID is referenced somewhere within bytes. Any help would be appreciated.

d1ngd0 commented 5 years ago

Also, I was curious if the ID had to be a number? I changed it to a string locally and the pdf still rendered. I'm just curious if that is valid. If so would it make sense to have a default Template generator function, but allow the user to supply their own Template generator? That way people could use whatever id generation library they wanted without trying to build one internally or pulling in a dependency of some kind.

marcus-downing commented 5 years ago

A string or struct would provide a way of namespacing images to avoid any risk of a clash. They could then be flattened to something else on PDF output.

type ImageID struct {
    TemplateID int64,
    ImageID int64
}
marcus-downing commented 5 years ago

I'm pretty certain that template ID is not output anywhere in the resulting PDF.

marcus-downing commented 5 years ago

And immediately after posting that, I saw that I was wrong 😄

marcus-downing commented 5 years ago

However, GenerateTemplateID is global (and thread safe, since it's channel based), so even if you use templates inside templates, it'll never have a clash of template ID.

d1ngd0 commented 5 years ago

Right, however with the addition of Encoding and Decoding templates, a template can now live outside of the gofpdf package and be reloaded where GenerateTemplateID's state would be reset. This means if I create a template, and save it to a file using encode, and then restart my program and create another template that uses that template loaded from a file, both will have a template ID of 1.

I tried fixing this by making the Encode method not encode the ID of the template, and then the decode method uses GenerateTemplateID when loading it in... however the problem seems to be with the UseTemplate method, which references the ID of the template within a template... therefore when I decode the template, while the state of the template IDs do not clash, the actual stream of the parent template object will still reference the old id. Hopefully that all makes sense...

I only tested images, but I doubt, considering the inner template reference isn't right, that anything from the inner templates would show after encoding and decoding the parent.

So back to my initial question, is it valid PDF if the template id is a string? If so I could add a register function for GenerateTemplateID so that you could specify how you wanted the Id to be generated. This means the current definition could stay the same if the register endpoint was never hit, and the user could then specify how to generate the Id using whatever 3rd party libraries they wanted, meaning we don't have to pull in a dependency. Again, hopefully that makes sense.

d1ngd0 commented 5 years ago

Alright, I did some additional research on named objects and found that changing the id to a string won't stop the pdf from being valid. I have put some more work into a solution and would love to hear what you both think. I have added a prefix to the id, if you don't add a prefix the template id structure will remain the same as before.

https://github.com/d1ngd0/gofpdf/commit/8bc0bd27bc4456444485071164b8b42bd050f1bd

d1ngd0 commented 5 years ago

Actually, after thinking about this. Couldn't we just make the ID method return the sha1 sum of the Bytes method?

jung-kurt commented 5 years ago

Actually, after thinking about this. Couldn't we just make the ID method return the sha1 sum of the Bytes method?

Great idea!

d1ngd0 commented 5 years ago

Added a Pull request (#219). If you have any questions let me know.

jung-kurt commented 5 years ago

Committed: 580543bbbf6ba6a65b8aa9cdbf7d15069d651cb5

d1ngd0 commented 5 years ago

Thanks! It looks like fonts might be having the same issue. I am looking into a solution for this as well.

d1ngd0 commented 5 years ago

220

This should fix it.

jung-kurt commented 5 years ago

Thanks for all of the good work you have done, @d1ngd0. Committed: 55a774389a811b454a9ee3dfa78bd28fc5d0ab18