theCrag / website

theCrag.com: Add your voice and help guide the development of the world's largest collaborative rock climbing & bouldering platform
https://www.thecrag.com/

fop does not time out its URL resource requests #41

Closed scd closed 13 years ago

scd commented 13 years ago

When producing PDF Crag Guides, FOP does not time out its URL image requests.

This is fine if the website is working well (or is not there at all, in which case fop will just throw an error), but if one of the images gets stuck loading, PDF processing locks up and the whole PDF is never generated. You might say that should not happen with a properly designed website.

But in periods of high load, when the website starts thrashing, this means that PDF generation will block up. Maybe this is a good thing.

Subtle bugs are also possible in the interaction between Apache and our application code, causing connections to stay open. For example, ImageMagick can wreak havoc in Apache mod_perl if you don't explicitly delete any ImageMagick objects you create.

I guess we might be able to make it more robust if we pre-fetch all the URL images using something like curl and then generate the PDF once we have a local copy of each image.
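A minimal sketch of that pre-fetch idea, written in Python rather than curl purely for illustration. The function name `prefetch_images`, the file naming scheme, and the 10 second default are all assumptions, not anything in the current code. Note that the `timeout` passed to `urlopen` bounds each blocking socket operation, not the total download time.

```python
import os
import urllib.request
from urllib.parse import urlparse


def prefetch_images(urls, dest_dir, timeout=10.0):
    """Download each URL into dest_dir and return {url: local_path}.

    Raises on the first URL that fails or times out, so the caller can
    give up before the PDF build starts instead of hanging inside it.
    """
    local = {}
    for i, url in enumerate(urls):
        # Keep the original extension (if any) on the local copy.
        ext = os.path.splitext(urlparse(url).path)[1]
        path = os.path.join(dest_dir, f"img{i:04d}{ext}")
        # timeout applies to each blocking socket operation.
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            data = resp.read()
        with open(path, "wb") as out:
            out.write(data)
        local[url] = path
    return local
```

The returned mapping could then be used to swap each image URL in the document for its local copy before the PDF is generated.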

brendanheywood commented 13 years ago

Why is fop even attempting to get the images via the web? Why not just read them directly from the file system? Are we talking about topo images and photo images, or avatars?

scd commented 13 years ago

csstoxls produces the input for fop, so when you have an image reference in the xhtml it is passed straight through to fop.

At the moment there are no avatars, but I'm sure we will have them in the final version. The montage on the cover sheet has manipulated images which are created dynamically to fit in certain areas. I'm sure there is a better tool for doing montages. I looked into using ImageMagick for it, but it is way too complex for my small brain at this stage.

The stars and logo could come from the filesystem I guess.

The popularity images are produced dynamically, as I could not get a CSS-only version working.

So there are some static images and some dynamic images which don't exist in the file system. I guess the point is we can turn them all into static images on the filesystem before generating the pdf.

Up to now my concern has been to just make sure that the components could integrate to produce the type of guide book we want.

brendanheywood commented 13 years ago

Does the background stuff call the web page, which in turn goes through all the templates, or does it call the mason template directly without going via the network? The latter would be much better for performance, the former better for quick debug cycles. We should be able to put file:///blah/static/img.gif URLs into the xhtml and avoid the network for at least the static images.
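As a rough illustration of the file:// idea, the sketch below rewrites `src` attributes that point at a static URL prefix so that they point at the matching file on disk. It is in Python only for the sake of a short example; the function name, the URL prefix, and the static root directory are all hypothetical. URLs with a query string are deliberately left untouched, on the assumption that those are the dynamic images.

```python
import re
from pathlib import PurePosixPath


def localise_static_images(xhtml, url_prefix, static_root):
    """Rewrite src="<url_prefix>path" to src="file://<static_root>/path".

    Only plain paths are rewritten; anything with a query string or
    fragment is assumed to be dynamic and is left as a web URL.
    """
    pattern = re.compile(r'(src=")' + re.escape(url_prefix) + r'([^"?#]+)"')

    def repl(match):
        local = PurePosixPath(static_root) / match.group(2)
        return f'{match.group(1)}file://{local}"'

    return pattern.sub(repl, xhtml)


# Hypothetical prefix and directory, for illustration only.
doc = '<img src="http://example.com/static/img/star.gif"/>'
print(localise_static_images(doc, "http://example.com/static/", "/var/www/static"))
```

A regular expression is enough for a sketch like this; a real version would more likely do the substitution where the template generates the image reference in the first place.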

scd commented 13 years ago

On 12/05/2011 11:40 AM, brendanheywood wrote:

Does the background stuff call the web page, which in turn goes through all the templates, or does it call the mason template directly without going via the network? The latter would be much better for performance, the former better for quick debug cycles. We should be able to put file:///blah/static/img.gif URLs into the xhtml and avoid the network for at least the static images.

I'm not really sure what you mean above. The process is as follows:

  • The backend calls the xhtml template, which produces a single xhtml document equivalent to the crag guide html
  • The backend passes that into csstoxls
  • It does some post-processing to tidy up some systemic problems I could not get around
  • It then passes the file to fop, which produces the pdf

And yes we could do the static images as you suggest, but it does not solve the problem with the dynamic images, which we could solve with a temporary directory and some pre-processing.
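Separately from where the images come from, the final fop step in the process above could be given a hard time limit so that a stuck request cannot block the queue forever. A minimal sketch in Python, assuming the backend launches fop as an external process; `run_step` and the file names in the usage line are hypothetical.

```python
import subprocess


def run_step(cmd, timeout):
    """Run one pipeline step and return "ok", "failed" or "timeout"."""
    try:
        # On timeout, subprocess.run kills the child and then raises.
        subprocess.run(cmd, check=True, timeout=timeout)
        return "ok"
    except subprocess.TimeoutExpired:
        return "timeout"
    except subprocess.CalledProcessError:
        # The step exited with a non-zero status.
        return "failed"
```

Usage would look something like `run_step(["fop", "-fo", "guide.fo", "-pdf", "guide.pdf"], timeout=600)`. Only the direct child process is killed on timeout, so if fop is started through a wrapper script the JVM underneath it may need separate handling.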


scd commented 13 years ago

I have made a minor change so that the static stuff (logo, stars, and topo images) comes from the file system.

The more dynamic side still has to be resolved:

  • popularity images
  • montage images
  • avatar images (not currently being used, but probably will be in the final version)

Maybe we can turn them into static images on the fly somehow. The only dynamic image which may actually change for a given URL is the avatar image.
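One possible shape for "static on the fly", sketched in Python: keep a directory of image files keyed by a hash of the URL, and only fetch an image when no copy exists yet. Everything here is an assumption for illustration, including the names `cache_path` and `get_static_copy`, and `fetch`, which stands for any callable that returns the image bytes for a URL. The `always_refetch` flag covers the avatar case, where the image behind a URL can change.

```python
import hashlib
import os


def cache_path(url, cache_dir):
    """Map a URL to a stable file name inside cache_dir."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest)


def get_static_copy(url, cache_dir, fetch, always_refetch=False):
    """Return the path of a local copy of url, fetching it if needed."""
    path = cache_path(url, cache_dir)
    if always_refetch or not os.path.exists(path):
        data = fetch(url)
        # Write to a temporary name first so that a half-written file
        # is never picked up as a valid cached image.
        tmp = path + ".tmp"
        with open(tmp, "wb") as out:
            out.write(data)
        os.replace(tmp, path)
    return path
```

The sketch drops the file extension from the cached name, which may or may not matter to whatever reads the images afterwards.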

brendanheywood commented 13 years ago

Has that given a significant speed increase to the build?

For now I reckon we can live without the avatar images.

cheers Brendan


scd commented 13 years ago

The bulk of the images are dynamic so the full speed increase won't come in until we have solved the dynamic image issue.
