mileszs / wicked_pdf

PDF generator (from HTML) plugin for Ruby on Rails
http://www.mileszs.com/wicked-pdf-plugin
MIT License
3.54k stars 646 forks source link

Possible to keep the html around instead of deleting it? #948

Closed joshuapinter closed 3 years ago

joshuapinter commented 3 years ago

Issue description

If you are creating a PDF with a header and footer, 3 files are generated in the /tmp directory:

  1. header html
  2. footer html
  3. body html

When an error happens, the header html and the footer html are retained but the body html is deleted immediately. This makes it difficult to try the wkhtmltopdf command manually and/or review the body html to see where any issues may be.

Expected or desired behavior

Do not proactively delete the body html either ever, or at least when an error occurs in the wkhtmltopdf command.

System specifications

wicked_pdf gem version (output of cat Gemfile.lock | grep wicked_pdf):

$ cat Gemfile.lock | grep wicked_pdf
  remote: https://github.com/cntral/wicked_pdf.git
    wicked_pdf (2.1.0)
  wicked_pdf!

wkhtmltopdf version (output of wkhtmltopdf --version):

$ wkhtmltopdf --version
wkhtmltopdf 0.12.6 (with patched qt)

whtmltopdf provider gem and version if one is used:

gem "wkhtmltopdf-binary-edge",     "0.12.6"

platform/distribution and version (e.g. Windows 10 / Ubuntu 16.04 / Heroku cedar):

macOS 10.15.7 and Ubuntu 18.

unixmonkey commented 3 years ago

As a troubleshooting mechanism, you can bundle open wicked_pdf, and comment out this line in pdf_from_string

  def pdf_from_string(string, options = {})
    # method body
  ensure
    string_file.close! if string_file # <- comment this out to keep the tempfile
  end
joshuapinter commented 3 years ago

@unixmonkey Beauty! Any interest in a PR to make this an init option?

joshuapinter commented 3 years ago

Or, just using .close here so the file isn't unlinked immediately, and let the OS clean up the tmp directory itself?

unixmonkey commented 3 years ago

@joshuapinter .close! closes and unlinks (deletes) in the same operation:

https://ruby-doc.org/stdlib-2.5.3/libdoc/tempfile/rdoc/Tempfile.html#method-i-close-21

I'm not against a way of preserving these files for debug inspection somehow. I think that might help people figure out why assets don't adequately render better, and if they could load them in a very old Chrome, it could help reveal why some modern CSS & JS stuff does't work either, I'm just not sure the best way to expose that to users.

Maybe something like render pdf: 'mypdf', keep_tempfiles: true, tempfiles_dir: Rails.root.join('tmp')?

joshuapinter commented 3 years ago

@joshuapinter .close! closes and unlinks (deletes) in the same operation:

https://ruby-doc.org/stdlib-2.5.3/libdoc/tempfile/rdoc/Tempfile.html#method-i-close-21

You bet. That's why I was suggesting using close here instead of close! so that the file is closed but not unlinked. Whether this is by default or by an option is up to you.

I'm not against a way of preserving these files for debug inspection somehow. I think that might help people figure out why assets don't adequately render better, and if they could load them in a very old Chrome, it could help reveal why some modern CSS & JS stuff does't work either, I'm just not sure the best way to expose that to users.

Maybe something like render pdf: 'mypdf', keep_tempfiles: true, tempfiles_dir: Rails.root.join('tmp')?

Yeah, something like that would work. Just to confirm, you're against keeping these temp files by default and letting the OS handle their deletion? Just a note on this, it appears that the generated header html and footer html temp files are not deleted automatically. Just the body html temp file is deleted.

Also, I found this after_action method in pdf_helper.rb when digging deeper:

https://github.com/mileszs/wicked_pdf/blob/master/lib/wicked_pdf/pdf_helper.rb#L76

def clean_temp_files
  return unless defined?(@hf_tempfiles)

  @hf_tempfiles.each(&:close!)
end

Any idea what hf_tempfiles contains and if it's something that needs to be handled as well?

Thanks.

unixmonkey commented 3 years ago

@hf_tempfiles are header and footer tempfiles that are created pre-render, so they need to exist before the wkhtmltopdf command gets run.

We clean them up manually in an after_action because if something in memory still holds the reference, they may never get cleaned up by the OS. Maybe there's a better way to handle that, or maybe it doesn't need handled since the next assignment of @hf_tempfiles overwrites it? Or maybe it needs to be explicitly set to nil.

I'm not sure if I'm remembering something from long ago, or just some brain cobwebs, but I think I remember adding that because someone's disk was filling up with thousands of tempfiles. I also think I've also had a report of tempfiles going missing, but I can't easily tell if that's due to a system issue, or something we aren't properly handling in this code. In any case, I haven't heard either complaint in a very long time.

Feel free to investigate and let me know if just closing seems to be side-effect free when creating multiple pdfs or not.

joshuapinter commented 3 years ago

Will do. Thanks for your input @unixmonkey. I'll continue to test this and see how tempfiles are handled after generating a number of PDFs. I'll let you know how it goes.

Thanks again for giving this attention.