wmorgan / heliotrope

A personal, threaded, search-centric email server.

html2text timeout #19

Closed by eaon 12 years ago

eaon commented 13 years ago

Hi,

Just stumbled on heliotrope, love the idea (had similar thoughts recently) and thought I'd give it a go. I discovered that one email in particular from my local Maildir caused html2text to hang during import. A timeout on the html2text run would probably be useful, so that an email that exceeds it gets skipped as bad.

If that makes sense?

tjheeta commented 13 years ago

I had the same problem with large messages, so I ran this in a separate console:

while [ 1 ] ; do killall html2text ; sleep 30 ; done

Not a Ruby guy, but I don't see a way to get the process ID of the html2text subprocess with popen3. I'm not sure if we need stderr, so maybe converting to IO.popen would be fine?
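For illustration, here is a minimal sketch of the IO.popen idea, assuming a current Ruby where Timeout can interrupt a blocking read. The helper name convert_with_popen and its parameters are hypothetical and not part of Heliotrope. The IO object returned by IO.popen exposes the child's process ID through its pid method, which is what makes a targeted kill possible.

```ruby
require "timeout"

# Hypothetical helper (not from Heliotrope): pipe `input` through `cmd`
# and return its stdout, or nil if it takes longer than `seconds`.
def convert_with_popen(cmd, input, seconds)
  IO.popen(cmd, "r+") do |io|
    begin
      Timeout.timeout(seconds) do
        io.write(input)
        io.close_write   # signal end of input to the child
        io.read          # blocks until the child closes its stdout
      end
    rescue Timeout::Error
      # io.pid is the child's process ID; kill only this one process.
      Process.kill("KILL", io.pid)
      nil
    end
  end
end
```

A call such as convert_with_popen(["html2text"], html, 10) would then return nil for a message that hangs, assuming html2text reads its input from stdin. Note that stderr is not captured this way.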

wmorgan commented 13 years ago

I have a solution to this that works on Ruby 1.8 with the system_timeout gem, but that gem doesn't compile (and is supposedly unnecessary) on Ruby 1.9. The stdlib 'timeout' library is supposed to work fine on 1.9, and works for the simple case of Timeout::timeout(5) { system "sleep 10" }, but doesn't work when I call popen3 and read from stdin. This might be a bug in popen3 in Ruby 1.9.

At any rate, you can get the pid via a fourth argument to the block in open3 (see http://ruby-doc.org/stdlib/libdoc/open3/rdoc/classes/Open3.html#M001644), so killing the process might be a solution. I'll continue to play around.
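As a rough sketch of the kill-by-pid approach: on Ruby 1.9 and later, the fourth block argument to Open3.popen3 is a wait thread, and its pid method returns the child's process ID. The helper name run_with_timeout and its parameters are hypothetical and are not taken from Heliotrope. The sketch avoids the timeout library entirely and relies on Thread#join with a time limit.

```ruby
require "open3"

# Hypothetical helper (not from Heliotrope): run `cmd`, feed it `input`,
# and return its stdout, or nil if it runs longer than `seconds`.
def run_with_timeout(cmd, input, seconds)
  Open3.popen3(*cmd) do |stdin, stdout, stderr, wait_thr|
    # Write and read in separate threads so a stalled child cannot block us.
    writer = Thread.new do
      begin
        stdin.write(input)
      rescue Errno::EPIPE
        # The child exited (or was killed) before reading all of its input.
      ensure
        stdin.close
      end
    end
    out_reader = Thread.new { stdout.read }
    err_reader = Thread.new { stderr.read }

    # Thread#join(limit) returns nil if the child is still running.
    finished = wait_thr.join(seconds)
    unless finished
      begin
        Process.kill("KILL", wait_thr.pid)
      rescue Errno::ESRCH
        # The child exited just after the time limit expired.
      end
      wait_thr.join
    end

    [writer, err_reader].each(&:join)
    output = out_reader.value
    finished ? output : nil
  end
end
```

With this shape, an importer could call run_with_timeout(["html2text"], html, 10) and skip the message when the result is nil, assuming html2text reads its input from stdin.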

wmorgan commented 12 years ago

This is a regression in the open3 stdlib between Ruby 1.8 and 1.9. I've filed a bug on Redmine. In the meantime, I've backported the 1.8 implementation directly into Heliotrope as a workaround. So this should be fixed now.