documentcloud / docsplit

Break Apart Documents into Images, Text, Pages and PDFs
http://documentcloud.github.com/docsplit/

"undefined method `strip' for nil:NilClass" occurs when attempting "Docsplit.extract_pdf" #130

Closed mrmanishs closed 9 years ago

mrmanishs commented 9 years ago

I am using Passenger/Nginx.

When I call Docsplit.extract_pdf on a file, I get an "undefined method `strip' for nil:NilClass" error.

It works fine in the console and through "rails server", but the error appears when the app runs through Passenger/Nginx.

Here is the exception trace:

"/var/lib/gems/1.9.1/gems/docsplit-0.7.6/lib/docsplit/pdf_extractor.rb:33:in `libre_office?'"
"/var/lib/gems/1.9.1/gems/docsplit-0.7.6/lib/docsplit/pdf_extractor.rb:128:in `block in extract'"
"/var/lib/gems/1.9.1/gems/docsplit-0.7.6/lib/docsplit/pdf_extractor.rb:120:in `each'"
"/var/lib/gems/1.9.1/gems/docsplit-0.7.6/lib/docsplit/pdf_extractor.rb:120:in `extract'"
"/var/lib/gems/1.9.1/gems/docsplit-0.7.6/lib/docsplit.rb:65:in `extract_pdf'"

Any assistance would truly help; this is blocking me from releasing a feature.

knowtheory commented 9 years ago

Can I ask what OS/Linux flavor you're running on?

Sounds like it's having difficulty finding the extractor you'd use to convert files with.

mrmanishs commented 9 years ago

It's Ubuntu 14.04 Linux.

knowtheory commented 9 years ago

Ah, and I'm guessing you don't have LibreOffice installed then.

mrmanishs commented 9 years ago

LibreOffice is installed. This works fine from "rails console", but when I run it through nginx/Passenger, it falls apart at that point. Both are being run on the same server.

knowtheory commented 9 years ago

Yeah, our general recommendation is that you probably don't want to be doing this in the same thread as your web requests. The only other possibility I can think of is that your Passenger user might not have access to LibreOffice on the command line (or its search path might be different).
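One way to test the search-path theory is to log what the app process itself sees, once from "rails console" and once from a request served through Passenger, and compare the two. The following is a minimal sketch that assumes nothing about Docsplit internals; `process_environment_report` is a hypothetical helper name, not part of any library.

```ruby
require "etc"

# Summarize the environment a process would use when it shells out:
# which user it runs as and which directories are on its search path.
# `env` defaults to the real environment but can be any Hash-like object.
def process_environment_report(env = ENV)
  path_dirs = env.fetch("PATH", "").split(File::PATH_SEPARATOR)
  passwd_entry = Etc.getpwuid(Process.uid)

  {
    # Name of the user the process runs as (nil if it cannot be resolved).
    :user => (passwd_entry ? passwd_entry.name : nil),
    # Directories searched, in order, when a command is run by name.
    :path_dirs => path_dirs,
    # PATH entries that do not exist on disk for this process.
    :missing_dirs => path_dirs.reject { |dir| File.directory?(dir) }
  }
end

# Example: print the report for the current process.
puts process_environment_report.inspect
```

If the report from the web request shows a different user or a shorter (or empty) list of PATH directories than the console does, that would be consistent with the explanation above.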

mrmanishs commented 9 years ago

I’m checking whether I have to specify the LibreOffice path in Passenger, since it cannot find it.

Stay tuned …

mrmanishs commented 9 years ago

Yes, that was it. For those having the same issue: I put "env PATH;" in the nginx.conf file and it works fine now. Closing ticket.
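For reference, a sketch of where that directive would go. By default nginx removes the environment variables it inherits from its parent process (except TZ), and the `env` directive, which is only valid in the main (top-level) context, preserves a named variable. The surrounding block is a placeholder, not the poster's actual configuration.

```nginx
# nginx.conf, main (top-level) context, outside the http { } block.
# Preserve the PATH variable that nginx inherited from its parent process.
env PATH;

http {
    # ... existing http/server configuration, unchanged ...
}
```

Whether the application process then sees that PATH depends on how the app is launched; the report above is that it did under Passenger.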

Aquaj commented 7 years ago

Encountered a similar issue (Rails app, nginx, no Passenger though), and adding env PATH; to the nginx conf didn't solve it.

What ended up fixing it was installing the file command on the server, after seeing a related error message when trying Docsplit.extract_pdf(...) in rails c.

Documenting for any onlooker who is having my issue.
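Since both fixes in this thread came down to an external command the app process could not run, a preflight check can turn the opaque NilClass error into a readable one. This is a rough sketch, not part of Docsplit: the helper names are hypothetical, and the tool list is an assumption (`file` comes from the comment above, and `soffice` is a guess at the LibreOffice executable name, which varies by install).

```ruby
# Return the full path of `command` if an executable file with that name
# exists in one of the directories of `path`, otherwise nil.
def find_executable_on_path(command, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, command)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Return the subset of `commands` that cannot be found on `path`.
def missing_tools(commands, path = ENV.fetch("PATH", ""))
  commands.reject { |command| find_executable_on_path(command, path) }
end

# Assumed tool names; adjust them to match your own installation.
REQUIRED_TOOLS = %w[file soffice]

missing = missing_tools(REQUIRED_TOOLS)
unless missing.empty?
  warn "Not found on PATH for this process: #{missing.join(', ')}"
end
```

Running this inside the web app (for example from an initializer or a test endpoint) rather than only in rails c matters here, because the two can have different search paths.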