Rails: "invalid byte sequence in UTF-8" when calling Docsplit.extract_text

The call that fails is: Docsplit.extract_text(attachment.path, :output => output_dir, :language => 'spa')

I have tried using:

Docsplit.extract_text(attachment.path, :output => output_dir, :language => 'spa', :no_clean => true)
Docsplit.extract_text(attachment.path, :output => output_dir, :language => 'spa', :no_clean => false)
Docsplit.extract_text(attachment.path, :output => output_dir, :no_clean => true)
Docsplit.extract_text(attachment.path, :output => output_dir, :no_clean => false)

but none of the above helps; the call still fails. Many other PDF documents work fine.
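
For context, here is a minimal sketch of what this error usually means in plain Ruby, assuming the text extracted from the PDF contains bytes that are not valid UTF-8. The sample string and the `sanitize_utf8` helper are hypothetical and are not part of Docsplit; I have not confirmed that this is where the exception is raised inside the library.

```ruby
# Hypothetical sample: "Espa\xF1a" is "España" in ISO-8859-1 bytes,
# but tagged as UTF-8, so the byte 0xF1 is not a valid UTF-8 sequence.
raw = "Espa\xF1a".force_encoding("UTF-8")

# Reports whether the bytes are valid for the string's declared encoding.
is_valid = raw.valid_encoding?

# Regex matching on such a string typically raises
# ArgumentError (invalid byte sequence in UTF-8).
error_message = nil
begin
  raw =~ /a/
rescue ArgumentError => e
  error_message = e.message
end

# Hypothetical helper: String#scrub (Ruby 2.1+) replaces invalid bytes
# with the given replacement string.
def sanitize_utf8(text)
  text.scrub("?")
end

clean = sanitize_utf8(raw)

# Alternative, assuming the source bytes really are ISO-8859-1:
# reinterpret them and transcode to UTF-8 instead of discarding them.
recoded = raw.dup.force_encoding("ISO-8859-1").encode("UTF-8")
```

Scrubbing loses the offending characters, while transcoding keeps them but only works if the guess about the original encoding is right.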

My environment: Rails 4.2, Ruby 2.2, Docsplit 0.7.6, tesseract-ocr 3.03, tesseract-ocr-spa 3.02

Any help please?

documentcloud / docsplit