Closed boncey closed 10 years ago
When you pass in options[:html_headers]
to Readability, it just gets passed on to GuessHtmlEncoding.encode
, which appears to only accept a string. See line 45 of guess_html_encoding.rb: https://github.com/cantino/guess_html_encoding/blob/master/lib/guess_html_encoding.rb#L45
So I see two options. One would be for you to pass in a string of headers, joined by \n
characters, like:
headers = headers.map {|k, v| "#{k}: #{v}" }.join("\n")
Alternatively, you could send a pull request to GuessHtmlEncoding that changes line 45 to just:
encoding = guess(html_copy, headers || '')
and then also changes line 12 to:
headers = headers.dup.force_encoding("ASCII-8BIT").gsub(/[\r\n]+/, "\n")
Cool, thanks. I have it working as described in my working example above at the moment - will re-visit when time allows.
I'm using readability with various HTML responses returned from Faraday.
I've just discovered the awesomeness of the 'guess_html_encoding' gem and am trying to get it to work with readability.
This code snippet works fine: require 'readability' require 'guess_html_encoding' require 'faraday'
However, shortcutting by passing the headers directly to readability fails: require 'readability' require 'faraday'
I see this error:
NoMethodError: undefined method 'gsub' for #<Faraday::Utils::Headers:0x007fb2fc155de0>
Faraday::Utils::Headers subclasses Hash by the way.It looks like readability is assuming the headers are a string - the README is unclear on what type it should be but as guess_html_encoding takes a hash then I think readability should too.
Thanks, Darren.