gottfrois / link_thumbnailer

Ruby gem that fetches images and metadata from a given URL, much like the link previews on popular social websites.
MIT License

don't raise on invalid response format while redirecting #120

Closed (maia closed 4 years ago)

maia commented 7 years ago

Currently the method perform_request in link_thumbnailer/lib/link_thumbnailer/processor.rb calls valid_response_format? before following a redirect. As a result, it raises LinkThumbnailer::FormatNotSupported as soon as any single response in the redirect chain does not have a valid response format, e.g. if its Content-Type is empty.

I suggest moving raise ::LinkThumbnailer::FormatNotSupported.new(response['Content-Type']) unless valid_response_format?(response) into the case statement, under when ::Net::HTTPSuccess, so that only the final response is checked.

As an example: https://www.90minuten.at/de/red/magazin/reportage/2017/juni/special-violets--vom-schuhe-binden-und-grenzen-ueberwinden has response['Content-Type']: text/html; charset=utf-8 and redirects to: http://www.90minuten.at/de/red/magazin/reportage/2017/juni/special-violets--vom-schuhe-binden-und-grenzen-ueberwinden/. This second URL has an empty response['Content-Type'] but redirects to https://www.90minuten.at/de/red/magazin/reportage/2017/juni/special-violets--vom-schuhe-binden-und-grenzen-ueberwinden/ with response['Content-Type']: text/html; charset=utf-8. Because LinkThumbnailer stops at the second URL, it never reaches the final page and cannot return an object with information about it.
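To illustrate the proposal, here is a minimal, self-contained sketch that is not the gem's actual code. It walks a canned chain of Net::HTTP response objects shaped like the example above (redirect, redirect with empty Content-Type, success) and checks the format only in the Net::HTTPSuccess branch. FormatNotSupported, valid_response_format?, resolve and build_response are hypothetical stand-ins defined here for illustration, and the "only text/html is valid" rule is an assumption.

```ruby
require 'net/http'

# Hypothetical stand-in for ::LinkThumbnailer::FormatNotSupported.
class FormatNotSupported < StandardError; end

# Assumption for this sketch: only HTML content types count as valid.
def valid_response_format?(response)
  response['Content-Type'].to_s.start_with?('text/html')
end

# Walks a pre-built list of responses as if each redirect led to the next one.
# The format check lives inside the HTTPSuccess branch, so intermediate
# redirects with a missing Content-Type no longer abort the chain.
def resolve(responses)
  responses.each do |response|
    case response
    when Net::HTTPSuccess
      unless valid_response_format?(response)
        raise FormatNotSupported, response['Content-Type'].to_s
      end
      return response
    when Net::HTTPRedirection
      next # follow the redirect without inspecting its Content-Type
    else
      response.error! # raise the matching Net::HTTP error for 4xx/5xx
    end
  end
  nil # the chain ended on a redirect
end

# Builds a response object without touching the network.
def build_response(klass, code, content_type = nil)
  response = klass.new('1.1', code, '')
  response['Content-Type'] = content_type if content_type
  response
end

# Same shape as the chain described above: the middle hop has no Content-Type.
CHAIN = [
  build_response(Net::HTTPMovedPermanently, '301', 'text/html; charset=utf-8'),
  build_response(Net::HTTPMovedPermanently, '301'),
  build_response(Net::HTTPOK, '200', 'text/html; charset=utf-8')
].freeze

final = resolve(CHAIN)
puts final.class
```

With the check placed before the case statement instead, the second hop would raise even though the final page is valid HTML.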

gottfrois commented 4 years ago

Seems to work for me with v3.4.0:

r = LinkThumbnailer.generate('https://www.90minuten.at/de/red/magazin/reportage/2017/juni/special-violets--vom-schuhe-binden-und-grenzen-ueberwinden'); nil
 => nil
2.6.6 :014 > r.title
 => "Special Violets: Vom Schuhe binden und Grenzen überwinden (90minuten.at)"