meyt / linkpreview

Get link preview in python
MIT License

how to make preview from twitter post? #9

Closed MINIMALaq closed 3 years ago

MINIMALaq commented 3 years ago

I want to create a preview from a Twitter link like this, but it gives me this error:
title: JavaScript is not available.
description: We’ve detected that JavaScript is disabled in this browser. Please enable JavaScript or switch to a supported browser to continue using twitter.com. You can see a list of supported browsers in our Help Center.
image: None
force_title: JavaScript is not available.
absolute_image: None

How to use this library for twitter?

meyt commented 3 years ago

Hi @MINIMALaq, you just need to set a crawler User-Agent like this:

from linkpreview import Link, LinkPreview, LinkGrabber

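# Identify as a crawler so that Twitter serves a page with preview metadata
headers = {'User-Agent': 'Facebot'}
url = "https://twitter.com/CSProfKGD/status/1440386526630744073"
grabber = LinkGrabber()
# Fetch the page with the custom headers; get_content returns the content and a URL
content, url = grabber.get_content(url, headers=headers)
link = Link(url, content)
preview = LinkPreview(link)

# Print each preview attribute as "name: value"
print('\n'.join([
    "%s: %s" % (k, getattr(preview, k))
    for k in ('title', 'description', 'image', 'force_title', 'absolute_image')
]))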

Output:

title: Kosta Derpanis on Twitter
description: “Tools to Design or Visualize Architecture of Neural Network

https://t.co/8rzEF4ght6”
image: https://pbs.twimg.com/media/E_1Hsf5UUAItlnS.jpg:large
force_title: Kosta Derpanis on Twitter
absolute_image: https://pbs.twimg.com/media/E_1Hsf5UUAItlnS.jpg:large

UPDATE: Thanks to @pothitos for reporting that the Googlebot user agent no longer works for Twitter links. The example above was changed to Twitterbot/1.0 and then to Facebot.

asmaier commented 3 years ago

Thank you for the hint with the Googlebot UserAgent (for a full list see https://developers.google.com/search/docs/advanced/crawling/overview-google-crawlers). It helps to generate previews for a lot of "problematic" websites.

narayanan-ka commented 1 year ago

> Thank you for the hint with the Googlebot UserAgent (for a full list see https://developers.google.com/search/docs/advanced/crawling/overview-google-crawlers). It helps to generate previews for a lot of "problematic" websites.

So can the Googlebot user agent replace the User-Agent header already defined in the package for practically all websites, or should it be complementary (used alongside the existing user agent, perhaps with an if condition for Twitter)?
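
For illustration, the "if condition for Twitter" idea could look roughly like the sketch below. This is not part of linkpreview: `headers_for` and `CRAWLER_AGENTS` are hypothetical names, and the host list (including x.com) is an assumption based on this thread, not something the library defines.

```python
from urllib.parse import urlparse

# Hypothetical mapping of hosts that need a crawler User-Agent.
# The hosts and the 'Facebot' value are assumptions taken from this thread.
CRAWLER_AGENTS = {
    "twitter.com": "Facebot",
    "x.com": "Facebot",
}


def headers_for(url):
    """Return override headers for hosts listed above, or None for all others."""
    host = (urlparse(url).hostname or "").lower()
    # Treat "www.twitter.com" the same as "twitter.com"
    if host.startswith("www."):
        host = host[len("www."):]
    agent = CRAWLER_AGENTS.get(host)
    return {"User-Agent": agent} if agent else None
```

When `headers_for(url)` returns a dict, it could be passed as `headers=` to `grabber.get_content` as in the example above; when it returns None, the call could simply be made without the override so the package's own user agent is used.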

pothitos commented 1 year ago

Hello! Just FYI, the Googlebot user agent above currently gets a 404 from Twitter (X), but you can successfully use Twitterbot/1.0 as the user agent 😉

pothitos commented 1 year ago

Hello! FYI, unfortunately I get a 404 from Twitter today, even with Twitterbot/1.0 as the user agent.

P.S. Many thanks for this great library!

meyt commented 1 year ago

Thanks @pothitos. Yes, the Facebot user agent is working for now.
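
Since the working user agent has changed several times in this thread, one option is to try a few candidates in order and keep the first that yields a real title. The sketch below is only an illustration under assumptions: `first_working_agent` and the `fetch_title` callable are hypothetical, and the fetching itself is left to the caller so nothing here depends on linkpreview internals.

```python
# Title Twitter returned in the original report when no crawler User-Agent was set
BLOCKED_TITLE = "JavaScript is not available."


def first_working_agent(url, agents, fetch_title):
    """Try each user agent in order; return (agent, title) for the first
    one that gives a usable title, or None if every agent fails.

    fetch_title is a caller-supplied (hypothetical) callable taking
    (url, headers) and returning the preview title as a string.
    """
    for agent in agents:
        try:
            title = fetch_title(url, {"User-Agent": agent})
        except Exception:
            # Deliberately broad for this sketch: any failure
            # (for example an HTTP 404) just moves on to the next agent.
            continue
        if title and title != BLOCKED_TITLE:
            return agent, title
    return None
```

Here `fetch_title` could wrap the `LinkGrabber`, `Link`, and `LinkPreview` calls from the example above and return `preview.title`. Whether a 404 surfaces as an exception or as an empty preview is an assumption to check against the library.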