Hello,
I am encountering a recurring issue with the ntscraper library, specifically in the _get_tweet_link function. It raises TypeError: 'NoneType' object is not subscriptable when the library indexes ["href"] on the result of a lookup that found no element (i.e., find() returned None).
Error Description:
The error occurs in the line return "https://twitter.com" + tweet.find("a")["href"]. This line of code assumes that the find("a") method will always return an element, but in some cases, it returns None, leading to the TypeError when the code attempts to subscript this None value.
During the scraping process, the error occurs intermittently, particularly when a tweet does not contain the expected anchor (<a>) element.
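One possible guard is sketched below. It is only an illustration of the idea, not the library's actual code: safe_tweet_link and FakeTweet are hypothetical names, and FakeTweet is a minimal stand-in for a BeautifulSoup tag so the example runs without any dependency. It assumes that the anchor object supports .get("href"), as BeautifulSoup tags do.

```python
from typing import Optional


def safe_tweet_link(tweet) -> Optional[str]:
    """Return the tweet URL, or None when no usable <a> element exists.

    Hypothetical replacement for the failing line; `tweet` is assumed to
    behave like a BeautifulSoup tag (find() returns an element or None).
    """
    anchor = tweet.find("a")
    if anchor is None:
        # No anchor in this tweet: skip it instead of subscripting None.
        return None
    href = anchor.get("href")
    if not href:
        # Anchor present but without an href attribute.
        return None
    return "https://twitter.com" + href


class FakeTweet:
    """Stand-in for a parsed tweet element (assumption, for the demo only)."""

    def __init__(self, anchor):
        self._anchor = anchor

    def find(self, name):
        # Mimic find("a"): return the stored anchor, or None if absent.
        return self._anchor if name == "a" else None


with_link = FakeTweet({"href": "/user/status/1"})
without_link = FakeTweet(None)

print(safe_tweet_link(with_link))
print(safe_tweet_link(without_link))
```

With a guard like this, a caller could drop tweets whose link is None rather than losing the whole batch to a single TypeError.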
Thank you for your time and effort in maintaining this library.
CODE
import pandas as pd
from ntscraper import Nitter
scraper = Nitter()
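def get_tweets_safe(name, modes, start_date, end_date):
    # Wrap get_tweets (defined elsewhere in my script, not shown here) so that
    # the TypeError raised inside ntscraper does not abort the whole run.
    try:
        return get_tweets(name, modes, start_date, end_date)
    except TypeError:
        print(f"Error processing tweets for: {name}")
        # Fall back to an empty DataFrame with the expected columns.
        return pd.DataFrame(columns=['link', 'text', 'date', 'No_of_Likes', 'No_of_tweets'])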
# Using the function with multiple search terms
start_date = '2023-06-01'
end_date = '2023-07-23'
terms = ["perro sanxe", "perro sanchez", "perro sanche"]
# get_all_tweets is defined elsewhere in my script (not shown here).
data = get_all_tweets(terms, 'term', start_date, end_date)