Open ravindersaluja opened 4 years ago
I will fix it tomorrow. Possibly they changed their API.
I tested a non-API route and it worked. I am coding on my iPhone, so I think the API has changed.
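import json
from requests import Session
from bs4 import BeautifulSoup
URL = 'https://dk.trustpilot.com/review/www.if.dk'
session = Session()
# Fetch the public review page (the non-API route)
r = session.get(URL)
# The 'html5lib' parser requires the html5lib package to be installed
soup = BeautifulSoup(r.text,'html5lib')
# The page embeds its structured data as JSON-LD in a <script> tag
data = soup.find('script',{'type':'application/ld+json'})
print(json.loads(data.getText(strip=True)))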
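The snippet above assumes the page has exactly one JSON-LD script tag and that it parses cleanly. As a rough sketch of a more defensive variant (not what the library does, and untested against the live site), the same idea can be written with only the standard library so that pages with no block, several blocks, or a malformed block do not raise. The names `JsonLdCollector` and `extract_json_ld` are made up for this illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the parsed payload of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self.blocks = []     # parsed JSON-LD payloads, in document order
        self._buffer = None  # text chunks while inside a JSON-LD script, else None

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            text = "".join(self._buffer).strip()
            self._buffer = None
            if text:
                try:
                    self.blocks.append(json.loads(text))
                except json.JSONDecodeError:
                    pass  # skip a malformed block instead of crashing


def extract_json_ld(html):
    """Return a list of all JSON-LD payloads found in an HTML string."""
    parser = JsonLdCollector()
    parser.feed(html)
    parser.close()
    return parser.blocks
```

With the session from the snippet above, this would be called as `extract_json_ld(r.text)`; an empty list then means the page carried no usable JSON-LD, which is easier to detect than an `AttributeError` on `None`.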
@Proteusiq I thought so too and tried scraping with requests and BeautifulSoup. But when I loop over the pages to get all the reviews and then check the website manually in the browser, I find that it detects suspicious behavior from my IP, and I then have to verify that "I am a human".
I see that I can restore it using BeautifulSoup with a sleep function. I will hold off on fixing it until I find out about the legality, since a BeautifulSoup-based scraper will overload their servers if this project is misused.
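For illustration only, the "sleep function" idea could look roughly like the sketch below. It is not the project's code: `fetch_politely`, its parameters, and the 2-second default are assumptions made for this example, and whether any delay is acceptable to the site is exactly the open legality question.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=2.0, sleep=time.sleep):
    """Yield (url, body) pairs, pausing between consecutive requests.

    urls          -- iterable of page URLs to visit (hypothetical; supplied by the caller)
    fetch         -- callable taking a URL and returning the page body
    delay_seconds -- pause inserted before every request except the first
    sleep         -- injectable so the pause can be replaced in tests
    """
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # throttle so requests are spread out over time
        yield url, fetch(url)
```

With the requests session from earlier in the thread, `fetch` could be `lambda url: session.get(url).text`. Because the function is a generator, no request is made until the caller iterates, so a loop can stop early without fetching the remaining pages.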
@Proteusiq Did you try anything further on this?
@Proteusiq The scraping of reviews is no longer working. Calling t.get_reviews() returns an empty defaultdict: defaultdict(list, {}).