abjer / sds2019

Social Data Science 2019 - a summer school course
https://abjer.github.io/sds2019

Ex. 8.2.3 - cannot get any company review links? #27

Open KBaltzer opened 5 years ago

KBaltzer commented 5 years ago

In Ex. 8.2.3 (The Trustpilot case) we are asked to extract links to company reviews from a category page. But when I retrieve the HTML of the category page, it contains no review links (I have tried a number of different categories).

If I go to e.g. https://www.trustpilot.com/categories/consultant and use >Inspect Element< on a company review link, I can easily find the company review links in the HTML. But when I retrieve the page using requests.get() or connector.get(), the response contains none of them.
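To make the comparison concrete, here is a minimal sketch of the check, using only the standard library. The `/review/` path prefix, the helper names and both HTML snippets are assumptions made up for illustration; they are not taken from the exercise or from the real page.

```python
from html.parser import HTMLParser


class ReviewLinkParser(HTMLParser):
    """Collect href values of <a> tags that start with a given path prefix."""

    def __init__(self, prefix="/review/"):
        super().__init__()
        self.prefix = prefix  # assumed prefix of company review pages
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.startswith(self.prefix):
            self.links.append(href)


def extract_review_links(html, prefix="/review/"):
    """Return all matching links found in an HTML string."""
    parser = ReviewLinkParser(prefix)
    parser.feed(html)
    return parser.links


# Made-up stand-in for what the browser inspector shows after scripts have run.
rendered_html = (
    '<div><a href="/review/example.com">Example</a>'
    '<a href="/about">About</a></div>'
)
# Made-up stand-in for a page shell that contains no links until scripts run.
raw_html = '<div id="app"></div><script src="/bundle.js"></script>'

print(extract_review_links(rendered_html))
print(extract_review_links(raw_html))
```

Passing the text of the real response (for example `requests.get(url).text`) to `extract_review_links` instead of the made-up strings would show whether the links are present in what the server actually sends.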

I even tried running the uploaded solutions (the exercise_8_sol notebook) as is, but the code in Ex. 8.2.3-4 simply returns an empty list. If I run the next code block (Ex. 8.2.5-7) and print the randomly drawn company_links, I also get an empty list (again, this is from the uploaded solutions, not my own code).

Am I missing something here?

akaisin commented 5 years ago

I have exactly the same issue...

louisewillerslev commented 5 years ago

I get the same issue - as do the handful of people that I've asked. Have you (the instructors) tried re-running the solution to the exercise? Just to see if it's a general problem. I can say that the problem occurs in several different browsers :-)

kristianolesenlarsen commented 5 years ago

I haven't looked into this very much, but I also get an empty list of results. My best guess as to why this happens is that the Trustpilot website has changed since the question was written. I think you can get the necessary data through their hidden API, though; otherwise you need Selenium to scrape the links.
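As a rough sketch of the hidden-API route: once you have found a JSON response in the Network tab of your browser's developer tools, you can walk it and pick out anything that looks like a review link. I have not checked what the real endpoint or response looks like, so the payload, the key names and the `/review/` prefix below are all hypothetical.

```python
import json


def find_review_links(node, prefix="/review/"):
    """Recursively collect strings starting with `prefix` from parsed JSON."""
    found = []
    if isinstance(node, dict):
        for value in node.values():
            found.extend(find_review_links(value, prefix))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_review_links(item, prefix))
    elif isinstance(node, str) and node.startswith(prefix):
        found.append(node)
    return found


# Hypothetical payload; the structure of the real response is unknown.
sample_payload = json.loads("""
{"businessUnits": [
  {"name": "Example A", "profileUrl": "/review/example-a.com"},
  {"name": "Example B", "profileUrl": "/review/example-b.com"}
], "page": 1}
""")

print(find_review_links(sample_payload))
```

Walking the whole structure instead of indexing specific keys means the sketch does not depend on knowing the exact layout of the response in advance.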

If you don't expect to use Trustpilot data for your exams, I would skip this question.