Open jacobwiberg opened 4 years ago
Hi @jwibmetrics,
If we are interested in the HTML of the given page, we use response.text. This is what we want when the data we are after is visible on the page itself.
If we know that the URL we are requesting returns JSON, e.g. because we found a link that the page calls by inspecting the XHR tab of the browser's network monitor, we use response.json().
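As a rough illustration of that rule of thumb, here is a minimal sketch that picks between the two based on the Content-Type header the server sends back. The helper name `parse_response` and the `FakeResponse` stand-in are hypothetical (they are not from the course material); the stand-in only mimics the `ok`, `headers`, `text` and `json()` parts of a requests response so the example runs without a network connection.

```python
import json


def parse_response(response):
    """Return parsed JSON if the server says it sent JSON, else the raw text."""
    if not response.ok:
        return None
    content_type = response.headers.get("Content-Type", "")
    if "json" in content_type.lower():
        return response.json()  # dict/list built from the JSON body
    return response.text  # raw HTML (or other text) as a str


class FakeResponse:
    """Hypothetical stand-in for requests.Response so the sketch runs offline."""

    def __init__(self, body, content_type, ok=True):
        self.text = body
        self.headers = {"Content-Type": content_type}
        self.ok = ok

    def json(self):
        return json.loads(self.text)


# An XHR-style endpoint that returns JSON, and an ordinary HTML page.
api_reply = FakeResponse('{"jobs": 3}', "application/json; charset=utf-8")
page_reply = FakeResponse("<html><body>Reviews</body></html>", "text/html")

data = parse_response(api_reply)   # parsed into a Python dict
html = parse_response(page_reply)  # left as a string for an HTML parser
```

With a real request you would pass the object returned by `requests.get(url)` instead of the stand-in. Checking the header is only one possible heuristic; some servers label JSON with an unexpected content type, in which case you would call response.json() directly.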
Makes sense, thanks for the quick response!
In the first part of session 6, when scraping Jobindex, we called the .json() method on the 'response' output from our connector function. In the second half, however, when scraping Trustpilot, we frequently used the .text attribute of the 'response' output instead.
if response.ok: d = response.json()
and
if response.ok: html = response.text
While I understand that we're looking for links in the second task and a more complete dataset in the first, both of them are scraping tasks. Is there a general rule of thumb for which of the two to use when scraping websites?