HermanFassett / youtube-scrape

Scrape YouTube searches (API)
MIT License
192 stars, 96 forks

The scraper returns the same results for different page numbers #36

Closed. FashionCStar closed this issue 4 years ago

FashionCStar commented 4 years ago

https://www.youtube.com/results?q=angular&page=1
https://www.youtube.com/results?q=angular&page=2

YouTube returns the same results for both of the URLs above.

According to your code, they should return different results. But when I enter a search query in the search box on the YouTube site and run the search, the result URL doesn't contain a page parameter, right?

I want to get different results for every page number.

HermanFassett commented 4 years ago

I think YouTube has been phasing out the old page query parameter, so it no longer works. The official API appears to use page tokens instead, with each response returning a nextPageToken and a prevPageToken. I haven't yet found anything that would work as a pagination token for queries against the YouTube site itself. I can see a continuationItemRenderer at the end of each page of results YouTube loads, and it contains a token and other data, but I'm not sure whether we can use that externally.
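For comparison, token-based pagination against the official YouTube Data API v3 search endpoint might look roughly like the sketch below. It is not part of this scraper. The helper names search_url and iter_pages are hypothetical, fetch_json is a caller-supplied function that performs the HTTP request and returns the parsed JSON, and an API key is assumed to be available.

```python
from urllib.parse import urlencode

# Search endpoint of the official YouTube Data API v3.
API_URL = "https://www.googleapis.com/youtube/v3/search"


def search_url(query, api_key, page_token=None):
    """Build a search request URL; page_token selects a later page."""
    params = {"part": "snippet", "q": query, "key": api_key}
    if page_token:
        params["pageToken"] = page_token
    return API_URL + "?" + urlencode(params)


def iter_pages(query, api_key, fetch_json, max_pages=3):
    """Yield up to max_pages responses, following nextPageToken.

    fetch_json is a hypothetical callable: it takes a URL and returns
    the decoded JSON response as a dict.
    """
    token = None
    for _ in range(max_pages):
        page = fetch_json(search_url(query, api_key, token))
        yield page
        token = page.get("nextPageToken")
        if not token:
            break  # the API returned no further page
```

Passing the HTTP call in as fetch_json keeps the paging logic independent of any particular HTTP library and makes it easy to try with canned responses.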

HermanFassett commented 4 years ago

Okay, I don't think it's possible to support a page number the way it was done previously. I've done some initial work that should let you get the next page after you've retrieved one page of data.

To do so, the first call takes only a query, e.g. ?q=computers, and returns the results plus two new values: key and nextPageToken. For the second page, you no longer use q; instead, pass key as key and nextPageToken as pageToken in the query URL, e.g. ?key=Abcd&pageToken=Efgh (the real token is much longer). That call also returns the same key and a new nextPageToken, which can then be used to get the next page.
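The two-step flow described above can be sketched from the caller's side as follows. This is only an illustration under the assumptions stated in this comment (a response carrying key and nextPageToken). The helper names are hypothetical, and the draft branch may behave differently.

```python
from urllib.parse import urlencode


def first_page_query(search_terms):
    """Query string for the first call: only the search terms."""
    return "?" + urlencode({"q": search_terms})


def next_page_query(previous_response):
    """Query string for the following page, built from the last response.

    Assumes the response dict has "key" and "nextPageToken", as described
    in the comment above; q is no longer sent.
    """
    return "?" + urlencode({
        "key": previous_response["key"],
        "pageToken": previous_response["nextPageToken"],
    })
```

With these helpers, first_page_query("computers") gives ?q=computers, and a previous response of {"key": "Abcd", "nextPageToken": "Efgh"} gives ?key=Abcd&pageToken=Efgh.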

An update is drafted in branch update-pagination (eb6cc42c050bd1c326b18667c872dac96febe6a9) and may or may not work. Still needs attention.