twitchdev / issues

Issue tracker for third party developers.
Apache License 2.0
Cursor on last page of Streams endpoint leads back to first page, causing infinite loop #93

Closed marcospgp closed 4 years ago

marcospgp commented 4 years ago

When crawling all streams of language de, the cursor on the last page is something like ZXlKeklqb3hMakF6T0RBM05EQTRNelF3TWpNd05EWXNJbVFpT21aaGJITmxMQ0owSWpwMGNuVmxmUT09IA. Following this cursor seems to return the first page again, restarting the whole process and getting the application into an infinite loop.

I believe the IA ending relates to other issues posted previously such as:

https://discuss.dev.twitch.tv/t/streams-still-returning-ia-for-cursor-pagination/23859

https://discuss.dev.twitch.tv/t/get-streams-repeatedly-returning-cursor-ia/23348

For now I am able to stop the crawl once a stream with fewer than 10 viewers is found, but this could fail if there are no such streams, which would send the application into the infinite loop.

BarryCarlyon commented 4 years ago

Duplicate of https://github.com/twitchdev/issues/issues/2

Also, the general consensus has been that the cursor can be used to paginate backwards, so it's up to the developer to determine whether they have looped back to page 1 or not. (There has been a LOT of discussion on this on the Discord.)

marcospgp commented 4 years ago

Thanks for the information. My personal opinion is that this is wrong.

marcospgp commented 4 years ago

@BarryCarlyon How can one verify that the current page is page 1?

BarryCarlyon commented 4 years ago

Given that, by the time you loop back around to page 1, the contents of page 1 are different from the page 1 you originally fetched, a direct comparison won't work.

I'd probably go "well, the first entry on this page has more viewers than the last stream I fetched, so I might be on page 1 again", since the list is generally sorted by viewer count. So if the first stream on page x has 10 viewers and the first stream on page x+1 has 10k viewers, you are back on page 1.
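The viewer-count check described above could be sketched roughly as follows. This is only a heuristic, not an official Twitch mechanism: the function name `looks_like_wraparound` and the `jump_factor` threshold are assumptions made for illustration, and each page is assumed to be the `data` list of a Get Streams response, where every stream carries a `viewer_count` field. Because the list is only generally sorted by viewer count, the check requires a large jump rather than any increase.

```python
def looks_like_wraparound(prev_page, next_page, jump_factor=10):
    """Guess whether next_page is page 1 again (hypothetical helper).

    prev_page / next_page: lists of stream dicts with a "viewer_count" key.
    jump_factor: assumed threshold; how many times larger the first viewer
    count on the new page must be than the last one already seen.
    """
    if not prev_page or not next_page:
        return False  # nothing to compare
    last_seen = prev_page[-1]["viewer_count"]
    first_new = next_page[0]["viewer_count"]
    # Small increases are normal jitter in the ordering; only a big jump
    # suggests the cursor has led back to the top of the list.
    return first_new > max(last_seen, 1) * jump_factor
```

A smaller `jump_factor` catches wrap-arounds on low-traffic languages sooner, at the cost of more false positives when the ordering is noisy.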

But I don't use the Get Streams endpoint to try to get all streams that are live (I don't need this information), due to a long list of reasons that have often been discussed on the Discord.

A streamer can be on page 3 and then appear on page 4 in the time it takes you to load page 3 and start loading page 4. The general consensus tends to be: when you get to fewer than 10 or 15 viewers, stop paginating there, as people jump about all over the place, because a cursor doesn't represent data at a fixed point in time. Page x+1 changes in the time it takes to load page x, extract the cursor, and start loading the next page.
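A minimal sketch of a crawl loop that follows this advice is shown below. It is an illustration under stated assumptions, not a definitive implementation: `fetch_page` is a hypothetical callable, supplied by the caller, that takes a cursor (or `None` for the first page) and returns a `(streams, next_cursor)` pair, where each stream is a dict with `id` and `viewer_count` keys. The loop stops at a viewer threshold, skips streams it has already seen (since streamers move between pages), and adds two extra guards against looping: a repeated-cursor check and a hard page limit.

```python
def crawl_streams(fetch_page, min_viewers=10, max_pages=1000):
    """Collect streams page by page until viewer counts drop too low.

    fetch_page: hypothetical callable, cursor -> (streams, next_cursor).
    min_viewers / max_pages: assumed limits; tune for your use case.
    """
    seen_ids = set()      # stream ids already collected
    seen_cursors = set()  # cursors already followed
    streams = []
    cursor = None
    for _ in range(max_pages):  # hard cap in case no other guard fires
        page, cursor = fetch_page(cursor)
        if not page:
            break
        for stream in page:
            # A streamer can show up on two pages; keep the first sighting.
            if stream["id"] not in seen_ids:
                seen_ids.add(stream["id"])
                streams.append(stream)
        # Stop once the tail of the page falls below the threshold.
        if page[-1]["viewer_count"] < min_viewers:
            break
        # Stop if there is no next cursor or it was already followed.
        if cursor is None or cursor in seen_cursors:
            break
        seen_cursors.add(cursor)
    return streams
```

The repeated-cursor check only helps if the API hands back the same cursor string when it loops, which is not guaranteed, so the viewer threshold and `max_pages` remain the main safeguards.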