justintv / Twitch-API

Attempts to query followers past the first 1600 return a 403 error #320

Closed LlamaChomp closed 8 years ago

LlamaChomp commented 9 years ago

Is this intended? It only started happening tonight, as far as I can tell. Do I need to authenticate somehow to get past this new barrier?

moocat commented 9 years ago

Having the same issue. https://api.twitch.tv/kraken/channels/lirik/follows?direction=DESC&limit=100&offset=50000 with _total: 717677

FugiTech commented 9 years ago

This is not a bug. We've begun limiting the offset on follow endpoints to 1600 due to scrapers causing excessive load. If this is a problem for you please let us know what feature you're building that requires loading follows past 1600.

moocat commented 9 years ago

Obviously this is needed to fetch the full follow list, so I don't have to do 717,677 requests to the API for just Lirik's channel, checking each individual user if they are following. When you have the full one sorted you can easily backtrack and renew the local copy whenever. This is going to be a huge issue.

FugiTech commented 9 years ago

But why do you need the full follow list for a user?

Boblekonvolutt commented 9 years ago

To track the follower status of people without requiring individual calls every single time? The same reason the /follows endpoint exists in the first place?

moocat commented 9 years ago

As I said, you need to have the full follow list to determine if a user is a follower or not, locally. So you don't have to check every user individually with a full request for each one. What I use this for at the moment is keeping track of new followers for follower alerts, and not notifying for old followers. In the future I'd like to use it for a loyalty system where I can reward followers differently than non-followers. I can't keep shooting off thousands of requests every minute just to check who is following or not on an individual level.

EDIT: What I'd have to do now is this: whenever I see a user sending a message in chat or appearing in the userlist, I have to send off a full request for that individual user to check their follow status. I could cache the users who are following, but the ones not following would need a request every time. Would it work for a single channel? Yes. But for thousands? No chance.

At least add a scope so authenticated apps can access this endpoint without those huge restrictions.
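
The local-cache approach described in this comment can be sketched roughly as follows. This is only an illustration in Python: `diff_followers` and the sample usernames are hypothetical, and how each snapshot of the follower list is fetched is deliberately left out.

```python
from typing import Set, Tuple


def diff_followers(previous: Set[str], current: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Compare two snapshots of a channel's follower names.

    Returns (new_followers, unfollowed). Both inputs are plain sets of
    usernames; fetching them from the API is outside this sketch.
    """
    new_followers = current - previous
    unfollowed = previous - current
    return new_followers, unfollowed


# Hypothetical snapshots taken a few minutes apart.
before = {"alice", "bob", "carol"}
after = {"bob", "carol", "dave"}
new, gone = diff_followers(before, after)
print(sorted(new), sorted(gone))
```

With a full snapshot held locally, a follower alert only needs to fire for names in the first set, and membership checks for chat users become set lookups instead of one API request per user.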

Boblekonvolutt commented 9 years ago

This could just as easily be solved by requiring authentication past some arbitrary point, and that wouldn't break the functionality for many people. As would clear and transparent rate limiting and guidelines. Or any targeted countermeasure. A sweeping response that removes it for everyone seems very rash. The point of an API is to provide functionality, and in the case of Twitch, due to its inherent instability and unreliability, it is almost a given that you maintain a local state of the things you care about. If this is a permanent change, it will most likely result in many more API calls, worse performance and less satisfied users.

BarryCarlyon commented 9 years ago

I would also like the full follower list.

We are building an application that grabs a random follower and does something with the name on an on-screen overlay. Now if I am limited to 3200 people, what about the other 300,000 for a large channel?

Rather than hammering us all for scraping, as there was no easy way to get the whole list of followers (quickly), an endpoint is needed.

I echo similar opinions of the others on this thread: in the long run it's likely to result in MORE load on the API as we try to find workarounds.

Edit/Worry: On an unrelated point, is this going to happen to the subscribers end point too? Soon? Edit: You've made it so the casters we work for "are unable to care for all their followers since we the developers cannot get the whole list of followers".

bashtech commented 9 years ago

> Edit/Worry: On an unrelated point, is this going to happen to the subscribers end point too? Soon?

Hopefully not, as this would be a huge blow to a lot more applications than the followers change currently is.

moocat commented 9 years ago

> > Edit/Worry: On an unrelated point, is this going to happen to the subscribers end point too? Soon?
>
> Hopefully not, as this would be a huge blow to a lot more applications than the followers change currently is.

You say that like there is a difference here other than the users being Twitch partners. I like offering my services and features to everyone on Twitch, not just the 0.67% of Twitch that are partnered (10K partners, http://www.twitch.tv/year/2014).

Followers are like subscribers to smaller casters, and 1600 followers for this limit is a minuscule amount, even for casters that have just started on Twitch.

bashtech commented 9 years ago

My point is that there are a lot more applications that query entire subscriber lists than followers and such a change to that API would be /very/ visible.

expertsonline commented 9 years ago

I would like to echo the concerns raised here. If we were to make our applications query the follower status per user, instead of a single per channel local cache, it will significantly increase the number of requests I will have to make. A single refresh on first launch of my application with subsequent local caching ensures I tackle most queries locally.

As for uses, loyalty bot, raffle management and user management layers of my software (DeepBot) use this information to allow streamers to acknowledge, reward and interact with followers.

tl;dr: Should I have to change my code to query per user, it would increase the load on the API rather than reduce it.

BarryCarlyon commented 9 years ago

I theorise that a possible solution is to create a CSV (or JSON) precompiled-file end point.

We, the developers, submit a request for a CSV/JSON file containing a followers dump to be created. Then we poll for when this file is ready and download it.

Similar to how eBay allows you to download all your listings (in the world of eCommerce): I make a download file request, get an email when it's ready, and fetch the file from the link in the email.

That way Twitch controls the load, as it can "lazily" generate the requested files and combine multiple requests for the same file. The only open choices are how Twitch tells the developer their file is ready, and how long a file should be considered "not regeneratable yet". Say an hour, for example: new requests for a file less than an hour old insta-redirect to the existing file (with suitable HTTP codes/JSON packets on the API end point).

Then we as developers request and grab the file as needed and poll the last page as needed. But could the page limit still be bumped to 1000?

This "file request system" can easily be copied over to the subscribers end point too. That means for followers/subs we make 2 requests to the API (up to 5, if we poll the end point every minute until the file is ready), instead of 1 request per 100 records (for either end point).

Edit: This makes it EVEN easier for us to process the response as we don't have to worry about combining pages!
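
From the client side, the request-then-poll flow proposed above might look something like the sketch below. No such end point exists in the thread: `wait_for_export`, the `check_ready` callable, and the one-minute/five-poll defaults are all hypothetical, chosen only to mirror the numbers in the proposal.

```python
import time
from typing import Callable, Optional


def wait_for_export(
    check_ready: Callable[[], Optional[str]],
    poll_seconds: float = 60.0,
    max_polls: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll a (hypothetical) export job until it reports a download URL.

    check_ready() should return the file URL once the dump is generated,
    or None while it is still pending. Gives up after max_polls checks
    and returns None.
    """
    for attempt in range(max_polls):
        url = check_ready()
        if url is not None:
            return url
        # Only wait if another check is still going to happen.
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    return None
```

The `sleep` parameter is injectable so the loop can be exercised without real waiting; in normal use the default `time.sleep` applies.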

iceman50 commented 9 years ago

Why not allow the full follower list to be queried and then sent as a compressed (bz2?) stream? Doing this would ultimately save network resources and (hopefully?) the frustrations everyone is having.

night commented 9 years ago

> Why not allow the full follower list to be queried and then sent as a compressed (bz2?) stream? Doing this would ultimately save network resources and (hopefully?) the frustrations everyone is having.

This probably has nothing to do with networking, and more to do with how databases handle offsets. Typically when you request an offset from a database, it cannot just "jump" to where the offset is. Rather, it has to seek to that offset first, stepping over every row before it. Large offsets can be troublesome for database performance, which is probably why the offsets are now limited.
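
To illustrate the difference, here is a small sketch using Python's built-in `sqlite3`. Nothing in the thread says what storage Twitch actually uses; the `follows` table, its columns, and both helper functions are made up for the example. The offset query has to walk past all skipped rows, while the cursor-style query starts from an indexed key.

```python
import sqlite3

# Hypothetical table of 10,000 follows, newest first by id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE follows (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO follows (id, name) VALUES (?, ?)",
    [(i, f"user{i}") for i in range(1, 10001)],
)


def page_by_offset(offset, limit):
    """Offset pagination: the engine produces and discards `offset` rows."""
    return conn.execute(
        "SELECT id, name FROM follows ORDER BY id DESC LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()


def page_by_cursor(last_seen_id, limit):
    """Keyset (cursor) pagination: seek directly to the key, then read."""
    if last_seen_id is None:
        return conn.execute(
            "SELECT id, name FROM follows ORDER BY id DESC LIMIT ?",
            (limit,),
        ).fetchall()
    return conn.execute(
        "SELECT id, name FROM follows WHERE id < ? ORDER BY id DESC LIMIT ?",
        (last_seen_id, limit),
    ).fetchall()
```

Both functions return the same rows for the same position in the list; the cost of the first grows with the offset, while the second stays roughly constant per page.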

jonsandman commented 9 years ago

When trying to load all my followers, my API gets an access restriction at around 1700 followers. I see this is an issue for quite a few people. I am using my own bot, so I need the full list. Is someone from Twitch looking into this issue?

amclay commented 9 years ago

@jonsandman please read @Fugiman's reply from earlier. Our highest priority is to the users of the twitch.tv website, which unfortunately means if one of our external services is causing issues for normal users to load web pages, that needs to be addressed immediately.

We're internally discussing what we can do to reduce the impact, along with many of the suggestions here, so that we can continue to offer this type of thing to external developers like yourself.

jonsandman commented 9 years ago

@amclay So in the meantime I should wait it out and add good old "nightbot" to my chat? I just find it odd that they would make such a drastic change and not at least give us a heads up about it. I am sure there are some big streamers out there who use their own external developments. I personally make a living on twitch and with this new "addition" I am unable to receive new followers, donations, and completely lose the energy in the chat without the bot. I am not blaming you, just am surprised by the change happening with no words or warnings.

LlamaChomp commented 9 years ago

@amclay I agree your highest priority should be to users of the website. However, you should consider how they're using it. An unannounced major change to API functionality that breaks the bots your users rely on was probably a bad move. The interactive chat experience is a pretty important part of a live event. Average Joe User probably considers the chat bots and third party overlay announcements part of the twitch.tv user experience.

I don't know how bad the excessive load from the scrapers was, but you probably could've given developers a week to adjust to the new paradigm before launching it.

EDIT: My personal adjustment was fairly easy to make, but my bot was spitting out errors last night while I was busy streaming and couldn't investigate the problem. Having to keep telling people I don't know why the bot is broken isn't the kind of production value and professionalism I try to bring to my channel. 15 minutes of warning would've been plenty for me.

gmt2001 commented 9 years ago

If it helps reduce the expensive database calls, would it be possible to create a new /followers endpoint which allows downloading a full follow list again? The reduction in server load could, if your current design allows, be effected by making the output in the format of

{ "_total": 3, "_links": { links here }, "follows": [ "gmt2001", "fugi", "moocat" ] }

It would allow having a full list of followers, and then the more expensive full user info lookup for followers not in the normal /followers endpoint could be performed separately by the application through the /users endpoint.

Additionally, this endpoint could be cached in intervals of say 15 minutes, with no way to bypass the cache.
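
A fixed-interval cache of the kind suggested here could be sketched as below. This is a guess at the shape, not anything Twitch has described: `FollowerListCache`, the `loader` callable, and the 900-second default are hypothetical names and values taken from the "15 minutes" in the suggestion.

```python
import time
from typing import Callable, Dict, List, Tuple


class FollowerListCache:
    """Serve a per-channel name list, rebuilding it at most once per TTL."""

    def __init__(
        self,
        loader: Callable[[str], List[str]],
        ttl_seconds: float = 900.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader      # expensive call that builds the full list
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def get(self, channel: str) -> List[str]:
        now = self._clock()
        entry = self._entries.get(channel)
        # There is deliberately no way to bypass a fresh entry.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        names = self._loader(channel)
        self._entries[channel] = (now, names)
        return names
```

However many clients ask for the same channel within the interval, the expensive `loader` runs once, which is the load-limiting property the suggestion is after.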

scagood commented 9 years ago

Just as a side note, it's a bit confusing when _links.next points past the offset limit. Using lirik as my example, for https://api.twitch.tv/kraken/channels/lirik/follows?offset=1600 the API returns:

{
    "follows": [],
    "_total": 701195,
    "_links": {
        "self": "https://api.twitch.tv/kraken/channels/lirik/follows?direction=DESC&limit=25&offset=1600",
        "next": "https://api.twitch.tv/kraken/channels/lirik/follows?direction=DESC&limit=25&offset=1625",
        "prev": "https://api.twitch.tv/kraken/channels/lirik/follows?direction=DESC&limit=25&offset=1575"
    }
}
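
Until the links are corrected, a client could guard against following a `next` link that cannot succeed. The sketch below is one way to do that in Python; `next_link_is_usable` is a hypothetical helper, and the 1600 cap is an assumption taken from this thread rather than from any documentation.

```python
from urllib.parse import parse_qs, urlsplit

# Assumption from this thread: offsets above 1600 are rejected.
MAX_OFFSET = 1600


def next_link_is_usable(page: dict, max_offset: int = MAX_OFFSET) -> bool:
    """Decide whether it is worth requesting page["_links"]["next"].

    Returns False when the current page is empty, when there is no next
    link, or when the next link's offset is beyond the assumed cap.
    """
    if not page.get("follows"):
        return False
    next_url = (page.get("_links") or {}).get("next")
    if not next_url:
        return False
    query = parse_qs(urlsplit(next_url).query)
    try:
        offset = int(query.get("offset", ["0"])[0])
    except ValueError:
        return False
    return offset <= max_offset
```

For the response quoted above, the empty `follows` array alone is enough to stop, and the `offset=1625` in the next link would stop it as well.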

Geekster-Alan commented 9 years ago

I require a full follower list, as it is the way my bot does Minecraft Server Whitelisting for my Followers Only. It's a rare occurrence to have a Follower Only Server rather than a Sub Server, and I need to pull all my followers to maintain the whitelist. Without this, people can continue to use the server without Following the channel. I hope a solution can be created to get the full follower list.

EfficiencyVI commented 9 years ago

Ehm, we use this feature to check if people are still following, because 90% of all unfollows are made by accident or are "randomly unfollowed by twitch", as we call it. So instead of making 100 requests, I should make 1 request for every single user (=> over 10,000)? I don't know if this is a good solution. It also breaks a lot of follower alerts, as we cannot check if a user followed before.

rbozan commented 9 years ago

If the twitch api developers don't want scrapers to load the server, why don't you make it so you limit the offset for non authorized users, but it's unlimited for authorized users?

belthesar commented 9 years ago

I definitely commiserate with the need to reduce load from API scraping, and am happy to adjust to a new way of polling all followers. I know in my case, the IRC bot I'm maintaining polled the complete follower list at runtime to avoid polling it needlessly and be API resource conscious.

I'm happy to be additionally API conscious by maintaining a local DB instead of regenerating a list every time the bot reloads (which, unless development causes issues is very few and far between) but some sort of mechanism to seed that DB and refresh it on a semi-regular interval would be required at that point. A lazily available list compilation that could only be requested every X hours would be a perfectly fine method of achieving that, but anything you provide that can accommodate that is fine by me.

As a stop-gap, would it be possible for us to get a complete follower list for a channel in the interim? I understand if this isn't a realistic request, but I'd love to start building against a DB, however it's maintained, and having a full seed would be helpful to start that process.

Domali commented 9 years ago

There are a lot of good reasons to have a full follower list. What's even more curious is the fact that there is still pagination with this change. My last 1600 followers are about as valuable to me as my last 100 followers. I'd really like to see some way I can pull down a full list of followers quickly and efficiently. Even if it's a file that can be out of date by like an hour, that is still much better than not being able to see all of your followers. I really hope this issue gets some traction and response =/

WorldSEnder commented 9 years ago

Wouldn't it be possible to reduce the amount of polling required if, instead of the full user info, only the name and a link to the profile were returned? It seems to me that in most cases one only wants to know the name of the follower, and in case more is needed the profile can be requested.

{
"follows": [
    {
        "created_at": "2015-03-06T02:09:14Z",
        "_links": {
            "self": "https://api.twitch.tv/kraken/users/connorhd3/follows/channels/lirik",
            "user": "https://api.twitch.tv/kraken/users/connorhd3/"
        },
        "name": "connorhd3",
     }, //....
], //...
}

Colten45 commented 9 years ago

My 2 cents: I really like scagood's and iceman50's ideas though. A simple array of usernames, then sent via gzip. Any SPDY or HTTP/2 options available?

lunddk commented 9 years ago

You need the full list to compare against if you want to provide notifications to your users in external apps. Currently you have to fetch the subscribers list to see how many exist, but if this information were added to GET /streams (just like followers are), some load could be avoided. Fact is that your API is missing information in some parts, so perhaps the way to go would be to add it so that people won't have to fetch the lists unless the number in GET /streams has changed (please tell me if I'm wrong here).

The information I'd like to see in GET /streams is: sub count, host count and a more exact count of followers (I've noticed that the follower count isn't the same as _total in GET /channels/:channel/follows). In this way, I'll only have to load the full list on load and can settle for only a few results on any upcoming requests.

Hope I wrote this so people can understand what I mean.

CircuitSix commented 9 years ago

Has anyone found a workaround for this? Trying to get a full list of followers has put a halt on my development, as it is required for many features. My need for a full list echoes the needs of most others in this thread.

belthesar commented 9 years ago

The only workaround at this point would be an undocumented API endpoint. Twitch API development has relayed no public progress regarding providing a method for consumers to ingest all followers.

amclay commented 9 years ago

There are no undocumented endpoints for this. We are continuing to work on this issue, but do not have a public ETA on when this would be available again.

BarryCarlyon commented 9 years ago

> The information I'd like to see in GET /streams is: sub count, host count and a more exact count of followers (I've noticed that the follower count isn't the same as _total in GET /channels/:channel/follows).

The _total difference you are seeing is due to a cache difference between the end points. So it's not "broken" really, I would not expect the _total in the channel object of the stream object to be "as up to date" as the _total in the followers object. But overall if you are poking the API enough to spot this difference, I think you are loading too much/many end points at once.

And I don't think Twitch will combine more data this way, especially since there is very limited Host data in Kraken, AND the sub count requires authentication. So that will likely stay on its own end point for efficiency.

Boblekonvolutt commented 9 years ago

So, now that five months have passed, perhaps the API documentation should mention this seemingly permanent change, and the "_links" should not include broken links past the offset limit?

SirRippovMaple commented 9 years ago

Has there been any news on this? Any workaround?

wUFr commented 9 years ago

so I spent 1 hour debugging what I'm doing wrong and now googled this... hmmm strange limitation...

scagood commented 9 years ago

It is exactly as you said, a limitation. It was recently introduced to help reduce server load from people who are constantly requesting the entire list of followers.

If you are looking for an entire list of people following the channel, then I think for the moment you are out of luck.

Hope this helps, Seb

wUFr commented 9 years ago

Well, you could make some simple method as mentioned earlier: a list of just names and the datetime of each follow, and nothing else, so it won't use as many resources getting all this data. Or a downloadable CSV file that's automatically regenerated (for example) every 30 mins, so all your server is going to do is 1 refresh every 30 mins and send the file without any database requests. And people who need to check the latest followers every minute (for notifications etc.) can still use the current way.

Cmon go get some coffee and make it real! ..JUST DO IT! :D

jahollingsworth commented 9 years ago

Would also like to see this work or have a suitable work-around.

RyanTheAllmighty commented 8 years ago

Just wanted to put my hat into this issue and would like to see this fixed back to allowing us to query full follower lists again somehow

FugiTech commented 8 years ago

This is now fixed. We've added a cursor parameter that allows you to continue pagination in the direction you were already going, and is returned in the response as _cursor. We've also updated _links.next to use the new parameter, so anybody using that url will be upgraded automatically.
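
A client that simply follows `_links.next`, as described here, might be structured like the sketch below. It is a minimal illustration only: `iter_follows` and the injected `fetch` callable are hypothetical, the response shape (`follows`, `_links.next`) is taken from the examples earlier in this thread, and stopping on an empty page is an assumption about how the end of the list is signalled.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_follows(
    first_url: str,
    fetch: Callable[[str], Page],
    max_pages: int = 10000,
) -> Iterator[Dict[str, Any]]:
    """Yield follow records by walking `_links.next` from page to page.

    `fetch` takes a URL and returns the decoded JSON page; HTTP details
    (headers, client ID, retries) are left to the caller. Iteration stops
    on an empty page, a missing next link, or after max_pages requests.
    """
    url: Optional[str] = first_url
    for _ in range(max_pages):
        if not url:
            break
        page = fetch(url)
        follows = page.get("follows") or []
        if not follows:
            break
        for follow in follows:
            yield follow
        # The cursor, when present, is already embedded in this URL.
        url = (page.get("_links") or {}).get("next")
```

Because the next URL is treated as opaque, the client does not need to know whether the server is paginating by offset or by cursor.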

macharborguy commented 7 years ago

I do not know if this is intended or not, but occasionally the cursor value is not appearing in the "next" URL. In addition, I am receiving Status Code 400 Bad Request messages again, saying that the "offset" value cannot go above 1600.

I am using the "_links.next" parameter as suggested above. Sometimes the cursor value appears in that URL, sometimes it does not.

For the status code 400 warning about the incorrect offset value, the cursor value WAS part of the URL.

I only noticed this issue today when I restarted our channel bot. We are properly including our client ID in the header.

We only do full checks of our full follower list every 6 hours when our channel is off air, and we do this to make sure we have an up to date list of not only who is following us, but if anyone has unfollowed.

Please look into this issue.

Again, even with the cursor value included in the URL AND using the _links.next parameter from the request results, the warning about the offset value being too high is appearing again.

macharborguy commented 7 years ago

UPDATE: here is the error response in question: {"error":"Bad Request","status":400,"message":"The parameter \"offset\" was malformed: the value must be less than or equal to 1600"}

and here was the URL from the previously received _links.next entry: https://api.twitch.tv/kraken/channels/<channel_name>/follows?channel=<channel_name>&direction=desc&limit=100&offset=1700

Notice the lack of a "cursor" entry.

I amended my code to manually add the cursor value to the end of the URL, potentially doubling up the cursor entry in some URLs. However, now sometimes, almost seemingly at random, the _links.next URL changes the "offset" value back to 100 instead of moving to the next set. I say "seemingly at random" because in testing it has happened at the 300 record mark, 1200 mark, and 1400 mark.
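
For the doubled-up cursor specifically, the parameter can be set through a URL parser instead of being appended as text. The sketch below uses only Python's standard `urllib.parse`; `with_cursor` is a hypothetical helper, and it does not address the offset resetting, which looks like server-side behaviour.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_cursor(url: str, cursor: str) -> str:
    """Return `url` with exactly one `cursor` query parameter.

    Any existing cursor entries are dropped before the new value is
    added, so the parameter can never appear twice. Other parameters
    keep their original order.
    """
    parts = urlsplit(url)
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "cursor"
    ]
    params.append(("cursor", cursor))
    return urlunsplit(parts._replace(query=urlencode(params)))
```

Note that `urlencode` percent-encodes reserved characters in values, so the result may differ textually from the original URL while remaining equivalent.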