twitchdev / issues

Issue tracker for third party developers.
Apache License 2.0

Moderation API for Banned users: can't get all banned users #18

Open BarryCarlyon opened 4 years ago

BarryCarlyon commented 4 years ago

Brief description

The API doesn't let me fetch ALL banned/timed out users for a channel

How to reproduce

Use this script with a valid broadcaster OAuth token and the relevant broadcaster/user ID

const fs = require('fs');
const path = require('path');

const got = require('got');

var users = {};

let page = 0;
function fetchPage(c) {
    got({
        url: 'https://api.twitch.tv/helix/moderation/banned',
        searchParams: {
            broadcaster_id: 'ACasterID',
            after: c,
            first: 100
        },
        headers: {
            authorization: 'Bearer AnOauthForCaster'
        },
        responseType: 'json'
    })
    .then(resp => {
        page++;

        for (var x=0;x<resp.body.data.length;x++) {
            users[resp.body.data[x].user_id] = 1;
        }

        console.log('Page',page,'has',resp.body.data.length,'users',Object.keys(users).length);

        fs.appendFileSync(path.join(
            __dirname,
            'pages',
            page + '.json'
        ), JSON.stringify(resp.body,null,4));
        if (resp.body.pagination && resp.body.pagination.cursor) {
            fetchPage(resp.body.pagination.cursor);
        }
    })
    .catch(err => {
        console.log(err);
    });
}
fetchPage('');

$ node fetch.js
Page 1 has 100 users 100
Page 2 has 100 users 200
Page 3 has 100 users 200
Page 4 has 100 users 200
Page 5 has 100 users 200
Page 6 has 100 users 200
Page 7 has 100 users 200
Page 8 has 100 users 200
Page 9 has 100 users 200
Page 10 has 100 users 200
{ HTTPError: Response code 500 (Internal Server Error)
    at EventEmitter.emitter.on (SOMEWHERE/node_modules/got/dist/source/as-promise.js:109:31)
    at process._tickCallback (internal/process/next_tick.js:68:7) code: undefined, name: 'HTTPError' }

Additionally, pages 2 through 10 are identical.
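One way to confirm that saved pages really are identical, rather than merely the same size, is to compare the user IDs in each payload. A minimal sketch, assuming each page has the `{ data: [{ user_id }] }` shape the script above reads; `samePage` is a hypothetical helper, not part of any Twitch library, and the payloads are made up for illustration:

```javascript
// Hypothetical helper: true when two response bodies list the same user IDs.
// Assumes the { data: [{ user_id }, ...] } shape that the script above reads.
function samePage(a, b) {
    const ids = body => body.data.map(u => u.user_id).sort().join(',');
    return ids(a) === ids(b);
}

// Illustrative payloads only, not real API output.
const page1 = { data: [{ user_id: '33' }, { user_id: '44' }] };
const page2 = { data: [{ user_id: '11' }, { user_id: '22' }] };
const page3 = { data: [{ user_id: '22' }, { user_id: '11' }] };

console.log(samePage(page2, page3)); // true
console.log(samePage(page1, page2)); // false
```

In practice the inputs would be the JSON files the script writes into the pages directory.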

Expected behavior

Page 1 has 100 users 100
Page 2 has 100 users 200
Page 3 has 100 users 300
Page 4 has 100 users 400

and so on till all banned users are fetched

Additional context or questions

Original Post: https://discuss.dev.twitch.tv/t/bug-moderation-api-for-banned-users-is-bugged/23483

mauerbac commented 4 years ago

HLX-1337 ;)

RokuHodo commented 4 years ago

I also observed this behavior and did a little more testing. It seems like the problem might be with first. When I request a page with a low first value, I am able to request all banned users with no issue. It's when I set first to a high value that duplicate pages are returned indefinitely until the 500 - Internal Server Error is returned, which is the same behavior Barry outlined.

Context

Currently, there are 6 banned users in my channel. Whenever I specify first with a value less than 6, I make exactly the expected number of requests and get all 6 users back. For example, when I set first=1, I make 6 requests and get the 6 banned users back. When I set first > 6, I make only one request and still get 6 users back as expected, up to a point. For some reason, when I set first to a high value, I start getting the behavior described above. For me, this happens when first >= 94.

Test 1 - first=93

Request: https://api.twitch.tv/helix/moderation/banned?broadcaster_id=45947671&first=93

This request works fine and I only get the 6 banned users. [Screenshot: devenv_2020-05-16_18-07-14]

Test 2 - first=94

Request: https://api.twitch.tv/helix/moderation/banned?broadcaster_id=45947671&first=94

This test exhibits the infinite duplicate page behavior resulting in the 500 - Internal Server Error. This is the data returned after 5 cycles. [Screenshot: devenv_2020-05-16_18-26-19]
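A threshold like this could be located mechanically by stepping first upward until pagination stops behaving. A rough sketch, assuming a caller-supplied `paginatesCleanly(first)` callback (hypothetical) that runs a full pagination at that page size and reports whether it finished without duplicate pages or a 500. The stub below only mimics the 93/94 boundary reported in this comment; it makes no real requests:

```javascript
// Hypothetical probe: returns the smallest `first` in [lo, hi] for which
// paginatesCleanly(first) resolves to false, or null if every value passes.
async function findBreakingFirst(paginatesCleanly, lo = 1, hi = 100) {
    for (let first = lo; first <= hi; first++) {
        if (!(await paginatesCleanly(first))) {
            return first;
        }
    }
    return null;
}

// Stub standing in for real requests; it mimics the boundary described above.
const stub = async first => first < 94;

findBreakingFirst(stub).then(first => console.log('breaks at first =', first));
// breaks at first = 94
```

The threshold may well differ per channel, so the value found this way should not be treated as a constant.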

JaielZeus commented 4 years ago

At the moment it is just a broken API endpoint, and it has been for the last half a year. I am still waiting for it to get fixed. This should have been tested and evaluated during development before shipping it, tbh. It is such an easy and obvious bug to catch.

BarryCarlyon commented 3 years ago

Tested today

Still broken

node fetch_helix_banned.js
Go
Page 1 has 100 users 100
Go eyJiIjpudWxsLCJhIjp7Ik9mZnNldCI6MTAwfX0
Page 2 has 100 users 200 eyJiIjpudWxsLCJhIjp7Ik9mZnNldCI6MTAwfX0
Go eyJiIjp7Ik9mZnNldCI6MH0sImEiOnsiT2Zmc2V0IjoyMDB9fQ
Page 3 has 100 users 200 eyJiIjp7Ik9mZnNldCI6MH0sImEiOnsiT2Zmc2V0IjoyMDB9fQ
Go eyJiIjp7Ik9mZnNldCI6MTAwfSwiYSI6eyJPZmZzZXQiOjMwMH19
Page 4 has 100 users 200 eyJiIjp7Ik9mZnNldCI6MTAwfSwiYSI6eyJPZmZzZXQiOjMwMH19
Go eyJiIjp7Ik9mZnNldCI6MjAwfSwiYSI6eyJPZmZzZXQiOjQwMH19
Page 5 has 100 users 200 eyJiIjp7Ik9mZnNldCI6MjAwfSwiYSI6eyJPZmZzZXQiOjQwMH19
Go eyJiIjp7Ik9mZnNldCI6MzAwfSwiYSI6eyJPZmZzZXQiOjUwMH19
Page 6 has 100 users 200 eyJiIjp7Ik9mZnNldCI6MzAwfSwiYSI6eyJPZmZzZXQiOjUwMH19
Go eyJiIjp7Ik9mZnNldCI6NDAwfSwiYSI6eyJPZmZzZXQiOjYwMH19
Page 7 has 100 users 200 eyJiIjp7Ik9mZnNldCI6NDAwfSwiYSI6eyJPZmZzZXQiOjYwMH19
Go eyJiIjp7Ik9mZnNldCI6NTAwfSwiYSI6eyJPZmZzZXQiOjcwMH19
Page 8 has 100 users 200 eyJiIjp7Ik9mZnNldCI6NTAwfSwiYSI6eyJPZmZzZXQiOjcwMH19
Go eyJiIjp7Ik9mZnNldCI6NjAwfSwiYSI6eyJPZmZzZXQiOjgwMH19
Page 9 has 100 users 200 eyJiIjp7Ik9mZnNldCI6NjAwfSwiYSI6eyJPZmZzZXQiOjgwMH19
Go eyJiIjp7Ik9mZnNldCI6NzAwfSwiYSI6eyJPZmZzZXQiOjkwMH19
Page 10 has 100 users 200 eyJiIjp7Ik9mZnNldCI6NzAwfSwiYSI6eyJPZmZzZXQiOjkwMH19
Go eyJiIjp7Ik9mZnNldCI6ODAwfSwiYSI6eyJPZmZzZXQiOjEwMDB9fQ

The cursor is displayed in this output.

Can't get beyond 200 users and it gets stuck in a loop
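The cursors in this log decode as base64-encoded JSON, which makes the loop easier to see. A small sketch, assuming only what the logged values themselves show: the cursor format is not documented, should be treated as opaque in real code, and could change without notice. `decodeCursor` is a hypothetical helper for debugging only:

```javascript
// Decode a pagination cursor from the log above. Assumption: the cursor is
// base64-encoded JSON, which is what these logged values decode to.
function decodeCursor(cursor) {
    return JSON.parse(Buffer.from(cursor, 'base64').toString('utf8'));
}

console.log(decodeCursor('eyJiIjpudWxsLCJhIjp7Ik9mZnNldCI6MTAwfX0'));
// { b: null, a: { Offset: 100 } }
console.log(decodeCursor('eyJiIjp7Ik9mZnNldCI6MH0sImEiOnsiT2Zmc2V0IjoyMDB9fQ'));
// { b: { Offset: 0 }, a: { Offset: 200 } }
```

Going by these two values, the offset inside the cursor keeps advancing even though the unique user count stays at 200.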

Using a limit/first of 50, I still get only 200 users, and it loops until page 41, where it returns a 500.

Page 38 has 50 users 200 eyJiIjp7Ik9mZnNldCI6MTc1MH0sImEiOnsiT2Zmc2V0IjoxODUwfX0
Go eyJiIjp7Ik9mZnNldCI6MTgwMH0sImEiOnsiT2Zmc2V0IjoxOTAwfX0
Page 39 has 50 users 200 eyJiIjp7Ik9mZnNldCI6MTgwMH0sImEiOnsiT2Zmc2V0IjoxOTAwfX0
Go eyJiIjp7Ik9mZnNldCI6MTg1MH0sImEiOnsiT2Zmc2V0IjoxOTUwfX0
Page 40 has 50 users 200 eyJiIjp7Ik9mZnNldCI6MTg1MH0sImEiOnsiT2Zmc2V0IjoxOTUwfX0
Go eyJiIjp7Ik9mZnNldCI6MTkwMH0sImEiOnsiT2Zmc2V0IjoyMDAwfX0
{ HTTPError: Response code 500 (Internal Server Error)

JaielZeus commented 3 years ago

Is there any chance this will be fixed soon? Has been open for over 13 months now

Nocturnz commented 3 years ago

I really am hoping that they will reopen this since currently trying to pull all bans is sorta impossible....

BarryCarlyon commented 3 years ago

This issue is still open, btw. The linked issues are closed, but this one is still open.

Nocturnz commented 3 years ago

okay cool

BarryCarlyon commented 3 years ago

How does your code get around the problem? Since the problem is that you can't load all the bans from that API

And all your code does is load all the bans from the API?

virtualdxs commented 3 years ago

My apologies, I misread which issue this was (I thought this was the issue regarding duplicates when fetching the last set). I've deleted my comment.

BarryCarlyon commented 3 years ago

The issue is that you keep getting the same duplicate page from ban 200 onwards, so you cannot obtain all bans for a channel.

virtualdxs commented 3 years ago

Yep, I see that now. The issue my code worked around was even when you have fewer than 200 pages (say 20), you'll get duplicates between the second-to-last and last page (unless you have an even multiple of $pagesize bans), then you'll get the last page repeatedly. Different but related issue.
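A workaround of the kind described here can be sketched as a pagination loop that keys results by user_id and stops once a page contributes nothing new. This is a sketch under assumptions, not the code referred to in this comment: `fetchPage(cursor)` is a hypothetical caller-supplied async function resolving to `{ data: [{ user_id }], pagination: { cursor } }`, and the fake endpoint below exists only to exercise the loop:

```javascript
// Hypothetical pagination loop with a duplicate guard.
// `fetchPage(cursor)` is an assumed caller-supplied async function that
// resolves to a body shaped like { data: [{ user_id }], pagination: { cursor } }.
async function collectBans(fetchPage, maxPages = 1000) {
    const seen = new Map();
    let cursor;
    for (let page = 0; page < maxPages; page++) {
        const body = await fetchPage(cursor);
        const before = seen.size;
        for (const ban of body.data) {
            seen.set(ban.user_id, ban); // overlapping entries collapse by user_id
        }
        cursor = body.pagination && body.pagination.cursor;
        // Stop on the last page, or when a page contributed nothing new.
        if (!cursor || seen.size === before) {
            break;
        }
    }
    return [...seen.values()];
}

// Fake endpoint for illustration: page size 2, three bans, and the last
// page is served again for any later cursor.
const fakePages = {
    start: { data: [{ user_id: '1' }, { user_id: '2' }], pagination: { cursor: 'p2' } },
    p2: { data: [{ user_id: '2' }, { user_id: '3' }], pagination: { cursor: 'p3' } },
};
const fakeFetch = async cursor =>
    fakePages[cursor || 'start'] || { ...fakePages.p2, pagination: { cursor: 'again' } };

collectBans(fakeFetch).then(bans => console.log(bans.map(b => b.user_id)));
// [ '1', '2', '3' ]
```

The guard only prevents an endless loop and removes overlap. It cannot recover bans the endpoint never returns, which is the problem this issue is about.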

BarryCarlyon commented 3 years ago

At 100 bans per page, I can't get past page 2.

virtualdxs commented 3 years ago

Misread your comment; seems I'm a bit hazy today. I initially hit the issue using the default page size (20 IIRC), then later increased my page size to 100, and the largest streamer currently using my app has 114 bans, so I unfortunately (fortunately?) haven't hit that 200 threshold yet.

JaielZeus commented 3 years ago

Well, as I stated in my original post (https://discuss.dev.twitch.tv/t/bug-moderation-api-for-banned-users-is-bugged/23483) about this issue, soon to be 2 years ago and still not solved btw(!!!), you can get up to 1170 unique bans if you only query 3 at a time and use the cursors from that. I didn't bother to check the number for 1 at a time. I tested it on a large channel with a lot of bans.

But that's just not a solution. Twitch, pls fix!

BarryCarlyon commented 3 years ago

This seems to be resolved as of Friday.

If others can test/confirm?

willlllllio commented 3 years ago

Works for me on a channel with about 7300 bans now that previously failed like above.

badoge commented 3 years ago

Still broken for me. I get 4587 users in 46 pages, then the next response is just "data: [], pagination: {}". I know that the response is incomplete because it's missing some users that I manually banned.

BarryCarlyon commented 3 years ago

> I know that the response is incomplete because it's missing some users that I manually banned.

The API/Service won't return record(s) for users that no longer exist on the platform.

I suspect the accounts you think are missing no longer exist, having been reported and terminated.

badoge commented 3 years ago

The banned users are still active so they should appear in the output. Almost all of the bans returned are from recent automated bot ban waves which can be thousands at a time so the users I'm looking for are not returned because the endpoint stops returning data after 46 pages.

BarryCarlyon commented 3 years ago

> automated bot ban waves

You mean by Twitch or using someone else's tool?

If it's Twitch's "automated removal", then those users probably won't be in API outputs. Since Twitch did an "automated bot ban", those users should have been TOS'ed and would not be in API outputs.

badoge commented 3 years ago

There are bots that automatically ban accounts that are known to be spam/bot accounts. What I meant is that all the bans returned by the API are created by this bot, and my own manual bans are so old that the endpoint stops returning data before reaching them.

Emilgardis commented 3 years ago

Do the "manually" banned accounts show up as banned if you do /user <account> in the specific channel?

badoge commented 3 years ago

> Do the "manually" banned accounts show up as banned if you do /user <account> in the specific channel?

yes

BarryCarlyon commented 3 years ago

> What I meant is that all the bans returned by the API are created by this bot, and my own manual bans are so old that the endpoint stops returning data before reaching them.

Ah.

I bet the API can only reliably go back to like August 2014~ish?

I forget the specific date, but there is a "default" date in the /user username card that means "we know they are banned, but we don't have the date when this user got banned". So the "summary" API (this API) can't return them, because Twitch's side only has "incomplete" data to hydrate the response with.

badoge commented 3 years ago

> I bet the API can only reliably go back to like August 2014~ish?

No, the bans I'm looking for are from March of this year.

BarryCarlyon commented 3 years ago

Curses....

I'll have a deeper look at my Pull and run a compare if I get time.

And still broken. But progress.

Most of my usage usually involves a "see if x is banned" rather than needing the full list. But still
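For that "see if x is banned" case, the request can use the endpoint's user_id filter instead of paging through the whole list. A minimal sketch: `bannedCheckUrl` and `isBanned` are hypothetical helper names, the IDs are placeholders, and the response is assumed to have the same `{ data: [...] }` shape as in the script at the top of this issue:

```javascript
// Sketch: build a request URL that asks about one user instead of paging the
// whole list, using the `user_id` query parameter as a filter.
function bannedCheckUrl(broadcasterId, userId) {
    const url = new URL('https://api.twitch.tv/helix/moderation/banned');
    url.searchParams.set('broadcaster_id', broadcasterId);
    url.searchParams.set('user_id', userId);
    return url.toString();
}

// Hypothetical helper: interpret a response body of the usual { data: [...] } shape.
function isBanned(body, userId) {
    return body.data.some(entry => entry.user_id === userId);
}

// IDs below are placeholders, not real accounts.
console.log(bannedCheckUrl('1234', '5678'));
// https://api.twitch.tv/helix/moderation/banned?broadcaster_id=1234&user_id=5678
console.log(isBanned({ data: [{ user_id: '5678' }] }, '5678')); // true
console.log(isBanned({ data: [] }, '5678')); // false
```

The request itself still needs the same authorization header as the full-list call.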

pixelsuchter commented 3 years ago

Can confirm, getting the full list doesn't work. For me it stops after about 6135 names. If I unban some, I get a partially different list of similar length.

travistyoj commented 3 years ago

Somehow the original ticket was marked as solved, but I've reopened it and we are actively investigating.