Chocobozzz / PeerTube

ActivityPub-federated video streaming platform using P2P directly in your web browser
https://joinpeertube.org/
GNU Affero General Public License v3.0

Banning a user should block their IP #1116

Closed Nutomic closed 4 years ago

Nutomic commented 6 years ago

This would be very useful to prevent spammers from creating multiple accounts. See the discussion at https://soc.ialis.me/@nutomic/100782082492693894

It would also be useful if Peertube showed the signup IP and the last used IP address.

Edit: In case people try to go around this, it might be useful to also block signups from known VPNs and Tor exit nodes (or require mod approval for such accounts).

rigelk commented 6 years ago

Binding account ban and IP ban is overkill and not exempt from side effects. We can add a new kind of ban that would also ban the IP, though.

We won't hardcode IP ranges to block within PeerTube. That would need updating often and we cannot maintain that. Fortunately, a flexible solution already exists for instance admins: filtering registrations by CIDR range has been a feature since https://github.com/Chocobozzz/PeerTube/pull/573
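To illustrate what filtering registrations by CIDR range means in practice, here is a minimal sketch using Python's standard `ipaddress` module. It is not PeerTube's implementation: the `BLOCKED_RANGES` list and the `signup_allowed` function are hypothetical names, and the ranges are reserved documentation networks.

```python
import ipaddress

# Hypothetical blocklist (assumption): these are reserved documentation ranges.
BLOCKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def signup_allowed(client_ip: str) -> bool:
    """Return False when client_ip falls inside any blocked CIDR range."""
    address = ipaddress.ip_address(client_ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixed lists are safe to scan.
    return not any(address in network for network in BLOCKED_RANGES)
```

A single address can be expressed in the same scheme as a `/32` (IPv4) or `/128` (IPv6) range, so one mechanism covers both single-IP and range blocks.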

Now, storing the signup IP and last used IP address could be a thing.

Nutomic commented 6 years ago

Maybe the ban could be optional with a checkbox. And I'm not talking about IP ranges, just the single IP where the user signed up from.

I'm not sure if the config value is enough, as it can only be changed by the admin, and probably requires a Peertube restart.

zicmama commented 5 years ago

Could it be done with a fail2ban rule based on the logs?

Nutomic commented 5 years ago

@zicmama I don't think so. The only way fail2ban could tell is if multiple users sign up from the same IP in some time interval. But the spammers only create a few accounts per day, and such behaviour could also be legitimate (e.g. users from a university network).
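The heuristic described here, counting signups per IP within a time interval, could look roughly like the sketch below. It is a generic sliding-window counter, not fail2ban's actual filter logic; the class name and the thresholds are assumptions chosen for illustration.

```python
from collections import defaultdict, deque


class SignupRateWatcher:
    """Flag an IP once it exceeds max_signups within window_seconds."""

    def __init__(self, max_signups: int, window_seconds: float):
        self.max_signups = max_signups
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # ip -> timestamps of recent signups

    def record(self, ip: str, timestamp: float) -> bool:
        """Record one signup and return True if the IP is now over the limit."""
        events = self._events[ip]
        events.append(timestamp)
        # Drop signups that have fallen out of the window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        return len(events) > self.max_signups
```

With a limit of three signups per hour, a spammer who registers a few accounts per day never trips the rule, while a busy shared network might, which is the weakness pointed out above.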

XenonFiber commented 5 years ago

And I'm not talking about IP ranges, just the single IP where the user signed up from.

@Nutomic I think that was about the "blocking VPNs and Tor" part. People use those tools every day, and blocking or requiring mod approval for signups from them would be /very/ annoying.

Nutomic commented 5 years ago

@XenonFiber Ah right, that was just an idea, not really part of this issue. Hopefully blocking the signup IP will be enough (and displaying the signup and last used IP).

ghost commented 5 years ago

If we do have an IP blocking system built-in, it would be useful for that to have:

  1. An explicit duration: IP blocks, even in firewalls, usually have a time limit, since IPs are reassigned.
  2. A list of activities that this IP is prevented from, e.g. registering an account, logging in, watching a video, seeding a torrent, etc.
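The two requirements in the list above could be modelled roughly as follows. This is only a sketch under stated assumptions: the `Activity` names and the `IpBlock` type are hypothetical and do not correspond to anything in PeerTube.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Flag, auto


class Activity(Flag):
    """Hypothetical set of actions a block can deny."""
    REGISTER = auto()
    LOGIN = auto()
    WATCH = auto()


@dataclass(frozen=True)
class IpBlock:
    ip: str
    denied: Activity      # which activities this block covers
    expires_at: datetime  # every block carries an explicit end time

    def denies(self, ip: str, activity: Activity, now: datetime) -> bool:
        """True only for the blocked IP, a covered activity, and before expiry."""
        return ip == self.ip and now < self.expires_at and bool(activity & self.denied)
```

Because `expires_at` is a required field, a permanent block cannot be created by accident; it would have to be requested with an explicit far-future date.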

I'd also be happy with integrating an external IP blocking solution (something that can run on a separate firewall) as long as PeerTube can administer a list or lists that this firewall can read.

McFlat commented 5 years ago

I think this can be handled at the nginx level without adding so much functionality to the app. If the site runs NGINX Plus, we could interface with the API it provides; with plain nginx, we can add the IP to an nginx conf file that stores the blocked addresses and then reload nginx, or have that happen automatically. Here are a bunch of articles showing how it has been done before. Maybe we can come up with a sensible combination of these approaches.

NGINX PLUS

NGINX
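For the plain nginx route, the file of blocked addresses could be generated along these lines. This is a sketch, not an existing tool: `render_deny_rules` is a hypothetical helper, and where the output is written and how it is pulled in (for example through nginx's `include` directive, followed by a reload) is left to the admin. The only nginx feature relied on is the `deny` directive of the access module.

```python
import ipaddress


def render_deny_rules(blocked_ips: list[str]) -> str:
    """Render one nginx `deny` directive per unique, validated address."""
    rules = []
    for raw in sorted(set(blocked_ips)):
        # ip_address() raises ValueError on malformed input, which keeps
        # arbitrary text from being written into the nginx configuration.
        address = ipaddress.ip_address(raw)
        rules.append(f"deny {address};\n")
    return "".join(rules)
```

Validating each entry before rendering matters here, because the list would ultimately come from a moderation UI rather than from a trusted config author.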

jtracey commented 5 years ago

Just to point out: IP blacklisting disproportionately affects some populations with collateral damage. Tor and VPNs have been pointed out already, but there's also the fact that certain regions and countries have far fewer IPv4 addresses per capita than others, and NAT and IP sharing don't appear to be going away for the foreseeable future. The entire country of Vietnam, for example, is behind one public-facing IP. While I understand IP blacklisting is the norm in a lot of places, this is now often considered to be a flaw, not a successful model of identity. The point being, making this sort of action readily available is, whether you want it to be or not, an implicitly political choice, as I'm guessing it's an issue most sysadmins aren't aware of (and therefore don't understand the choice they're making when saying "sure, I'll take the extra precaution of blocking this person's IP"). Maybe you think this cost is worthwhile, but I wanted to be sure the developers are aware of just some of the ramifications this design decision has.

ghost commented 5 years ago

Great points, @jtracey. I'm not sure what solutions to bad actors ultimately make sense for PeerTube, but it's definitely true that we shouldn't overuse IP blocks. If we include IP blocks at all, I'd expect them to be short-term only and to be used as a last resort, and we should make the software opinionated about this, maybe by avoiding UI options for permanent IP bans and by including a whitelist of known large-scale NATs.

That said, the problems we currently see admins using IP blocks for are things like account creation, no? How much of that need could be replaced with a robust captcha system, or some other way to raise the bar for spammers without inconveniencing legit users too much?

Laurelai commented 5 years ago

IP blocking works well in most use cases. This should not be that difficult, and the utter lack of other solutions, or of even seeming to care about stopping bad actors, is why a lot of people are turned off of this project. There are only so many ways to somewhat identify bad actors, and unless you want to get into invasive tactics like Flash cookies you might as well use what works most of the time.

ghost commented 5 years ago

@Laurelai, the point about large-scale NAT is really important here. I think at the very least we should prevent entire countries from accidentally being blocked. Of course moderation and filtering tools are high-priority right now. I have no say in prioritization, I'm just trying to sort through the options available to us in this particular thread, and we don't need to fight about how many people think PeerTube sucks for what reasons, at least not right here.

If we do IP blocks, we should try to combine them with other factors- make them time limited, make the IP block disallow anonymous users rather than blocking access entirely, and we could do some (not foolproof, but better than nothing) client fingerprinting using user agent and possibly other HTTP request parameters.

Many internet services now have some notion of IP reputation or client reputation, but it's used to adjust the level of paranoia the system uses (asking for passwords more, requesting CAPTCHAs, manual review) rather than disallowing access entirely necessarily.
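The idea of reputation adjusting the level of paranoia, rather than gating access outright, can be sketched as a simple mapping from score to friction. The 0-100 scale, the thresholds, and the `Challenge` names are all assumptions made for illustration.

```python
from enum import Enum


class Challenge(Enum):
    NONE = "none"
    CAPTCHA = "captcha"
    MANUAL_REVIEW = "manual_review"


def challenge_for(reputation: int) -> Challenge:
    """Map a 0-100 reputation score (100 = fully trusted) to a friction level."""
    if not 0 <= reputation <= 100:
        raise ValueError("reputation must be between 0 and 100")
    if reputation >= 70:   # assumed threshold
        return Challenge.NONE
    if reputation >= 30:   # assumed threshold
        return Challenge.CAPTCHA
    return Challenge.MANUAL_REVIEW
```

Note that even the worst score still leads to a review step rather than a refusal, which is the difference from a hard IP block.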

Obviously fancy stuff like that takes time to implement and a potentially heavy-handed IP block is better in the short term than nothing; I just want to make sure we also respect @jtracey's comment about the asymmetry of this solution, and look for something better as soon as practical.

Laurelai commented 5 years ago

I've been moderating communities for a very, very long time and I've found it necessary to block entire countries before. Deliberately. I once banned the entire country of Romania from an IRC server because of how many IPs they have available for bad actors. PeerTube is federated and it's not that big of a deal if someone gets banned from an instance; they can make their own, or view videos on someone else's. The harm is actually limited. If, say, some nazi decides to abuse the fact that a wide area only has one real IP, well, that sucks and I'm sorry, but that nazi has got to go. Client fingerprinting doesn't tell you much, since the same bad actor can switch browsers with ease, or use their mobile device on the same wireless network to evade it. Even if you include softer methods of moderating, you should also keep the option of an IP ban, as well as code in more drastic measures to stop bad actors. Until you do these things it will become a haven for bad actors to exploit.

McFlat commented 5 years ago

Well, sure, blocking simply by a person's IP is kind of funky. What needs to happen is that we use multiple data points to decide. What if we also use the browser user agent on top of the IP address to determine whether it's specifically that user who is the bad actor? I would recommend against cookies that stick around forever, like Flash cookies or the evercookie: we want to give users their privacy, but we don't want them to abuse the system either, right?
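Combining the IP address with the user agent into one block key might look like the sketch below. The function name and the normalisation steps are assumptions; the only library call is SHA-256 from Python's standard `hashlib`.

```python
import hashlib


def client_fingerprint(ip: str, user_agent: str) -> str:
    """Derive a stable block key from the IP address plus the user agent."""
    # Normalise lightly so trivial whitespace or case changes do not
    # produce a different key (assumed policy).
    normalized = f"{ip.strip()}\n{user_agent.strip().lower()}"
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

Two people behind the same NAT with different browsers get different keys, so blocking one does not block the other. The same property is also the weakness: switching browsers yields a new key, so this narrows a block rather than strengthening it.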

If a bad actor switches devices, say from a computer to a mobile phone, their IP address changes too, unless both devices are on the same network (e.g. the same Wi-Fi).

If we use captchas, let's please stay away from reCaptcha, because reCaptcha feeds the whole Google AI system to better recognize things, so then the robots can use that data to recognize and understand things. Let's keep the captcha simple without feeding evil AI bots if we can.

ROBERT-MCDOWELL commented 5 years ago

Morality, let's use peerTube to rebuild a friendly humanity... ;)

ghost commented 5 years ago

Of course some admins will block half the world on their firewall and that's their right. And IP seems like one of possibly multiple useful factors to consider in a last-resort system for blocking bad actors. Just saying that the defaults here should not encourage behavior that will result in users from countries with fewer IPs available being disproportionately excluded from the entire network. If we can agree on this, let's work on the details of how that gets implemented.

McFlat commented 5 years ago

Here are some IP reputation libraries on NPM

Why not make a user give their phone number and get a code via SMS? Once they validate with the code, a cookie is set and they can use the site; if the cookie isn't set, they are asked to enter their phone number to continue. We don't need their name or other info, just their phone to validate that they are real. We could turn this on per bad-actor IP address: in the admin we flag a bad actor and, boom, that user sees a phone number entry field and has to validate that they are real to keep using the site.

ghost commented 5 years ago

@McFlat cool. Dunno if lists like spamhaus and rbl will be at all useful here (we're fighting social pests rather than SMTP spammers here) but a framework like iprepd might be useful here, if we don't mind adding a daemon for the purpose.

McFlat commented 5 years ago

@scanlime yeah spamhaus and rbl are wrong use case for this, but I'm sure we can come up with a real nice way to validate a bad actor won't be a bad actor for long, just make them input their phone number when we flag them as a bad actor and it will SMS them a code to input in the site that they see after giving their number, then it sets a cookie. This way if there are multiple people on the same IP some can be bad actors and some not, we ask all of them for their phone and those who validate can use the site, and those who don't validate can't, and if those who validate and keep on acting bad we block their number from being able to continue using the site or getting a code to validate. Problem solved, yeah?
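The code-and-validate step of this flow could be sketched as below. Sending the SMS and setting the cookie are deliberately left out; `CODE_TTL_SECONDS`, `issue_code` and `verify_code` are hypothetical names, and the ten-minute validity window is an assumption.

```python
import hmac
import secrets

CODE_TTL_SECONDS = 600  # assumed validity window for a code


def issue_code() -> str:
    """Generate a random six-digit one-time code."""
    return f"{secrets.randbelow(10**6):06d}"


def verify_code(expected: str, issued_at: float, submitted: str, now: float) -> bool:
    """Accept the submitted code only if it matches and has not expired."""
    if now - issued_at > CODE_TTL_SECONDS:
        return False
    # Constant-time comparison on bytes avoids leaking how many digits matched.
    return hmac.compare_digest(expected.encode(), submitted.strip().encode())
```

A real deployment would also need to limit attempts per number, since a six-digit code is guessable if retries are unlimited.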

McFlat commented 5 years ago

I think we can stay away from those evil captchas, that google will be using later for the AI bots to go on killing sprees, I hope you all agree. We can come up with a much better solution than those losers.

ghost commented 5 years ago

Yes yes, I hope we can do better than the google captchas, but some form of strong captcha does seem useful. Phone number validation has its own problems, but I like your idea of using it as a fallback in case the IP itself has a mixed reputation.

There are probably a million captcha libraries out there and I've even written one in a previous life, but I suspect the vast majority of them aren't even slightly safe against modern neural nets and GPUs. But we can go looking for something that isn't so bad. Maybe there's even an open source alternative to ReCaptcha that uses this tiny amount of human labor to produce datasets for open source ML software.

McFlat commented 5 years ago

Yeah, that would be good. I'm sure we can find something that will do the trick, anything other than the Google services, which will, one, spy on everyone loading the library and, two, collect data about how to kill their kids. We could for sure use a mixture of all these. Even if a bad actor has multiple phone numbers, or there are multiple different bad actors on the same IP, we can cap the number of bad acts tolerated from that IP, i.e. the number of times a phone number was given from that IP, and then just ban the IP for a day or two. If it happens again after that, ban the IP for a week, and keep escalating the ban time, notifying them that if they keep misbehaving they can't use the site.
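The escalation described here (a day or two, then a week, then longer) amounts to a lookup on the number of previous bans. In this sketch the first two steps follow the comment; the 30-day final step and the names `BAN_LADDER` and `ban_duration` are assumptions.

```python
from datetime import timedelta

# First two steps follow the comment above; the last one is an assumed cap.
BAN_LADDER = [timedelta(days=2), timedelta(weeks=1), timedelta(days=30)]


def ban_duration(previous_bans: int) -> timedelta:
    """Return the ban length for an IP that has been banned previous_bans times."""
    if previous_bans < 0:
        raise ValueError("previous_bans cannot be negative")
    # Repeat offenders stay on the last rung instead of growing without bound.
    return BAN_LADDER[min(previous_bans, len(BAN_LADDER) - 1)]
```

Capping the ladder keeps every ban time-limited, which fits the earlier point that IPs get reassigned.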

We should avoid doing things the way GitHub does, where you paste one link and that's it, it's spam, no more code for you. I think they wrongfully flag accounts as spam; they don't even know what spam is, and they flag regular messages as spam when they aren't advertising anything. We shouldn't reclassify things; we should use the classification that actually applies. If it's a bad link in the comments, OK, flag it for bad links, don't flag it as spam, because it's not spam. I think the programmers just got lazy or have been given up to a deprived mindset.

Next who know they will start saying things like molesting children is normal and we should pass laws to allow it. That's where it's going, these people are sick in the head. We should keep things on solid ground and don't go into fantasy land just because others are doing it.

That's the main reason for creating something like PeerTube in the first place, because the big wigs will do anything for the money and power, the power to control people who don't know any better. If it wasn't for that we wouldn't even be building this platform.

XenonFiber commented 5 years ago

Just wanted to jump in here and mention that CAPTCHAs are inherently ableist. Even a non-Google solution would exclude people who're visually impaired.

It's important to keep this in mind.

McFlat commented 5 years ago

@XenonFiber In that case there is a sound captcha, but I haven't used it before and am not sure if it's multilingual https://www.npmjs.com/package/avs-captcha

Captchas are really annoying, and we can avoid them altogether. Whoever made that junk up probably regrets it after all these years. Not only does it NOT stop bots, it wastes time and resources and frustrates real users.

McFlat commented 5 years ago

What we could do to validate real users, in case they share an IP, is allow them to use a two-factor authentication method, not for logging into an account but to validate that they are who they are. They don't even have to have a user account: they would get an SMS code, or they could use a Yubico key that would say that they are them https://www.yubico.com/

That's about as close as it gets to NOT having an RFID chip implanted under the skin, to validate person. BTW don't get the chip, it's the mark of the beast.

ldidry commented 5 years ago

Even a non-Google solution would exclude people who're visually impaired.

Not necessarily. A captcha can be something as easy as "What is the result of 1+3?". Or a CSS/ARIA-hidden email field: if it's not empty, that's a bot.
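The hidden-field idea is usually called a honeypot. A minimal server-side check might look like this; the field name `contact_email` and the function name are hypothetical, and the form is assumed to arrive as a plain dictionary of strings.

```python
def looks_like_bot(form: dict[str, str], honeypot_field: str = "contact_email") -> bool:
    """Treat a submission as automated if the hidden honeypot field was filled in."""
    # Humans never see the field (it is hidden via CSS/ARIA), so any
    # non-blank value most likely came from a script filling every input.
    return bool(form.get(honeypot_field, "").strip())
```

This only catches generic form-filling scripts; a script written specifically against the form can simply leave the field empty.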

ghost commented 5 years ago

I'm in favor of non-traditional CAPTCHAs, but it's important to understand the actual threat model they're designed to solve, which is usually spammers who would be writing a script to automatically create accounts for your particular service. In the case of a question like "1+3", that makes a poor captcha because if they know the format of the questions, it's easy to write code to answer the question. Google's captcha works on the principle that the problem should be hard for software to solve even if you know in advance what kind of questions you might be asked.

It's also a great point that CAPTCHAs have severe accessibility problems, and there's often an ableist bias in any solution which tries to sort users into broad categories of normalcy and exceptionality. I'd be a fan of other potential solutions, as long as we keep a realistic threat model in mind as well.

The adversaries CAPTCHAs would potentially help against are bot admins who post spam to many servers automatically. I'm not sure how often this currently happens, but I've been seeing some commercial spam and it's reasonable to expect that they have some tool support or will soon. This tooling is going to be specific to PeerTube, so we assume that they know exactly what the code looks like. Something that's hidden to the browser doesn't help, and something like the "1+3" example is just a minor programming hurdle unless the problem you're asking to solve is actually hard for software to solve.

Tbh if I take a step back and think about what I'd want as an admin, I'd rather have people write me a 200 character description of themselves or why they want the account than have them solve a captcha. Then I'll manually approve them or not. Maybe this doesn't scale, but I'm not sure how important that is for individual instances of a federated network like this, since I'd hope we have more moderators per capita than YouTube by design.

McFlat commented 5 years ago

I'm in favor of people writing a description of why they want an account over a captcha. It's on a personal level that way, and we wouldn't have to tell people to send an email to an email address to ask for an account. They would basically fill out the registration form and on the next step they would see a description field to fill out why they want an account. Based on that we can see how to act: most spam bots write text that doesn't make any sense, or the text would be the same message over and over again, but in the case of a real human being we could spot them and act accordingly. We could also use that as a way to know if the person plans to upload videos or if they just plan to comment on uploaded videos, etc. Maybe we could have a group of checkboxes they can tick to let us know what they would like the account for. This of course would be required only if, say, the admin locks things down so that accounts from other federated instances can't comment on the videos, etc.

madscientist42 commented 5 years ago

This whole notion presumes one of several things:

1) That IP Address == User. FAIL.

2) That IP Address == The same machine. FAIL

3) That IP Address != More than one machine. FAIL.

IP Addresses can be assigned via DHCP and to either a single machine or a non-routeable IP subnet. They can be any number of machines and only correspond to a given machine for the duration of active traffic.

The fact that you're even CONTEMPLATING this as anything valid is a bad, bad thing for this project.

ghost commented 5 years ago

@madscientist42 With all respect, this issue is much more nuanced than your comment would indicate. It's true that there are many important cases that IP blocks will completely fail to help with. Having zero reliance on network-level indicators, however, is not a realistic scenario I've seen work for any open organization that has bad actors to block. Do you have any counterexamples?

Laurelai commented 5 years ago

This whole notion presumes one of several things:

  1. That IP Address == User. FAIL.
  2. That IP Address == The same machine. FAIL
  3. That IP Address != More than one machine. FAIL.

IP Addresses can be assigned via DHCP and to either a single machine or a non-routeable IP subnet. They can be any number of machines and only correspond to a given machine for the duration of active traffic.

The fact that you're even CONTEMPLATING this as anything valid is a bad, bad thing for this project.

It's very clear you have not had to manage any communities of significant size or ones hosting marginalized people. IP blocking works in a lot of cases, and is a needed tool. The fact that this is even a debate is what's bad.

ghost commented 5 years ago

The fact that this is even a debate is what's bad.

Fwiw there's a lot of nuance being lost here, so perhaps that's why the debate is necessary at the moment.

McFlat commented 5 years ago

Yeah, if we don't talk about it, how are we supposed to think of all the possibilities? After all, that's the benefit of working on a project as a team or a community instead of just locking yourself in a shed and knocking it out: for the most part, projects become much better when they're a team or community effort. Probably the biggest reason why Microsoft opted to open source their projects is that they started to see that even if they try to do it all themselves behind closed doors, it's not as effective as having wider community engagement. Now you see more and more tools coming out from Microsoft that are open source, simply because they don't want to lose. Because if they did just lock themselves in that closet, like they did for the majority of their beginnings, they would lose, and lose badly. Probably the biggest reason why Google got so big isn't because they made everything, but because they had the open source community make a good amount of it and they tweak it for internal purposes of spying etc.

The thing is, whatever solution we come up with, if it turns out to be pretty solid, most other developers and companies will just copy it. Everyone copies each other and tries to make it their own and act so smart, like they invented it, while some people working on these open source projects can't even get a real job doing this for a company in the real world, because all those people think they're so smart and are so picky and would rather hire someone who can answer all the questions but can't really do anything except what they're told. There's a balance, and what happens is that people find what works and use it. Just because everyone is currently using CAPTCHAs doesn't mean it's the best solution; it just means that most people do as they're told and can't really think for themselves, so they just follow the herd.

Laurelai commented 5 years ago

The fact that this is even a debate is what's bad.

Fwiw there's a lot of nuance being lost here, so perhaps that's why the debate is necessary at the moment.

Not really. IP blocking is a needed tool despite the need also for other tools. It really boils down to this.

ghost commented 5 years ago

@Laurelai, this is close to being an accurate tldr, but you're leaving out the concerns related to the extreme prevalence of NATs, and as @jtracey pointed out earlier a too-cavalier approach to default IP blocking would disproportionately exclude communities with fewer IPv4 addresses.

I'm surely not saying IP blocks aren't a feature to include, but I'm saying it may make more sense for this to be considered as part of a framework for time-limited and partial limiting of network regions as well as longer-term blocks, and any such framework needs to consider the political realities of the modern internet to some degree.

There won't be a completely black-and-white solution to an issue like this where individual freedoms and moderation and technical concerns come together, and the thing perhaps we need to do most as a community right here on this github box is to all find realistic and humane solutions to complex problems that require that we all admit to some amount of nuance.

Laurelai commented 5 years ago

@Laurelai, this is close to being an accurate tldr, but you're leaving out the concerns related to the extreme prevalence of NATs, and as @jtracey pointed out earlier a too-cavalier approach to default IP blocking would disproportionately exclude communities with fewer IPv4 addresses.

I'm surely not saying IP blocks aren't a feature to include, but I'm saying it may make more sense for this to be considered as part of a framework for time-limited and partial limiting of network regions as well as longer-term blocks, and any such framework needs to consider the political realities of the modern internet to some degree.

There won't be a completely black-and-white solution to an issue like this where individual freedoms and moderation and technical concerns come together, and the thing perhaps we need to do most as a community right here on this github box is to all find realistic and humane solutions to complex problems that require that we all admit to some amount of nuance.

I've heard all this. I left it out on purpose because IP blocking is still needed. Make those softer-handed tools too; I've said this as well. Just allow IP blocking too. It's really just that simple. I've repeated what amounts to this statement more than once, and it's very clear that this project is against strong moderation tools.

own3mall commented 5 years ago

Is there a way to get Google Recaptcha V2 integrated with PeerTube? This is what we need when registrations are enabled on PeerTube...

Chocobozzz commented 4 years ago

I added hooks so you can create the plugin that blocks the IP you want: https://github.com/Chocobozzz/PeerTube/commit/6f3fe96f4003fd9ad198cdf0ee5a47b32e9e6568

Laurelai commented 4 years ago

Your answer is seriously "do it yourself"? Really?

rigelk commented 4 years ago

@Laurelai the plugin API is more suitable for functions that might require more flexibility and/or dynamicity (i.e. here use a third party service that maintains an updated list of banned IPs).

Laurelai commented 4 years ago

Which is not what was asked for. What was asked for was a way for site admins to block the IPs of bad actors themselves. Not an API so someone else somewhere can maybe write the thing we want.

Chocobozzz commented 4 years ago

Not an API so someone else somewhere can maybe write the thing we want

Having this issue opened in the peertube repository or on another bug tracker does not change that you require that someone else, somewhere, implements the thing you want.

The good news now is that this person (who may be a PeerTube dev) can do anything they want, and doesn't need PeerTube maintainers to publish and release their work.

rigelk commented 4 years ago

@Chocobozzz fyi I ported the issue to my plugin: https://framagit.org/rigelk/peertube-plugin-glavlit/issues/1