fedora-infra / noggin

Self-service user portal for open-source communities to use over FreeIPA.
MIT License

As a community member (in any capacity), I want spammers to be curtailed ASAP, so that they don’t annoy me. #27

Closed sfinn85 closed 1 month ago

puiterwijk commented 4 years ago

I honestly don't think that this is a community member question, but more an operator feature, but that's nitpicking.

relrod commented 4 years ago

(This will require some changes to Basset upstream.)

sfinn85 commented 4 years ago

Sprint planning 3 meeting:

How are they being spammed? More context needed @relrod

abompard commented 4 years ago

@relrod, do you know what kind of changes will be required in Basset? How should we integrate with it?

abompard commented 4 years ago

Hey folks. This has been open since the very beginning and it still seems unclear what needs to be done. According to this email thread, Basset needs some work to stay up-to-date with spammer practices, and the team is considering configuring it to not flag accounts as spam.

If we want to integrate the new AAA system with Basset, we need to know how this can be done, and how we should react to Basset's response.

According to the process described in our documentation, Basset's integration in the registration process is pretty deep, and we would need to make quite a few changes to replicate it. There would also be security questions to handle, because at the moment Basset itself connects to FAS to set the spam status. With Basset's architecture, we would thus need to write a Basset plugin that would connect to FreeIPA with appropriate permissions and do the equivalent action (moving the user from the stage users category to the active users category). One of the problems we would face is that FreeIPA does not allow a stage user to change their password, so this spam check can't be done at the end of the registration process. One way of doing it would be, for example, to have the following registration process:

  1. User registers on the front page
  2. Noggin generates a token
  3. Instead of sending the token by email for validation, it hands over the token (or rather the full link) to Basset
  4. Basset does the spam verification, and if the user is accepted it sends the validation email.
  5. User receives the email, clicks on the link and the registration proceeds as before from there.

However, in this process I don't see how we could handle the "manual" spam check status. Noggin has no database, so it would have to be a field set in IPA that admins would go to and somehow check. But what would be the process from there on? The user still hasn't received the validation email and Basset will not know that the admin has approved the user.

Another option would be to introduce the Basset call on the page that the user gets to after having clicked on the email validation link. We would halt the registration process until Basset has set the status flag for the stage user in FreeIPA. But there again, if the spam check result is "manual", it could take hours or days until an admin approves, and the token link will have expired. We could have the page renew it regularly, but we can't really expect new users to stay on that page waiting for it to refresh. And that means another async roundtrip to email.

We have quite a few questions, and changing the current workflow to integrate Basset is a significant change that would also require us to write a Basset plugin.

Are we sure that we want to do that? If so, we'll need assistance from someone who knows how Basset works and how we could integrate with it. Otherwise we'll probably have to drop this feature (maybe in favor of a captcha? would it be sufficient?)

smooge commented 4 years ago

The problem that Basset was designed to deal with was that there are multiple 'companies' who just hire people to solve captchas for you. When we were getting inundated with spam on the wiki and in other places, it was because there were about 40 people creating accounts over a 10 'work' hour day and then handing those accounts to another team, which fed them into several robots that posted spam wherever the account was allowed in. While the people teams were able to get around captchas, they usually followed certain patterns which Basset could match:

  1. They used temp email services (there are various websites that will give you an email account for 10 minutes). These are supposedly meant to deal with spam ("I just need to buy this one thing and I don't want my info sold to spammers") but are mostly used by spammers to create accounts, since the account only needs to be good for the initial email with a password or some other link.
  2. They used patterns for account names. gilgamesh001 gilgamesh002, giglamesh003 (they were doing this by hand or their account prog threw those in to try and be smart because spelling issues like that would show up)
  3. They would use the same network space, usually the same network ASN.

Basset would take these and some other items and 'judge' spam potential which cut down 1000+ accounts per day to 10 or so when basset was going. Whatever is done, there needs to be a way that accounts can be rapidly enabled or disabled by admins somewhere in the system.
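To make the three signals above concrete, here is a minimal sketch of how they might be combined into a rough score. It is not Basset's actual logic: the domain list, the "letters followed by digits" rule, the ASN threshold and the name `spam_score` are all hypothetical, chosen only to illustrate the idea.

```python
import re
from collections import Counter

# Hypothetical throwaway-mail domains; not a real list.
DISPOSABLE_DOMAINS = {"tempmail.example", "10minutes.example"}

# Matches names such as "gilgamesh001": letters followed by two or more digits.
SERIAL_NAME = re.compile(r"^[a-z]+\d{2,}$")

# Assumed threshold: more than this many recent sign-ups from one ASN is suspicious.
ASN_THRESHOLD = 5


def spam_score(username, email, asn, recent_asns):
    """Return a score from 0 to 3; higher means more spam-like.

    recent_asns is a Counter of the ASNs seen in recent registrations.
    """
    score = 0
    domain = email.rsplit("@", 1)[-1].lower()
    if domain in DISPOSABLE_DOMAINS:
        score += 1
    if SERIAL_NAME.match(username.lower()):
        score += 1
    if recent_asns[asn] > ASN_THRESHOLD:
        score += 1
    return score
```

A real system would weigh many more inputs, and legitimate users can trip any single signal (plenty of real usernames end in digits), which is one reason a quick admin override matters.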

abompard commented 4 years ago

Very interesting, thanks Smooge.

I think the ipa user-disable <username> command can be used to quickly disable a user (there's also an equivalent in the IPA UI). Do you mean it would be OK to disable accounts identified as spam after the fact? That would be a simpler way forward for us: there could be a fedora-messaging listener that reacts to account creation messages, asks Basset to verify the account, and then Basset would lock the account in IPA if the user is found spammish. We would still need to write the listener, the Basset plugin, and display the information appropriately in Noggin so that users know what just happened, but at least it would not be inserted in the middle of the registration process. The downside is that accounts will be active until found spammish. How is the "manual" result handled today? Is the user considered active or inactive until a manual decision has been made?
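As a rough illustration of the "check after the fact" idea, the handler below sketches what such a listener could do for each account-creation message. Everything here is an assumption: the message shape, the decision labels, and the two injected callables, which stand in for the Basset request and for the equivalent of ipa user-disable. In the design described above Basset itself would do the locking; the sketch folds both steps into one function to keep it short.

```python
def handle_account_created(body, check_spam, disable_user, lock_on=("denied",)):
    """React to one account-creation message and return the spam decision.

    body: dict with a hypothetical shape {"username": ..., "email": ...}.
    check_spam: callable(username, email) returning "ok", "manual" or "denied";
        a stand-in for the call to Basset.
    disable_user: callable(username); a stand-in for `ipa user-disable`.
    lock_on: decisions that lock the account. Whether "manual" belongs here
        is the open question raised above, so it is left configurable.
    """
    decision = check_spam(body["username"], body["email"])
    if decision in lock_on:
        # Until this point the account was fully active.
        disable_user(body["username"])
    return decision
```

Passing the two operations in as callables keeps the handler testable without a message broker or an IPA server.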

abompard commented 4 years ago

This is the registration process at the moment:

  1. New user enters desired username and email address
  2. Noggin creates a stage-user and generates a token. A stage user is not considered valid for authentication and can't have a password.
  3. Noggin emails token to user's address
  4. User clicks on token link
  5. User enters desired password
  6. Noggin activates the stage-user and sets the password. The user is now considered valid and can log in to any app.
  7. User is redirected to the Noggin front page and can log in to change their settings or view groups.

I'd like to discuss and brainstorm where Basset could fit, especially with the "manual" outcome where an admin has to manually approve a new user.

nirik commented 4 years ago

Yeah, this is not easy. :(

Does noggin allow any email address? Or could we have a blocklist of domains?

Can we deactivate accounts that are spamming I trust?

I wonder if a limit on token processing would be useful. Ie, if 5 people are loading the 'click on token' link, tell the 6th-1000th that too many people are registering right now, please try again later? Or just throttle the number and sending time for the email with the token so only a few at a time would get it?

Or putting some browser intensive work on that link so their browser has to compute something before the token/set password screen?

Of course those would only slow spammers down.

I'm actually not a fan of the 'manual' outcome in basset anyhow, so I don't care that we preserve that, but there will always be people caught by spam measures that want their account activated/unmarked as spammers after the fact. ;(

Our most exposed sites here are the wiki and pagure. Basset also watched those and when it saw spam would block the users (and then add those users' info to its 'spam' db).

abompard commented 4 years ago

Does noggin allow any email address? Or could we have a blocklist of domains?

We can totally have a domain blocklist, however it has to be in the configuration file because Noggin has no database (all the data is in IPA) and I don't believe IPA would have a way to store that. So changes to that list would require a restart of the pods.
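A configuration-based blocklist could be as small as the sketch below. The setting name `BLOCKED_EMAIL_DOMAINS` and the function name are hypothetical, not Noggin's actual configuration; the only design decision shown is that a blocked domain also blocks its subdomains.

```python
# Hypothetical configuration value; the real setting name may differ.
CONFIG = {"BLOCKED_EMAIL_DOMAINS": ["tempmail.example", "spam.example"]}


def is_email_blocked(email, config=CONFIG):
    """Return True if the address's domain, or a parent domain, is blocklisted."""
    domain = email.rsplit("@", 1)[-1].lower().rstrip(".")
    blocked = {d.lower() for d in config.get("BLOCKED_EMAIL_DOMAINS", [])}
    # For "mail.spam.example", check "mail.spam.example", "spam.example", "example".
    parts = domain.split(".")
    return any(".".join(parts[i:]) in blocked for i in range(len(parts)))
```

Because the list is read from configuration, updating it means redeploying, as noted above.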

Can we deactivate accounts that are spamming I trust?

Yes, that's built into IPA

I wonder if a limit on token processing would be useful. Ie, if 5 people are loading the 'click on token' link, tell the 6th-1000th that too many people are registering right now, please try again later? Or just throttle the number and sending time for the email with the token so only a few at a time would get it?

There would be a risk of cutting normal users out of the registration process during a spam attack, no?
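For reference, the kind of throttle being discussed could look roughly like this sliding-window limiter. It is a sketch under assumptions: the class name and parameters are made up, and since it keeps its state in memory it would only limit a single process, not a set of pods.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_events within any window of `window` seconds."""

    def __init__(self, max_events, window, clock=time.monotonic):
        self.max_events = max_events
        self.window = window
        self.clock = clock  # injectable so the behaviour can be tested
        self._events = deque()

    def allow(self):
        """Record and permit one event, or return False if the window is full."""
        now = self.clock()
        # Forget events that have left the window.
        while self._events and now - self._events[0] >= self.window:
            self._events.popleft()
        if len(self._events) >= self.max_events:
            return False
        self._events.append(now)
        return True
```

The sketch also makes the risk concrete: the limiter cannot tell a legitimate registrant from a spammer, so once the window is full everyone is turned away.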

Or putting some browser intensive work on that link so their browser has to compute something before the token/set password screen?

Oh, is this an effective spam mitigation technique? I don't think I've ever seen that (or I unknowingly did and blamed my browser slowdown on Firefox ;-) ). Plus I think that what Smooge was warning against is companies paying actual users to register on your website (and solve captchas while at it), so I don't think that would hinder them.
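The technique being asked about is usually called proof of work: the server hands out a challenge, and the client must find a number that makes a hash of the two start with enough zero bits. Below is a minimal hashcash-style sketch; the function names and the challenge format are invented, and in a real deployment the solver would run as JavaScript in the browser rather than in Python.

```python
import hashlib
import itertools


def leading_zero_bits(digest):
    """Count the leading zero bits of a bytes digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def verify(challenge, nonce, difficulty):
    """Server side: a single hash, so checking is cheap."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge, difficulty):
    """Client side: brute-force search, about 2**difficulty hashes on average."""
    for nonce in itertools.count():
        if verify(challenge, nonce, difficulty):
            return nonce
```

This raises the cost of automated bulk registration, but it costs a paid human sitting at a browser only a short wait, which matches the concern above.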

I'm actually not a fan of the 'manual' outcome in basset anyhow, so I don't care that we preserve that, but there will always be people caught by spam measures that want their account activated/unmarked as spammers after the fact. ;(

OK, so we should let them complete the registration process but lock them out of other applications maybe? And warn them that they have been identified as spammers so they can know what's going on and complain if it's wrong?

abompard commented 4 years ago

OK, after thinking about it for a couple of days and considering different options with their pros and cons, here's what I think we could do:

What do you think?

pypingou commented 4 years ago

If not, we'll have to write a small webapp for admins only that will list accounts pending manual approval and send emails when admins activate them. It's yet another webapp but it should be a small one, only for admins, and only a couple endpoints.

Can't this be part of noggin itself? It potentially doesn't need a UI, a CLI may be enough

abompard commented 4 years ago

Can't this be part of noggin itself? It potentially doesn't need a UI, a CLI may be enough

I think it's potentially dangerous to mix administrative features into the user self-service portal just for this small thing.

Of course it could also be entirely done with a script, but since it's probably already a painful process to approve those accounts I think we can make the effort of writing a web app. I think a script will be harder to use, it has to:

With a CLI that would be 3 separate commands that admins will need to remember, with a web UI we would have hyperlinks and buttons to make this more discoverable. But I'm not going to be the one using it so I'll do what admins prefer.

nirik commented 4 years ago

So, a few things/questions here...

If we do this plan, does IPA expose all those users? ie, would someone be able to make an account, have it in spamcheck_manual or spamcheck: failed and still be able to use their account via ipsilon? (kinit, openidc, etc)

In the past, spamcheck denied or spamcheck manual users would just mail and say "I am not spam" and we would activate them/mark them as spamcheck ok, with the idea that anyone able to request that via email is a person. So, there's likely a bunch of people in spamcheck_manual that never bothered to mail us and are still in that state. ;( If we do this with this setup, we should probably use a dedicated email, ie on that page that tells them they are in spamcheck_manual or spamcheck_denied say "if you feel you reached this in error, mail spamcheck@fedoraproject.org and explain that you are not spam"

So, this plan also assumes we bring back basset and interface it. Which we can do, but if we do this we need to try and bring it into our org and have someone learn about it and how to at least do releases, etc. :)

abompard commented 4 years ago

If we do this plan, does IPA expose all those users? ie, would someone be able to make an account, have it in spamcheck_manual or spamcheck: failed and still be able to use their account via ipsilon? (kinit, openidc, etc)

What I had in mind was to hide those users from fasjson and Ipsilon. I don't know whether IPA can be configured to prevent them from logging in via Kerberos, maybe @tiran would know?

In the past, spamcheck denied or spamcheck manual users would just mail and say "I am not spam" and we would activate them/mark them as spamcheck ok, with the idea that anyone able to request that via email is a person. So, there's likely a bunch of people in spamcheck_manual that never bothered to mail us and are still in that state. ;( If we do this with this setup, we should probably use a dedicated email, ie on that page that tells them they are in spamcheck_manual or spamcheck_denied say "if you feel you reached this in error, mail spamcheck@fedoraproject.org and explain that you are not spam"

Will do.

So, this plan also assumes we bring back basset and interface it. Which we can do, but if we do this we need to try and bring it into our org and have someone learn about it and how to at least do releases, etc. :)

Since we need the service it provides, unless we have something better we'll have to bring it back and maintain it, yeah. I've sent a pull request to migrate it to Python 3 and I have started writing a plugin for noggin. If we pick it up there's going to be some work to do to make it easier to hack on (vagrant, linting, unit tests, etc). Also, it's storing its messages and decisions in MongoDB, that's a bit unfortunate, maybe in the long run we can migrate it to PostgreSQL, make it use our RabbitMQ cluster for message queuing, and run it in Openshift? If we pick it up we're picking a pack of technical debt with it but I don't see another way.

tiran commented 4 years ago

If we do this plan, does IPA expose all those users? ie, would someone be able to make an account, have it in spamcheck_manual or spamcheck: failed and still be able to use their account via ipsilon? (kinit, openidc, etc)

What I had in mind was to hide those users from fasjson and Ipsilon. I don't know whether IPA can be configured to prevent them from logging in via Kerberos, maybe @tiran would know?

You can lock/disable a user with user-disable command. The command sets the nsAccountLock attribute and prevents LDAP bind and kinit.

abompard commented 4 years ago

OK, we still need to differentiate between users locked for spam reasons and users locked for other reasons but we can definitely set that field at the same time we set the other. Thanks.

tiran commented 4 years ago

I suggest a multi-valued attribute to track why a user has been locked. IIRC the user may also get locked by password policy plugin (too many failed logins).
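The point of a multi-valued attribute is that an account stays locked as long as at least one reason remains, so clearing the spam flag does not accidentally unlock an account that is also locked for something else. A minimal sketch of that bookkeeping follows; the reason labels and function names are hypothetical, and writing the result back to the directory is left out.

```python
def add_lock_reason(reasons, reason):
    """Return (updated reasons, should_be_locked) after adding a reason."""
    updated = set(reasons) | {reason}
    return updated, True


def remove_lock_reason(reasons, reason):
    """Return (remaining reasons, should_be_locked) after clearing a reason."""
    remaining = set(reasons) - {reason}
    # Only unlock once no reason is left.
    return remaining, bool(remaining)
```

For example, an account locked for both "spam" and "password-policy" stays locked after an admin clears "spam".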

abompard commented 4 years ago

Good point, should we make a new attribute or switch fasStatusNote to multi-valued? We're not using that field currently, as far as I can tell.

Also, I was hoping we could tell IPA to look at another field in addition to nsAccountLock for Kerberos authentication, because we've had a feature request from a user who would like to be able to self-lock their account (for example during vacation) and self-enable it again when they get back. I thought of using the fasStatusNote field for that if we get around to implementing this feature. Do you have any advice on that? Is there an IPA-supported way for users to lock their own account and unlock it later?

tiran commented 4 years ago

Not easily, you would have to write a 389-DS plugin in C and extend the KDC plugins in C to hook up an additional attribute.

abompard commented 4 years ago

Ouch, yeah let's avoid that.

abompard commented 4 years ago

Alright, there's a fatal flaw in this design: when a user tries to log in with a locked account, there's no way to tell the user why the account is locked (be it spam or otherwise). I guess I'm back to the drawing board, here's another way we could integrate Basset:

  1. New user enters desired username, first name, last name, and email address
  2. Noggin creates a stage-user (a stage user is not considered valid for authentication and can't have a password) and asks Basset to check it for spam likelihood. It displays a waiting page to the user telling them their account is being checked
  3. Basset calls back with the check decision. If it's manual or denied, the waiting page displays the appropriate message with a link to email admins on a specific address.
  4. If it's positive, Noggin creates a token and emails the token to user's address. The rest does not change.
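Steps 3 and 4 above boil down to a small dispatch on Basset's decision. The sketch below assumes the decision arrives as one of three strings and that sending the token email is available as a callable; the function name, the decision labels and the message wording are placeholders rather than Noggin's actual code.

```python
CONTACT = "If you think this is a mistake, please email the admins."


def on_spam_check_result(username, decision, send_token_email):
    """Handle Basset's callback and return the text for the waiting page."""
    if decision == "ok":
        # Step 4: from here on the existing registration flow is unchanged.
        send_token_email(username)
        return "Check your inbox for the validation email."
    if decision == "manual":
        return "Your account is awaiting manual review. " + CONTACT
    if decision == "denied":
        return "Your account was flagged as spam. " + CONTACT
    raise ValueError(f"unknown decision: {decision!r}")
```

Unlike the earlier design, the user is told why registration stopped, and no token is issued until the check has passed.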

What do you think?

nirik commented 3 years ago

That sounds workable. I am not sure what all inputs basset was considering when interfacing with fas... but we could start with this and try and add things if needed.

abompard commented 3 years ago

I'm not sure it's been captured here so I'll add a comment so we don't forget: setting to non-spam a user account that has been wrongly detected as spam is not a simple IPA operation that admins could do on the command line or in the IPA UI, since it needs to pick up the registration process where it was left off, namely by sending the address validation email. So we need to write a tool for admins to check accounts detected as spam and un-spam them. It could be a CLI but I think a web UI would be useful since they could go from one account to the next, review account details and click a "non-spam" button. A simple but useful UI.

On a side note: since it is not the user that initiates the "not a spammer" action, we can't expect them to be waiting for an email, so we can't use the current validation token that Noggin sends because it expires after 30 minutes and that's too short. So I think that this new UI for admins should send an email that would link the user to a Noggin page allowing them to pick up the registration process with Noggin. Something like "Your account had been initially flagged as spam. We have approved it now, please click here to continue the account registration process". I think it'll address this expiration issue and also avoid code duplication because all the email validation code would stay in Noggin.
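One way to build such a "continue your registration" link is a signed value with a much longer lifetime than the 30-minute validation token. The sketch below uses only the standard library and is purely illustrative: how Noggin actually builds its tokens is not described here, and the secret, the one-week lifetime and the token layout are assumptions.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"  # hypothetical; a real deployment would load this from config
RESUME_TTL = 7 * 24 * 3600  # assumed lifetime: one week instead of 30 minutes


def _sign(payload):
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def make_resume_token(username, now=None):
    """Return a token of the form 'username:expiry:signature'."""
    issued = int(now if now is not None else time.time())
    payload = f"{username}:{issued + RESUME_TTL}"
    return f"{payload}:{_sign(payload)}"


def check_resume_token(token, now=None):
    """Return the username if the token is authentic and unexpired, else None."""
    try:
        username, expiry, sig = token.rsplit(":", 2)
        expiry = int(expiry)
    except ValueError:
        return None
    expected = _sign(f"{username}:{expiry}")
    # Constant-time comparison to avoid leaking how much of the signature matched.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    if (now if now is not None else time.time()) > expiry:
        return None
    return username
```

When the user follows the link, Noggin would verify the value and then send the usual short-lived validation email, so the existing validation code stays where it is.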

sfinn85 commented 3 years ago

Thanks for the update on this @abompard and sharing a possible solution here! Keen to hear what others think. Is this something @ryanlerch could help with?