matrix-org / matrix.org

matrix.org public website

Repost of the security issues #371

Closed: yangm97 closed this 4 years ago

yangm97 commented 5 years ago

Earlier today the attacker posted some insightful issues, but since GitHub has suspended their account, those are now gone. This is a repost.

GitHub issues of matrix.org pieced together as one "story":

I noticed in your blog post that you were talking about doing a postmortem and steps you need to take. As someone who is intimately familiar with your entire infrastructure, I thought I could help you out.

Complete compromise could have been avoided if developers had been prohibited from setting ForwardAgent yes in their SSH configs or from passing -A on their SSH commands. The flaws with agent forwarding are well documented.
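
A minimal sketch of what I mean (standard OpenSSH directives, not your actual config):

```
# Client side (~/.ssh/config): never forward the agent by default.
Host *
    ForwardAgent no

# Server side (/etc/ssh/sshd_config): refuse forwarded agents even
# when a client insists with -A.
AllowAgentForwarding no
```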

Escalation could have been avoided if developers only had the access they absolutely required and did not have root access to all of the servers. I would like to take a moment to thank whichever developer forwarded their agent to Flywheel. Without you, none of this would have been possible.

Once I was in the network, a copy of your wiki really helped me out, and I found that someone was forwarding port 22226 to Flywheel. With jenkins access, this allowed me to add my own key to the host and make myself at home. There appeared to be no legitimate reason for this port forward, especially since jenkinstunnel was already being used to establish the communication between Themis and Flywheel.
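
To illustrate (a hedged sshd_config sketch; the Match block for a dedicated tunnel account is only an example):

```
# /etc/ssh/sshd_config: disable TCP forwarding by default so an
# unnoticed -L/-R tunnel cannot quietly re-expose internal ports.
AllowTcpForwarding no
GatewayPorts no

# Re-enable it only where a tunnel is genuinely required, e.g. for a
# dedicated account such as jenkinstunnel.
Match User jenkinstunnel
    AllowTcpForwarding yes
```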

I was able to log in to all servers via an internet address. There should be no good reason to have your management ports exposed to the entire internet. Consider restricting access to production to either a VPN or a bastion host.
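
For the bastion variant, one sketch (hostnames made up) is OpenSSH's ProxyJump, which also removes the temptation to forward agents:

```
# ~/.ssh/config on a developer machine: production is reachable only
# through the bastion, and the agent never leaves the laptop.
Host bastion
    HostName bastion.example.com
    User alice

Host *.prod.example.com
    ProxyJump bastion
```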

On each host, I tried to avoid writing directly to authorized_keys, because after a thorough peek at matrix-ansible-private I realized that access could have been removed any time an employee added a new key or did something else to redeploy users. But sshd_config allowed me to keep keys in authorized_keys2 and not have to worry about ansible locking me out.
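
The corresponding fix is a one-line change (standard sshd directive; older defaults also honoured authorized_keys2):

```
# /etc/ssh/sshd_config: honour only the key file that ansible manages.
AuthorizedKeysFile .ssh/authorized_keys
```

A one-off `find / -name authorized_keys2 2>/dev/null` will flag any leftovers.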

The internal-config repository contained sensitive data, and the whole repository was often cloned onto hosts and left there for long periods of time, even if most of the configs were not used on that host. Hosts should only have the configs necessary for them to function, and nothing else.

Kudos on using Passbolt. Things could have gotten real messy, otherwise.

Let's face it, I'm not a very sophisticated attacker. There was no crazy malware or rootkits. It was ssh agent forwarding and authorized_keys2, through and through. Well okay, and that jenkins 0ld-day. This could have been detected with better monitoring of log files and alerting on anomalous behavior. The compromise began well over a month ago; consider deploying an Elastic stack and collecting logs centrally for your production environment.
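
As one illustration of that kind of alerting (an auditd sketch; paths and rule keys are examples, and the audit log still needs shipping somewhere central):

```
# Flag writes and attribute changes to SSH key material and config.
auditctl -w /root/.ssh/ -p wa -k ssh-keys
auditctl -w /etc/ssh/sshd_config -p wa -k sshd-config

# Review anything the rules tagged.
ausearch -k ssh-keys
```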

There I was, just going about my business, looking for ways I could get higher levels of access and explore your network more, when I stumbled across GPG keys that were used for signing your debian packages. It gave me many nefarious ideas. I would recommend that you don't keep any signing keys on production hosts, and instead do all of your signing in a secure environment.
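
In practice that means the secret key lives only on an offline signing machine, and production only ever sees signatures and the public half; a sketch with illustrative filenames:

```
# On the offline signing machine: produce a detached signature.
gpg --armor --detach-sign Release    # writes Release.asc

# Export only the public key for the repo and its consumers.
gpg --armor --export packages@example.com > signing-key.asc
```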

You thought there were 8, but now there are 9 (that's right, I see you watching me, I'm watching you, too). This is the last one, and I think it's the best advice I've got for you.

2FA is often touted as one of the best steps you can take for securing your servers, and for good reason! If you'd deployed Google's free authenticator module (sudo apt install libpam-google-authenticator), I would never have been able to ssh into any of those servers.
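
For reference, a minimal sketch of wiring that module into sshd (each user enrols once by running google-authenticator; these are standard OpenSSH/PAM directives):

```
# /etc/pam.d/sshd: verify a TOTP code during login.
auth required pam_google_authenticator.so

# /etc/ssh/sshd_config: require the key AND the code.
UsePAM yes
ChallengeResponseAuthentication yes
AuthenticationMethods publickey,keyboard-interactive
```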

Alternatively, for extra security, you could require yubikeys to access production infrastructure. Yubikeys are cool. Just make sure you don't leave one plugged in all the time; your hardware token doesn't do as much for you when it's always plugged in and ready for me to use.

Alternate-Alternatively, if you had used a 2FA solution like Duo, you could have gotten a push notification the first time I tried to ssh to any of your hosts, and you would have caught me on day one. I'm sure you can set up push notifications for watching google-authenticator attempts as well, which would at least have given you a heads-up that something fishy was going on.
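
A hedged sketch of the Duo variant (pam_duo; the ikey/skey/host values are placeholders you'd get from the Duo admin panel):

```
# /etc/pam.d/sshd: require a Duo push approval.
auth required pam_duo.so

# /etc/duo/pam_duo.conf (placeholder values):
[duo]
ikey = YOUR_INTEGRATION_KEY
skey = YOUR_SECRET_KEY
host = api-XXXXXXXX.duosecurity.com
```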

Anyways, that's all for now. I hope this series of issues has given you some good ideas for how to prevent this level of compromise in the future. Security doesn't work retroactively, but I believe in you and I think you'll come back from this even stronger than before.

Or at least, I hope so -- My own information is in this user table... jk, I use EFNet.

[Nine screenshots of the original issues, captured 2019-04-12 13:58:02 through 13:59:30]

EDIT: Add archive.org links:

ketzacoatl commented 5 years ago

Thank you for making this archive; every organization can learn from this event and the statements shared with us. I'm not sure we would have seen or heard all of these details otherwise, but with these disclosures we're all given an opportunity to be honest with ourselves, review our best practices, and improve what we know should have been changed a while ago.

Get help with those updates or changes if you need it; don't ignore the issues. Those nagging voices are there for a reason.

Kudos to the devs and sysadmins hard at work getting things back in order. Our thoughts are with you; it definitely isn't an experience any of us wants.

But let us also realize: the times have seriously changed. We all need to up our game, significantly. If you aren't already thinking these thoughts, please reconsider your position of comfort before all your base are pwned.

BloodyIron commented 5 years ago

I'd like to point out that, in my view, the Matrix.org team does not value security as highly as it should: https://github.com/matrix-org/synapse/issues/4158

I know they are busting their asses and trying to do as much as they can, including on this and other security work, but when security issues like the one above are raised and get no traction for months, it worries me.

I suspect this mentality is what led to the original breach, as it sounds like nobody is doing any security auditing and asking "hey, why are we doing it this way? it's insecure". But I'm an outsider, and I can't be 100% sure.

I am a big fan of Matrix.org and Riot.im and all those people behind it. I want them to learn from this and hopefully plug some other serious security issues going on, because the world NEEDS Matrix.org and Riot.im. And if they don't learn from this, well that's just a modern tragedy.

hook-s3c commented 5 years ago

Did the attacker tamper with the JavaScript of the web app? I've asked in the main chat, but would like an official statement on this. Tampering there would have made it possible to compromise encrypted messaging.

ara4n commented 5 years ago

they did not, based on everything we have seen so far in analysing their actions.

luckydonald commented 5 years ago

Could you use markdown quotes instead of code blocks for the quoting? On mobile, code blocks don't get automatic line breaks, which means a lot of sideways scrolling.

Edit: Thankies.

ara4n commented 5 years ago

(i've edited the original post as per above)

ilu33 commented 5 years ago

Were any of the identity servers affected? I can't find anything about the vector.im infrastructure. Is this the right place to ask?

And thank you for being transparent about the issues.

nektro commented 5 years ago

@ilu33 https://matrix.org/blog/2019/04/11/security-incident/

> Identity server data does not appear to have been compromised

emdete commented 5 years ago

can you please give more information:

thank you!

AfroThundr3007730 commented 5 years ago

> can you please give more information:

@emdete All of that is covered in the blog post linked above.

ara4n commented 4 years ago

Everything in this thread (and more) was resolved months ago, as per the plan at https://matrix.org/blog/2019/05/08/post-mortem-and-remediations-for-apr-11-security-incident/. So, I'm closing this off.