sillsdev / web-languageforge

Language Forge: Online Collaborative Dictionary Building on the Web and Phone.
https://languageforge.org
MIT License

secure mongodb with a password #1787

Closed hahn-kev closed 1 month ago

hahn-kev commented 4 months ago

Currently MongoDB does not require a password. This was OK when the system was on a single machine, but now that it's in k8s, any other service in the cluster has access to the database. We should only allow access to the DB to clients that can provide the password.

There are MONGO_INITDB_ROOT_USERNAME and MONGO_INITDB_ROOT_PASSWORD environment variables. It's not clear to me what will happen to an existing database if you just add these variables. That needs to be tested before we go to production; we don't want to wipe out the DB (unlikely) just by setting these.

rmunn commented 1 month ago

While MongoDB allows for complex authentication setups with role-based permissions, to make minimal changes to our existing PHP code we'll probably want to create an admin user and have Language Forge authenticate as the admin user. That way we won't have to mess with permissions when new databases are created for new projects, as the admin user will be able to do anything on any database.

rmunn commented 1 month ago

BTW, the answer to what happens on an existing database is that MONGO_INITDB_ROOT_USERNAME and MONGO_INITDB_ROOT_PASSWORD are ignored, as is the entire /docker-entrypoint-initdb.d/ directory. If you're adding auth to an existing database yourself, you're expected to create the admin database beforehand (in theory you could call it something else, but in practice it's probably best to stick with the de facto standard of having your auth database be named admin). You should then create a user as per the instructions above, and give it the root role, which is the equivalent of:

With a root user in place, you'll be able to create other users that "only" have readWriteAnyDatabase and userAdminAnyDatabase, and use those for normal LF access. (Not granting dbAdmin and clusterAdmin means that such a user won't, by default, be able to do things like dropDatabase. But userAdminAnyDatabase lets that user grant any user, including themselves, the clusterAdmin role if that's ever needed.)

Note: Our code might need to run dropDatabase when we do things like delete projects. If that's the case, we'll want to add clusterAdmin to the roles we give the admin user, as clusterAdmin is the role that includes dropDatabase access.
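As a rough sketch of the less-privileged user described above, the following mongosh-style JavaScript builds the user document and adds clusterAdmin only when dropDatabase is needed. The helper name buildAppUser and the user name lf-app are hypothetical, not taken from the Language Forge code. The createUser call only runs when a mongosh `db` global is present.

```javascript
// Hypothetical helper (not from the LF codebase): builds the document
// passed to db.createUser() for a non-root Language Forge user.
function buildAppUser(name, password, needsDropDatabase) {
  const roles = [
    { role: "readWriteAnyDatabase", db: "admin" },
    { role: "userAdminAnyDatabase", db: "admin" },
  ];
  if (needsDropDatabase) {
    // clusterAdmin is the role that includes the dropDatabase action
    roles.push({ role: "clusterAdmin", db: "admin" });
  }
  return { user: name, pwd: password, roles: roles };
}

// Only executes inside mongosh, where `db` is a global.
// "lf-app" and "CHANGE_ME" are placeholders.
if (typeof db !== "undefined") {
  db.getSiblingDB("admin").createUser(buildAppUser("lf-app", "CHANGE_ME", false));
}
```

Passing true as the third argument gives the same three roles used for the admin user later in this thread.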

rmunn commented 1 month ago

Before adding auth, we'll want to go to our Language Forge deployment environments and create Kubernetes secrets as follows in each environment:

Then connect to Mongo in each environment (e.g.

use admin;
db.createUser(
  {
    user: "admin", // Or value of MONGODB_USER if we decide it should be different
    pwd: "CHANGE_ME", // Value of MONGODB_PASS
    roles: [
      { role: "userAdminAnyDatabase", db: "admin" },
      { role: "readWriteAnyDatabase", db: "admin" },
      { role: "clusterAdmin", db: "admin" }
    ]
  }
);

UPDATE: We'll also want to create a lexbox user with the role "readAnyDatabase", so that LexBox can make read-only queries safely. See https://github.com/sillsdev/languageforge-lexbox/issues/812 to track that. At some point I'll update the suggested Mongo query above to add that user as well.
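A minimal sketch of what creating that read-only user might look like, in the same mongosh style as the query above. The user name lexbox and the placeholder password are assumptions; the only detail taken from this thread is the readAnyDatabase role.

```javascript
// Assumed user document for the read-only LexBox user.
const lexboxUser = {
  user: "lexbox",   // assumed name; use whatever LexBox is configured with
  pwd: "CHANGE_ME", // placeholder, replace with the real secret value
  roles: [
    { role: "readAnyDatabase", db: "admin" }
  ]
};

// Only executes inside mongosh, where `db` is a global.
if (typeof db !== "undefined") {
  db.getSiblingDB("admin").createUser(lexboxUser);
}
```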

I've been able to confirm that creating the admin database does not turn on authentication automatically; you need to edit the MongoDB config to turn auth on. So this is safe to do beforehand; then once that's in place, we can deploy a change to the PHP code that uses the MONGODB_USER and MONGODB_PASS variables when connecting to Mongo. Then we'll be all set to enable auth, and provide those k8s secrets to the PHP container (and LfMerge, and Lexbox, and...) as env vars.
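To illustrate how a client could use those variables, here is a sketch of building a MongoDB connection string from them. It is written in JavaScript rather than PHP and is not the actual Language Forge change. The function name buildMongoUri and the host db:27017 are hypothetical, and falling back to an unauthenticated URI when the variables are unset is an assumption that suits the staged rollout described above.

```javascript
// Hypothetical helper: builds a connection string from an env-like object.
function buildMongoUri(env, host) {
  const user = env.MONGODB_USER;
  const pass = env.MONGODB_PASS;
  if (!user || !pass) {
    // Assumption: no credentials configured yet, connect without auth
    return "mongodb://" + host + "/";
  }
  // Default the auth database to "admin" when MONGODB_AUTHSOURCE is unset
  const authSource = env.MONGODB_AUTHSOURCE || "admin";
  // User name and password must be percent-encoded inside a MongoDB URI
  return "mongodb://" + encodeURIComponent(user) + ":" +
    encodeURIComponent(pass) + "@" + host +
    "/?authSource=" + encodeURIComponent(authSource);
}

// "db:27017" is a placeholder host, not the real service name
const uri = buildMongoUri(
  { MONGODB_USER: "admin", MONGODB_PASS: "p@ss/word" },
  "db:27017"
);
console.log(uri);
```

The percent-encoding matters because characters such as @ and / in a password would otherwise be parsed as part of the URI structure.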

rmunn commented 1 month ago

BTW, the answer to what happens on an existing database is that MONGO_INITDB_ROOT_USERNAME and MONGO_INITDB_ROOT_PASSWORD are ignored...

Oddly enough, this isn't quite true. The Docker image docs say "Do note that none of the variables below will have any effect if you start the container with a data directory that already contains a database", but in fact MONGO_INITDB_ROOT_USERNAME and MONGO_INITDB_ROOT_PASSWORD do have an effect. Their values are ignored if you have an existing database, but their mere presence is enough to add the --auth parameter to the invocation of mongod. Here's the source line that does this:

https://www.github.com/docker-library/mongo/blob/756665118c57b14ae264e7a6431acf221a5b8a38/6.0/docker-entrypoint.sh#L258

So once we have created our admin user, we'll be able to enable auth on the MongoDB pod by simply adding the MONGO_INITDB_ROOT_USERNAME and MONGO_INITDB_ROOT_PASSWORD environment variables to it. The values themselves don't matter, though we might as well pull them from the mongo-auth secret the same way we pull the MONGODB_USER and MONGODB_PASS values.
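The behavior described here can be modeled roughly as follows. This is a simplified JavaScript model, not the entrypoint script itself (which is a shell script). The function name mongodArgs is hypothetical, and treating "present" as "both variables non-empty" is an assumption about how the script tests them.

```javascript
// Simplified model: decide whether --auth gets added to the mongod
// arguments, based only on whether both root variables are set.
function mongodArgs(env, args) {
  // Assumption: an empty string counts as "not set"
  const hasRootCreds =
    Boolean(env.MONGO_INITDB_ROOT_USERNAME) &&
    Boolean(env.MONGO_INITDB_ROOT_PASSWORD);
  if (hasRootCreds && !args.includes("--auth")) {
    return args.concat(["--auth"]);
  }
  // Return a copy so callers never mutate the input array
  return args.slice();
}

// The credential values are never inspected, only their presence
console.log(mongodArgs(
  { MONGO_INITDB_ROOT_USERNAME: "anything", MONGO_INITDB_ROOT_PASSWORD: "anything" },
  ["mongod"]
));
console.log(mongodArgs({}, ["mongod"]));
```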

myieye commented 1 month ago

@rmunn Could you maybe create the appropriate issue in lexbox for authenticating? It might be nice to have our own user for Lexbox, because all we do is read.

rmunn commented 1 month ago

@myieye - Done, https://github.com/sillsdev/languageforge-lexbox/issues/812. And I updated https://github.com/sillsdev/web-languageforge/issues/1787#issuecomment-2119674540 to note the need to create two users, an admin user that can do anything and a lexbox user that has read-only access to all databases.

rmunn commented 1 month ago

I've created k8s secrets in LF staging and prod (called mongo-auth in both) containing the MONGODB_AUTHSOURCE, MONGODB_USER, and MONGODB_PASS values, and I've created a database table in both environments containing those values as well. It should be safe to merge this now. And since it's been approved, I'll merge it.