OpenWaterFoundation / owf-app-infomapper-builder-ng

Open Water Foundation web application to build an InfoMapper configuration
GNU General Public License v3.0

Cognito - Implement AWS Cognito for authentication #7

Open smalers opened 1 year ago

smalers commented 1 year ago

We need to implement authentication in order to allow certain users to edit InfoMapper configurations. Although our initial goal is public websites, we need to control who creates and edits them, and also who creates Organization and Personal accounts, Users, etc. Below are technical considerations:

  1. Do AWS Organizations get created first and then users and Cognito work with that?
  2. Are users separate from OWF AWS users? It would be best to keep OWF staff separate from external users.
  3. It may be necessary at some point to have private datasets, InfoMapper websites, etc. This is not the immediate goal and our focus is on public data. Private sites could complicate many things.
  4. The mechanics of logging into Cognito for the InfoMapper Builder need to be clean and consistent with other applications, for example remembering the login on the device.
  5. Need to make recommendations for logins, such as using email for the login. This will standardize and simplify the system.
  6. Standard properties like name should be kept internally, and need to be careful with private data such as phone numbers.
  7. Avoid storing payment information

The immediate work is to figure out the mechanics of using Cognito to log in and of working with AWS Organizations.

Nightsphere commented 1 year ago

After research, some of the above points have been dealt with or are no longer necessary.

  1. AWS Organizations is not needed. Its main use is to help large companies ease the load of configuring AWS service authorization for many employees across many departments. The user 'accounts' used by Cognito are not full AWS accounts; they are accounts created in the user pool.
  2. Yes, users are separate from AWS accounts in general. Steve and I (Josh) could have accounts in the user pool with more access attached to them. This could be done in different ways, depending on which option we go with for implementing accounts, which will be discussed below.
  3. Nothing to add here, but will keep in mind when deciding how to design the account handling.
  4. Using the AWS Amplify package has benefits, one of which is built-in, cookie-like persistence of an account's authenticated session. There should be a way to change how this is handled using the Amplify SDK.
  5. When creating a new User Pool, the login identifier is a configuration option. More can be found in the Learn AWS Cognito section I added to. At the moment, an email, a user-defined username, or both can be used. Cognito automatically handles email/username duplication and other checks. A phone number can also be used. All steps for setting up a User Pool can be found in the above Learn link.
  6. Using Cognito, we luckily don't need much more than an email if we want to set up the User Pool that way, and can skip phone numbers completely.
  7. I haven't looked into this much, but going through the User Pool UI, there seem to be some properties that deal with payments.

Multi-tenancy options

Amazon seems to suggest 4 approaches, which I list briefly here. This link covers all 4 in more depth: https://docs.aws.amazon.com/cognito/latest/developerguide/multi-tenant-application-best-practices.html

  1. User-pool-based multi-tenancy - This is essentially a User Pool for every organization. It might be a good fit for some things we want to implement, but a poor fit for others. Of the 4 bulleted reasons Amazon gives for recommending this option over others, it seems to me that only the second bullet is what we would need. It would be easy to add a user to multiple "tenants" or companies (each one a User Pool), but that might be the only benefit. Both this section and the app-client-based multi-tenancy section below have a field called Effort Level that describes the extra effort needed to make them work.
  2. App-client-based multi-tenancy - For this, in Amazon's words, "you must implement tenant-matching logic and a user interface to match a user to the application client for their tenant." We'd need a way to differentiate each tenant in the app, maybe with a subdomain like "owf.infomapper-builder.org" and "another-company.infomapper-builder.org". The User Pool would need multiple app clients, one for each tenant, and implementation effort would also be high.
  3. Group-based multi-tenancy - This is what Elizabeth is using for the Dashboard. I believe Identity Pools are not even needed for this, because when a user authenticates using the User Pool, the user account in the pool is placed in a user group that has an IAM role attached to it, with its access set in the configuration file. This seems to take the least effort to implement, which may be why Elizabeth picked it.
  4. Custom-attribute-based multi-tenancy - With this option, a custom attribute can be added to a user to determine its tenant. This would be done client-side, although I'm not completely sure how it would be implemented. My best guess is that once the user is logged in, the app checks the custom attribute that was set when the user signed up and, depending on its value, any calls to S3 use the path that the user can access.
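
To make the tenant-matching ideas in options 2, 3, and 4 more concrete, here is a minimal TypeScript sketch. It is not Amplify or Cognito API code. It assumes the app already has the decoded ID token claims as a plain object. `cognito:groups` is the claim Cognito uses for user pool groups, and custom attributes appear with a `custom:` prefix. The attribute name `custom:tenant`, the base domain, the `tenants/<tenant>/` S3 layout, and all function names are hypothetical, chosen only for illustration.

```typescript
// Subset of decoded ID token claims used in this sketch.
// "custom:tenant" is an assumed attribute name, not something defined in this issue.
interface IdTokenClaims {
  email?: string;
  'cognito:groups'?: string[];
  'custom:tenant'?: string;
}

// Option 2: match a tenant from a subdomain such as "owf.infomapper-builder.org".
// The default base domain is taken from the example hostnames above (an assumption).
function tenantFromHostname(
  hostname: string,
  baseDomain: string = 'infomapper-builder.org'
): string | undefined {
  const host = hostname.toLowerCase();
  const suffix = '.' + baseDomain.toLowerCase();
  // The host must end with ".<baseDomain>" and have something in front of it.
  if (host.length <= suffix.length || host.slice(-suffix.length) !== suffix) {
    return undefined;
  }
  const sub = host.slice(0, host.length - suffix.length);
  // Reject nested subdomains such as "a.b.infomapper-builder.org".
  return sub.indexOf('.') === -1 ? sub : undefined;
}

// Options 3 and 4: read the tenant from the claims.
// The custom attribute wins if present; otherwise the first group is used.
function tenantFromClaims(claims: IdTokenClaims): string | undefined {
  const custom = claims['custom:tenant'];
  if (custom) {
    return custom;
  }
  const groups = claims['cognito:groups'] ?? [];
  return groups.length > 0 ? groups[0] : undefined;
}

// Hypothetical S3 layout: each tenant's files live under "tenants/<tenant>/".
function s3PrefixForTenant(tenant: string): string {
  return `tenants/${tenant}/`;
}
```

For example, `tenantFromClaims({ 'cognito:groups': ['owf'] })` yields `'owf'`, which `s3PrefixForTenant` turns into `'tenants/owf/'`. A client-side check like this only decides which path the app asks for. The actual access control would still have to come from the IAM role or policy behind the credentials, since anything done in the browser can be bypassed.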

Steve can let me know what he thinks about these 4 options. Most have both good and bad points, but if I had to order them right now from best to worst, it would be:

3 - Best
1 - Better
4 - Good
2 - Moderate

smalers commented 1 year ago

You need to explain why you are ranking the options the way you do. I suggest that you put the 4 options in a table across the top and then have rows for different criteria, with each cell having clear + and - indicators. My input is that I am OK with a bit more complexity if it provides clarity, security, and other benefits.

Explaining vocabulary in documentation is important. For example, a tenant is an organization, with each organization having one or more associated users.

A clear summary would also help, such as "A user-pool-based multi-tenancy approach requires a user pool for each organization, with more complexity due to defining the multiple user pools and policies." and "An app-client-based multi-tenancy approach uses a single user pool for the application, and the application enforces behaviors." The rows of the table might then indicate relevant information such as "Number of user pools", with cell values "1 per organization" and "1 for all organizations".

The evaluation needs to be considered given our tenant types (organization with 1+ users, personal user, and community).
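
As a starting point for that evaluation, the three tenant types could be written down as a small TypeScript model. This is only a sketch. The field names are hypothetical, and because the rules for a community tenant are not defined in this discussion, only its name is checked.

```typescript
// The three tenant types named above, as a discriminated union.
// All field names are assumptions made for illustration.
type Tenant =
  | { kind: 'organization'; name: string; userEmails: string[] }
  | { kind: 'personal'; userEmail: string }
  | { kind: 'community'; name: string };

// Returns a list of problems; an empty list means the tenant passes these checks.
function validateTenant(tenant: Tenant): string[] {
  const problems: string[] = [];
  switch (tenant.kind) {
    case 'organization':
      if (tenant.name.trim() === '') {
        problems.push('organization needs a name');
      }
      // "organization with 1+ users"
      if (tenant.userEmails.length < 1) {
        problems.push('organization needs at least one user');
      }
      break;
    case 'personal':
      if (tenant.userEmail.trim() === '') {
        problems.push('personal tenant needs a user email');
      }
      break;
    case 'community':
      // Membership rules for a community are not defined yet, so only the name is checked.
      if (tenant.name.trim() === '') {
        problems.push('community needs a name');
      }
      break;
  }
  return problems;
}
```

Writing the types out this way makes it easier to ask, for each multi-tenancy option, how an organization, a personal user, and a community would each be represented.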

I am leaning slightly towards option 1, mainly because it gives flexibility and better separation of organizations. For example, we may get into a situation where we need to provide a more complex hierarchy of roles.