mathesar-foundation / mathesar

Web application providing an intuitive user experience to databases.
https://mathesar.org/
GNU General Public License v3.0
2.36k stars 323 forks source link

Support Multiple Database Servers, supply db and creds via params #2716

Open MarkCupitt opened 1 year ago

MarkCupitt commented 1 year ago

Problem

We run a Kubernetes Admin cluster that maintains a number of child clusters, each with their own postgres databases.

As we understand it, mathesar is deployed on a one -> one relationship per database.

Proposed solution

IT would be very useful to be able to select which server one wanted to connect to, opr even better, pass in the server required of a single running matheus instance, eg: via Header, querystring or whatever.

There user/password should be bale to be (optionally), we understand the security issues) be able to be included, along with a specific database to access. Obviously we could match read/only creds to a database or read/write depending on the users role

rajatvijay commented 1 year ago

@mathemancer @kgodey can you take a look?

kgodey commented 1 year ago

@MarkCupitt We already have some support for this (you can configure any number of databases to connect to Mathesar). Our existing permissions system should be able to handle users with different levels of permissions to different databases.

The main things missing are documenting how to set up multiple databases and UI to switch between databases. We're tracking this in #2006.

Do you think that covers your use case or is there anything missing?

MarkCupitt commented 1 year ago

Hi @kgodey I think it will go close so let me describe a perfect world scenario from a laymans standpoint

We are using Teleport to protect our operational and production systems, an it would probably be a very good idea if you could support it. Teleport is a secure proxy, that implements RBAC to determine who gets to see whats behind it.

When it calls an application, it provides a [ RFD ] JWT in a header called teleport-jwt-assertion which allows for a good level of user cred integration and role propagation

We would love to leverage this capability to extend roles and permissions into mathesar. As you are very aware, database security is very difficult to implement effectively without handing out passwords, so the ability to map a role supplied by teleport to a database user, schema and password to access the db would be incredibly useful as long as it was secure and encrypted in storage

We would see us connecting to a single mathesar instance, and a list of databases that the users RBAC gave them permissions to would display, along with a RW RO OWNER indicator, clicking on the database would switch to what is there now, which is super by the way

It could also use the logged in teleport user as the database user, but in our case, we want to go with generic users controlled by rbac to manage access

A use case like above would be incredibly useful, an may generate a lot of interest among the community.

We are very concerned at using a UI like PGadmin 4 that requires us to hand out creds before it can be used, it makes cred management super hard, and we identify that issue as a strategic weakness we have to address and would dearly love a web based UI we could securely integrate in our Kubernetes Admin Clusters

MarkCupitt commented 1 year ago

To add a further very useful capability, being able to extract the creds from something like Hashicorp vault (it has an api) would be VERY useful, as it would then allow for automatic password rotations by the vault and secure, enterprise password storage for your application as well .. Vault is very popular and we use it for a lot

kgodey commented 1 year ago

Thanks @MarkCupitt, this is very helpful. This seems like a cross between UI for multiple databases and SSO support, which is on our roadmap. I'll need do some thinking and research about how best to integrate this into Mathesar.

seancolsen commented 7 months ago

@MarkCupitt I want to follow up with you about this feature request.

Mathesar 0.1.4 (released last week) now has a new feature that allows you to configure multiple connections to separate databases. You can see some screenshots of this feature within our release notes.

Would you be interesting in taking a look at this new feature and letting us know how closely it matches the needs you've described here?

MarkCupitt commented 7 months ago

@seancolsen Many thanks, in fact, very timely, I will be upgrading that specific operational system over next two weeks, so I will revisit the install and give you some feedback on how it fits for us ..

MarkCupitt commented 7 months ago

@seancolsen has there been any attempts that you are aware of to support mathesar in Kubernetes?

MarkCupitt commented 7 months ago

Ah, just found this, we wil see if we can add some value to it

https://github.com/mathesar-foundation/mathesar/issues/2707

MarkCupitt commented 7 months ago

@seancolsen I took look at this and it goes a lot of the way to what we need.

In our use case, we manage multiple databases, long term we are expecting to have 50+ databases we have to maintain and provide different levels of access to. Teleport allows us to issue short term creds for access for contractors and the like, on demand, the contractor could be a different person each time ..

We have two problems:

1) restrict user access to different databases based on roles, eg Read Only, and so on. Some users require RW or Owner Privileges to some databases, and not on others

Integrating these user roles with an SSO like teleport, where roles can be managed centrally.

We don't really want to have to manage two systems, with different logins, that will get very unwieldy for us quite quickly, and leaves a risk that something falls through the cracks when a user leaves or changes status

My initial Issue on this is at https://github.com/mathesar-foundation/mathesar/issues/2716

We will deploy 0.1.4 shortly and give it a test drive, and pass back additional thoughts

Thanks for the work, it looks great

MarkCupitt commented 7 months ago

@seancolsen I have created a Kubernetes Deployment version of Mathesar, I removed Caddy from the mix, as its was not needed in our scenario, and the caddy file pointed to a reverse proxy that I named differently, and I am happily able to access the UI

4 things of note so far:

1):

I noticed somewhere that it was planned to remove the env var MATHESAR_DATABASES we found this extremely useful in a k8s environmentg, as it lets us populate databases as part of our ci/cd when we deploy a new one. Please do nto remove it, or even better, allow us to mount a yaml file describing the databases like this

MATHESAR_DATABASES:
- "eng-o2/bounty|postgres://postgres:DO1c@{{postgres_master_pooler_url}}/bounty?sslmode=disable"
- "eng-o2/prometheus|postgres://postgres:DO1c@{{postgres_master_pooler_url}}/prometheus?sslmode=disable"

And use the ENV VAR to point to the location of the yaml file in the file system. This should also work great with docker as well. We will have 50+ databases and I am not keen on having to enter them all manually in every instance of Mathesar we deploy

2):

I am unable to access the user page in the admin section

image

The logs indicate a migration problem, here is what we got. (The image was the latest in your dockerhub repo)

2024-02-28 02:13:21 +0000] [191] [INFO] Starting gunicorn 20.1.0
[2024-02-28 02:13:21 +0000] [191] [INFO] Listening at: http://0.0.0.0:8000 (191)
[2024-02-28 02:13:21 +0000] [191] [INFO] Using worker: sync
[2024-02-28 02:13:21 +0000] [192] [INFO] Booting worker with pid: 192
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regtype' of column 'column_type'
  return Table(
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regrole' of column 'owner'
  return Table(
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regtype' of column 'column_type'
  return Table(
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regtype' of column 'column_type'
  return Table(
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regrole' of column 'owner'
  return Table(

3):

Default Admin Password, has ot be set at first login, we would like to be able to pass this as ENV Vars, so we can manage the creds via our secrets in Kubernetes. eg: ADMIN_USERNAME and ADMIN_PASSWORD

4):

Users, not critical, but VERY helpful, again to allow us to manage user creds and roles declaratively by K8s secrets. Perhaps using a YAML file like this

users:
  - name: admin
    password: pass
    type: admin
  - name: joe
    password: pass
    type: standard

It would also be nice to be able to declaratively map the users to the roles on a per database basis as the UI does, but this is likely a big ask

If there was some sql we could execute to achieve above, that would also work


Other than the bug, the system is quite useable and the requests will make it very useful in a K8s environment.

If we can get above sorted, Ill be quite happy to make our deployment manifests available for you guys to publish, Ill need to tidy up a bit, but I think it will be a viable

MarkCupitt commented 7 months ago

@seancolsen Result form attempt to perform the migration manually in the container, looks the models amy have an issue or similar

python manage.py makemigrations
Migrations for 'mathesar':
  mathesar/migrations/0007_auto_20240228_0338.py
    - Alter field column_order on tablesettings
root@mathesar-6bdc4fdb87-hlpx9:/code# python manage.py migrate
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, mathesar, sessions
Running migrations:
  Applying mathesar.0007_auto_20240228_0338... OK
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regtype' of column 'column_type'
  return Table(
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regrole' of column 'owner'
  return Table(
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regtype' of column 'column_type'
  return Table(
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regtype' of column 'column_type'
  return Table(
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regrole' of column 'owner'
  return Table(
MarkCupitt commented 7 months ago

@seancolsen Im getting the XHR request response not valid JSON error on multiple pages

seancolsen commented 7 months ago

Thanks for all this feedback, @MarkCupitt! We will have a maintainer look into this and see how we can help you out. I can't promise we'll be able to respond this week though, as we have multiple maintainers out in order to attend a PostgreSQL conference.

MarkCupitt commented 7 months ago

thanks for your reply, anything I can do to help, please let me know. I'll clean up the manifests, and when it's all good, I'll make them available to you guys

On Wed, 28 Feb 2024, 8:56 pm Sean Colsen, @.***> wrote:

Thanks for all this feedback, @MarkCupitt https://github.com/MarkCupitt! We will have a maintainer look into this and see how we can help you out. I can't promise we'll be able to respond this week though, as we have multiple maintainers out in order to attend a PostgreSQL conference.

— Reply to this email directly, view it on GitHub https://github.com/mathesar-foundation/mathesar/issues/2716#issuecomment-1968928158, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABJ7RUEO3BHCGPS2WJ2ZNCLYV4SQDAVCNFSM6AAAAAAWAOOYQ6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNRYHEZDQMJVHA . You are receiving this because you were mentioned.Message ID: @.***>

mathemancer commented 7 months ago

@MarkCupitt It would help me to diagnose this to know which versions of PostgreSQL DB servers you have hooked up to Mathesar right now. Are any of them < 13 or > 15?

Edit: Also, it would help if you have any more logs showing the problem. It seems like Mathesar is attempting to reflect some view or table (but given the types, I expect a view) that uses system catalog types which aren't supported by one of our dependencies (SQLAlchemy).

Finally, does the warning occur when a request is made, or just in the process of loading Mathesar? It seems like the latter from the logs you posted, but I'd like to confirm.

MarkCupitt commented 7 months ago

I tried it. on 13. and 15, same result, sorry, I should have included that

On Wed, 28 Feb 2024, 10:56 pm Brent Moran, @.***> wrote:

@MarkCupitt https://github.com/MarkCupitt It would help me to diagnose this to know which versions of PostgreSQL DB servers you have hooked up to Mathesar right now. Are any of them < 13 or > 15?

— Reply to this email directly, view it on GitHub https://github.com/mathesar-foundation/mathesar/issues/2716#issuecomment-1969162796, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABJ7RUAUYKJVVMV552XZVEDYV5ASDAVCNFSM6AAAAAAWAOOYQ6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNRZGE3DENZZGY . You are receiving this because you were mentioned.Message ID: @.***>

mathemancer commented 7 months ago

I had a couple more questions, but added them as an edit to my previous comment. I'm not sure how that works with the Github email comment setup, so I'll make them a separate comment to be sure.

It would help if you have any more logs showing the problem. It seems like Mathesar is attempting to reflect some view or table (but given the types, I expect a view) that uses system catalog types which aren't supported by one of our dependencies (SQLAlchemy).

Finally, does the warning occur when a request is made, or just in the process of loading Mathesar? It seems like the latter from the logs you posted, but I'd like to confirm.

MarkCupitt commented 7 months ago

@mathemancer I'm afraid that those logs are all there is, the logging output is pretty much zero. I tried to figure out how to enable debug and request logging but failed miserably... I can get a shell into the container, if you have some steps for me on how to get you better logs, I will be very happy to do it for you. I do have Django experience, but It was 10 years ago .. .

The warning is on UI requests, it's not every page, but it appears random. If I refresh the page enough times I can often get in, for example, when adding users, I set up a user, and on the next page load, the error occurs, but attempting to navigate back in, on several attempts I was able to get the page to load again... I can not see any pattern to it that makes sense to me

More than happy to perform whatever test you need to help out, just let me know, just so you know, I am GMT+8

mathemancer commented 7 months ago

@MarkCupitt Do you mind if I email you? We could try to set up a call to debug early next week if that works for you. It might be the most efficient way forward.

MarkCupitt commented 7 months ago

No problem at all, please do .. we can do a screen share session if that helps, I see we are both GMT+8, I have sent an email to connect us via your "hello" generic email address

kgodey commented 7 months ago

@MarkCupitt We got your email and I sent you a response with some availability. Thanks!

seancolsen commented 4 months ago

Based on skimming this thread and chatting privately with @kgodey, here's my understanding of the next steps:

  1. We need to finish our backend refactor (which at this point probably means releasing Mathesar's beta version).

  2. Then we need to look into what it would take to implement this issue. This will entail some assessment of requirements, design, and implementation.

As such, I've re-labelled this issue so that we can come back to it later.

kgodey commented 4 months ago

Thanks @seancolsen.