NBISweden / LocalEGA

Please go to to https://github.com/EGA-archive/LocalEGA instead
Apache License 2.0
4 stars 1 forks source link

Database schema isolation and Security #366

Closed silverdaz closed 6 years ago

silverdaz commented 6 years ago

This PR improves the database in 3 ways:

All connection types (whether local or over the network) require a password (and we use stronger password hashing) and we enforce SSL communication.

Security can be further enhanced by restricting the connection type to a particular network, or user. (and of course, there is IPTables and other packet filtering). But for the moment, the latter is left aside.

silverdaz commented 6 years ago

Connected data-out issue #12

silverdaz commented 6 years ago

The two schema files audit.sql and main.sql should not be in the docker sub-tree. I think we should have a schema, db-schema or sql (or something) subdirectory in the root of the project for these things.

Yes, there were indeed in the top directory before, when we had the openstack deployment too.

Since it seems that we only support docker-based deployments, and that every site has written their own script to piece the things together, I have moved them back again in the docker tree.

After all, theses schemas are just a setup for that docker database, not an already existing database, part of a bigger infrastructure, with other network and users constraints... Only the code should somehow reflect the database structure/schema, and that's done with views.

viklund commented 6 years ago

Regarding sql-file location.

I think it is helpful for people using the project if a reasonable default schema can be found somewhere in the root of the project. I mean some of our application code actually depends on that there are certain functions in the database. If it is in the root, you can just document what are tables just for the test-instance to get it running and what are structures that are actually needed for the application to run.

silverdaz commented 6 years ago

What do you say about a readthedocs page with the description of the views? Maybe a good step towards those interfaces between components?

That page would not need to talk too much about the read-only user, or the underlying tables. (Or well, in a subsection, if someone is interested in it)

viklund commented 6 years ago

A readthedocs page about the datamodel and such would be excellent. But it's not a replacement for having the default SQL files it in the root of the project :).

silverdaz commented 6 years ago
blankdots commented 6 years ago

@silverdaz @viklund can we fix the conflicts and integration tests, in order for us to merge this?

blankdots commented 6 years ago

got a bit further:

ingest         | [lega.utils.db][DEBUG ] (L162) Updating status file_id 1 with "In progress"
ingest         | [lega.utils.db][ERROR ] (L252) Exception: <class 'psycopg2.IntegrityError'> in /usr/lib/python3.6/site-packages/lega/utils/db.py on line: 283
ingest         | [lega.utils.db][ERROR ] (L255) IntegrityError('insert or update on table "main" violates foreign key constraint "main_status_fkey"\nDETAIL:  Key (status)=(In progress) is not present in table "status".\n',) (from user: False)
ingest         | [lega.utils.db][DEBUG ] (L142) Setting error for 1: insert or update on table "main" violates foreign key constraint "main_status_fkey"
ingest         | DETAIL:  Key (status)=(In progress) is not present in table "status".
ingest         |  | Cause: None
ingest         | [lega.utils.db][DEBUG ] (L262) Catching error on file id: 1
silverdaz commented 6 years ago

Great job fixing the other bits! I leave it up to you to remove the Don't Merge tag and do merge.

blankdots commented 6 years ago

In order to merge this we need to decide on:

silverdaz commented 6 years ago

@blankdots: The last 2 bullet points are indeed not updated by this PR. You can do either way: Update now and merge, or make another PR later to update the part related to ebi.sql.

About the 2 first points: This entrypoint is related to a local deployment. You might indeed have your own database, with different requirements, and that is the job of the local instance to adjust it. This PR, in some sense, tells you what the minimum is.

Regarding version 10, I'm using it and I don't see any issue, so, please let me know, here or through another channel, what the issues are. Either we fix them in this PR, or we do it in a later one, since the integration seems to pass. After all, it's green, isn't it?