Closed silverdaz closed 6 years ago
Connected data-out issue #12
The two schema files
audit.sql
andmain.sql
should not be in thedocker
sub-tree. I think we should have aschema
,db-schema
orsql
(or something) subdirectory in the root of the project for these things.
Yes, there were indeed in the top directory before, when we had the openstack deployment too.
Since it seems that we only support docker-based deployments, and that every site has written their own script to piece the things together, I have moved them back again in the docker tree.
After all, theses schemas are just a setup for that docker database, not an already existing database, part of a bigger infrastructure, with other network and users constraints... Only the code should somehow reflect the database structure/schema, and that's done with views.
Regarding sql-file location.
I think it is helpful for people using the project if a reasonable default schema can be found somewhere in the root of the project. I mean some of our application code actually depends on that there are certain functions in the database. If it is in the root, you can just document what are tables just for the test-instance to get it running and what are structures that are actually needed for the application to run.
What do you say about a readthedocs page with the description of the views? Maybe a good step towards those interfaces between components?
That page would not need to talk too much about the read-only user, or the underlying tables. (Or well, in a subsection, if someone is interested in it)
A readthedocs page about the datamodel and such would be excellent. But it's not a replacement for having the default SQL files it in the root of the project :).
@silverdaz @viklund can we fix the conflicts and integration tests, in order for us to merge this?
got a bit further:
ingest | [lega.utils.db][DEBUG ] (L162) Updating status file_id 1 with "In progress"
ingest | [lega.utils.db][ERROR ] (L252) Exception: <class 'psycopg2.IntegrityError'> in /usr/lib/python3.6/site-packages/lega/utils/db.py on line: 283
ingest | [lega.utils.db][ERROR ] (L255) IntegrityError('insert or update on table "main" violates foreign key constraint "main_status_fkey"\nDETAIL: Key (status)=(In progress) is not present in table "status".\n',) (from user: False)
ingest | [lega.utils.db][DEBUG ] (L142) Setting error for 1: insert or update on table "main" violates foreign key constraint "main_status_fkey"
ingest | DETAIL: Key (status)=(In progress) is not present in table "status".
ingest | | Cause: None
ingest | [lega.utils.db][DEBUG ] (L262) Catching error on file id: 1
Great job fixing the other bits!
I leave it up to you to remove the Don't Merge
tag and do merge.
In order to merge this we need to decide on:
10
and we have it fixed in LocalEGA to 9.6
and in EGA Data API to 9.6
as well;ebi.sql
the download_log
and event
) also pointed here: https://github.com/EGA-archive/ega-data-api/issues/10#issuecomment-407743725@blankdots: The last 2 bullet points are indeed not updated by this PR. You can do either way: Update now and merge, or make another PR later to update the part related to ebi.sql.
About the 2 first points: This entrypoint is related to a local deployment. You might indeed have your own database, with different requirements, and that is the job of the local instance to adjust it. This PR, in some sense, tells you what the minimum is.
Regarding version 10, I'm using it and I don't see any issue, so, please let me know, here or through another channel, what the issues are. Either we fix them in this PR, or we do it in a later one, since the integration seems to pass. After all, it's green, isn't it?
This PR improves the database in 3 ways:
All connection types (whether local or over the network) require a password (and we use stronger password hashing) and we enforce SSL communication.
Security can be further enhanced by restricting the connection type to a particular network, or user. (and of course, there is IPTables and other packet filtering). But for the moment, the latter is left aside.