lsst-uk / lasair-lsst

Apache License 2.0
0 stars 0 forks source link

Feature/cassandra schema #46

Closed genghisken closed 1 year ago

genghisken commented 1 year ago

Added a lasair_cassandra role create the schema in place of the gkansible.gkservercollection.cassandra_schema_lasair one (see deploy.yaml). The code runs shell commands rather than includes any special cassandra-specific galaxy modules, and must be run on ONE CASSANDRA NODE ONLY.

Note that the default git_branch variable (main) contains CQL syntax errors which are fixed in this branch.

The code is completely idempotent, since IF NOT EXISTS is also valid syntax in CQL.

No integration tests yet added, since this is dependent on setting up the Jenkins environment.

RoyWilliams commented 1 year ago

SO its OK to have IF NOT EXISTS in CQL? I thought we tried it and it was a syntax error?

RoyWilliams commented 1 year ago

I don't understand why the cql commands are pulled from git. Why not use the local versions lasair-lsst/common/schema/lasair_cql/diaObjects.cql

This should also be changed in deploy/lasair_database

gpfrancis commented 1 year ago

I think right now the issue with using local versions of the schemas is that there is no logic to check that the branch being deployed is the branch checked out - and this isn't a weird corner case, it's something we actually do all the time at the moment. In the long run I think we should change that, update our workflow and include the necessary logic, but that's really a different feature to this one.

genghisken commented 1 year ago

I did some experiments and IF NOT EXISTS is fine in CQL. The reason it failed previously is that CQL doesn't allow NOT NULL statements, one of which was still present in the diaObjects table (fixed in my branch).

Yes, I just followed the scheme in lasair_database for the CQL scripts, but I agree that it does seem very odd that we pull the SQL and CQL schema scripts from another branch. I would argue that the scripts shouldn't exist at all in the repository, and get created dynamically as part of the deployment process directly from the schema definition, so the schema is always kept up to date.