refgenie / refgenconf

A Python object for standardized reference genome assets.
http://refgenie.databio.org
BSD 2-Clause "Simplified" License

Database back-end for refgenieserver configuration #119

Open nsheff opened 3 years ago

nsheff commented 3 years ago

Crazy idea, but it would be useful, at least for a server, if refgenie could be backed by a database rather than just a YAML file. And now this is starting to sound like pipestat. So, what if the refgenie config were handled by pipestat and could be either a YAML file or a DB?

Ok this isn't straightforward, because the pipestat config specification clearly isn't the same as the refgenie config. So maybe this isn't possible directly. But if it were possible, maybe there are 2 ways to think about it:

@stolarczyk what do you think? Any possibility here?

One major benefit would be the ability to farm out the builds, since the updates would go to a remote DB rather than to a shared local config, which is problematic for ephemeral computing. So we'll need some type of remote DB for the server config to enable fully automated refgenie server deploys from GitHub.

stolarczyk commented 3 years ago

This would indeed be really useful. We talked about it a long time ago, and a couple of months ago I even started experimenting with refgenieserver and added some Postgres DB models: https://github.com/refgenie/refgenieserver/commit/b7401316777086f66cbd600604f37423645094c4

> Ok this isn't straightforward, because the pipestat config specification clearly isn't the same as the refgenie config

Exactly, the pipestat config is used to configure the database, namespaces, record identifier, etc. The pipestat YAML results file, on the other hand, is similar to the refgenie config. However, I'm afraid that making RefGenConf extend the current PipestatManager would only complicate things. I'll keep this in mind when we get to redesigning pipestat to use an ORM approach. We will probably need to make pipestat more flexible, as you suggest.

If connecting pipestat and refgenconf doesn't work out we could at least apply the concepts learnt there to add the refgenconf DB backend option.

nsheff commented 3 years ago

> If connecting pipestat and refgenconf doesn't work out we could at least apply the concepts learnt there to add the refgenconf DB backend option.

Bingo -- I think this is the way forward. Probably doesn't make sense to merge these ideas, but we will definitely benefit from the experience.
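
To make the "file or DB" idea concrete, here is a rough sketch of what a pluggable backend could look like. Everything in it is hypothetical: `AssetStore`, `FileStore`, `SqliteStore`, and `register_build` are illustration names and not part of refgenconf or pipestat. To stay within the standard library, JSON stands in for the YAML config and SQLite stands in for a remote database such as Postgres.

```python
import json
import sqlite3
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class AssetStore(ABC):
    """Hypothetical storage interface; not an actual refgenconf class."""

    @abstractmethod
    def set_asset(self, genome: str, asset: str, path: str) -> None: ...

    @abstractmethod
    def get_asset(self, genome: str, asset: str) -> str: ...


class FileStore(AssetStore):
    """File-backed store (JSON here, standing in for the YAML config)."""

    def __init__(self, filepath):
        self.filepath = Path(filepath)

    def _load(self):
        if self.filepath.exists():
            return json.loads(self.filepath.read_text())
        return {}

    def set_asset(self, genome, asset, path):
        # Read-modify-write of the whole file: fine for one local writer,
        # but this is the step that gets awkward with farmed-out builds.
        data = self._load()
        data.setdefault(genome, {})[asset] = path
        self.filepath.write_text(json.dumps(data, indent=2))

    def get_asset(self, genome, asset):
        return self._load()[genome][asset]


class SqliteStore(AssetStore):
    """DB-backed store (SQLite here, standing in for a remote DB)."""

    def __init__(self, db_path=":memory:"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS assets ("
            "genome TEXT, asset TEXT, path TEXT, "
            "PRIMARY KEY (genome, asset))"
        )

    def set_asset(self, genome, asset, path):
        # Each update is a single-row upsert committed in its own transaction.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO assets VALUES (?, ?, ?)",
                (genome, asset, path),
            )

    def get_asset(self, genome, asset):
        row = self.conn.execute(
            "SELECT path FROM assets WHERE genome = ? AND asset = ?",
            (genome, asset),
        ).fetchone()
        if row is None:
            raise KeyError((genome, asset))
        return row[0]


def register_build(store: AssetStore) -> str:
    """Calling code only sees the interface, not the backend."""
    store.set_asset("hg38", "fasta", "hg38/fasta/default")
    return store.get_asset("hg38", "fasta")


with tempfile.TemporaryDirectory() as tmp:
    file_result = register_build(FileStore(Path(tmp) / "genomes.json"))
db_result = register_build(SqliteStore())
```

The point of the sketch is only that the build code does not need to know which backend it is talking to. A real implementation would still have to settle the schema for the full refgenie config and how concurrent builders are handled.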