NationalGenomicsInfrastructure / ngi_pipeline

Code driving the production pipeline at SciLifeLab
6 stars 24 forks source link

Sarek integration #381

Closed b97pla closed 6 years ago

b97pla commented 6 years ago

This PR introduces the new best-practice analysis engine "Sarek".

Obviously, this is a huge PR so apologies for that. I will try to break it down a bit here and explain the structure of the code. Thanks to the engines being more or less "plugins", there shouldn't be any major impact on the existing pipeline code, so it should be relatively safe to merge.

The overall structure of the Sarek engine module follows e.g. the piper and rnaseq modules, having classes for handling launching, setting up analysis, tracking progress, talking to charon etc. The ambition has been to write code that have separation of concerns, favours short functions and is easy to test. I've also wanted to make it extendable. Whether this is the case remains to be seen but if design choices seem odd or convoluted, this is probably the explanation.

One addition is the "Connectors" (e.g. CharonConnector, ProcessConnector, TrackingConnector etc.) that are used to interface with the various databases and systems. This is to avoid having specific logic about the databases, cluster queues etc. scattered around the code and instead concentrating that to the corresponding connector class.

At this point, the engine can take an organised project and launch the analysis through Slurm. It will record the status in the local tracking database and sync it to Charon. When the update_charon_with_local_jobs_status.py script is run with the "-e sarek" argument, it will check sarek analyses tracked and update Charon accordingly (and remove an entry when analysis finishes).

Parsing of the results and updating Charon with them is not implemented yet.

b97pla commented 6 years ago

Thanks for the review! We could do either way but in the interest of not having pull requests hanging around, I'd vote for merging as well. Just let me fix the configuration issue above and then we should be good to go.

b97pla commented 6 years ago

Thanks! I've pushed a new commit and I'm happy to merge if you are 😄

senthil10 commented 6 years ago

@b97pla Awesome!! Thank you for the change, but looks like you missed a spot to change staging -> production. I will merge as soon as you fix that :)

b97pla commented 6 years ago

@senthil10 my bad, fixed now!