sciai-lab / batchlib

Batch processing for high-throughput screening data
MIT License

Automating analysis workflow #49

Closed constantinpape closed 4 years ago

constantinpape commented 4 years ago

I have been thinking a bit about how to automate running the analysis pipeline. I think the ideal setup would be setting up a cronjob that does the following:

Obviously, this will only work if we can set up rsync to work with Windows at EMBL; I will try to do this with EMBL IT next week. Some other things we would need:

wolny commented 4 years ago

Hey @constantinpape, the setup you described should definitely be feasible. However, I'd rather choose a different trigger for starting the pipeline than the cronjob/rsync. I'd rather rely on something sending the data to EMBL premises, where a watchdog process (it can be a cronjob) spawns the pipeline process (point 3 onward) when new data shows up. It's not a big change to the setup you described, but it would allow us to make the processing on the EMBL side bulletproof without worrying about the cronjob/rsync connection not working properly. At the beginning the data can be sent manually, and we would see these nice Slack messages right away.
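To make the watchdog idea concrete, here is a minimal sketch of a single polling pass that a cronjob could call. It is only an illustration under assumptions: that each plate arrives as its own folder in an incoming directory, that a small JSON state file is enough to remember what was already started, and that `launch` is a hypothetical hook standing in for whatever actually starts the pipeline. None of these names come from batchlib.

```python
import json
from pathlib import Path
from typing import Callable, List, Set


def load_seen(state_file: Path) -> Set[str]:
    """Return the names of plate folders already handed to the pipeline."""
    if state_file.exists():
        return set(json.loads(state_file.read_text()))
    return set()


def find_new_plates(input_dir: Path, seen: Set[str]) -> List[Path]:
    """List folders in input_dir whose names are not in the seen set."""
    return sorted(
        p for p in input_dir.iterdir() if p.is_dir() and p.name not in seen
    )


def run_once(
    input_dir: Path, state_file: Path, launch: Callable[[Path], None]
) -> List[str]:
    """One watchdog pass: call launch for every new plate folder.

    `launch` is a hypothetical hook (assumption), e.g. a function that
    submits the analysis workflow for the given folder.
    """
    seen = load_seen(state_file)
    started = []
    for plate in find_new_plates(input_dir, seen):
        launch(plate)
        seen.add(plate.name)
        started.append(plate.name)
    # Persist the updated set so the next pass skips these folders.
    state_file.write_text(json.dumps(sorted(seen)))
    return started
```

A cronjob would then just call `run_once` every few minutes. This sketch does not check whether a transfer has finished, so a real setup would probably need some completion signal (for example a marker file written last) before launching.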

constantinpape commented 4 years ago

> I'd rather rely on something sending the data to EMBL premises, where a watchdog process (it can be a cronjob) spawns the pipeline process (point 3 onward) when new data shows up. It's not a big change to the setup you described, but it would allow us to make the processing on the EMBL side bulletproof without worrying about the cronjob/rsync connection not working properly. At the beginning the data can be sent manually, and we would see these nice Slack messages right away.

Yeah, that's a good point, and it doesn't change much in terms of setting up the system. @imagirom and I are working on making the whole code more deployable now; we can then sync in the next few days on what exactly to do here, also keeping in mind the database solution we eventually want to have.

constantinpape commented 4 years ago

I'm working heavily on cleaning up the code and making it run more smoothly in #51. Figuring out the automatic deployment is not a high priority, because the number of new plates in the next one or two weeks will still be manageable by running jobs manually.