af-lab / histone-catalogue

Core histone catalogue --- Live manuscript

continuous delivery #25

Open carandraug opened 8 years ago

carandraug commented 8 years ago

Andrew to investigate with the provider how to set up buildbot on chromosome.ie. The buildbot slave would be af-lab.

Alternatively, it will be easier to just have af-lab ssh into chromosome.ie and push the webpage and pdf.

aflaus commented 8 years ago

OK, can we clarify the buildbot strategy:

  1. My Rochen hosting package would need to have a buildbot master
  2. The buildbot master would call a buildbot slave on af-lab

Isn't there a problem that af-lab is not public and the only access is via VPN? NUIG are not going to be pleased to see VPN credentials cached on a public machine ...

Sounds like sftp of static files from af-lab to the public rochen hosting web server is much easier?

carandraug commented 8 years ago

While the buildmaster sends commands to the buildslave, it's the buildslave that starts the connection. The buildmaster then uses the connection that the buildslave created, so the buildslave can be behind a NAT/firewall.

> The buildslaves are typically run on a variety of separate machines, at least one per platform of interest. These machines connect to the buildmaster over a TCP connection to a publically-visible port. As a result, the buildslaves can live behind a NAT box or similar firewalls, as long as they can get to buildmaster. The TCP connections are initiated by the buildslave and accepted by the buildmaster, but commands and results travel both ways within this connection. The buildmaster is always in charge, so all commands travel exclusively from the buildmaster to the buildslave.
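
To make the direction of the connection concrete, here is a minimal sketch using plain Python sockets. It is not buildbot's actual protocol; the function names and the `build` command are made up for illustration. The only point is that the worker dials out, and the master then sends its command down that same connection, so nothing ever has to connect into the worker's network.

```python
import socket
import threading


def run_master(server_sock, command, results):
    """Master: accept a connection opened by the worker, then send a command over it."""
    conn, _ = server_sock.accept()
    with conn, conn.makefile("r") as reader:
        conn.sendall(command.encode() + b"\n")
        # The result comes back over the same worker-initiated connection.
        results.append(reader.readline().strip())


def run_worker(master_addr):
    """Worker: dial out to the master and wait for a command on that connection."""
    with socket.create_connection(master_addr) as sock:
        with sock.makefile("rw") as stream:
            command = stream.readline().strip()
            stream.write(f"done: {command}\n")
            stream.flush()


def demo():
    """Run a master and a worker on localhost and return what the master received."""
    server = socket.socket()
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    results = []
    master = threading.Thread(target=run_master, args=(server, "build", results))
    master.start()
    run_worker(server.getsockname())
    master.join()
    server.close()
    return results
```

Only the master needs a publicly reachable port here; the worker makes a single outgoing connection, which is what a firewall or NAT normally allows.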

aflaus commented 8 years ago

How are the messages sent? I think the buildbot slave on af-lab could only pick them up by querying the buildbot-master periodically to ask for instructions.

The hosting support people recommended rsync for af-lab to upload to the chromosome.ie virtual server.

What do you think?

carandraug commented 8 years ago

I don't know the details. There's more in the buildbot manual. I have asked on IRC at #buildbot and they confirmed what I understood from the documentation: the slaves can be behind a firewall, and it won't be an issue.

I think that we want a real system for continuous delivery, and not some half-baked solution written by us in a week. buildbot will easily (or at least more easily than coding it ourselves) give us:

  1. builds with alternative build options (think other organisms)
  2. storage of older builds (think accessing the raw data anytime in the last 2 years for comparison)
  3. emails as soon as the build is broken
  4. continuous integration (instead of pushing and breaking the build for everyone else all the time, one creates a pull request and it refuses to merge if the change breaks the build)

If af-lab being behind a firewall is a problem, ask NUIG to sort something out. It's their job. If that fails, I will find a public facing buildslave.

carandraug commented 8 years ago

How is this going? Any progress on setting up a build master on the ccb website?

aflaus commented 8 years ago

I think the buildbot route is going to be too complex, at least unless I have a virtual server instance and not simply a virtual web server.

For the moment, I think we should simply run a cron job that clean rebuilds the manuscript monthly and pushes it by scp/sftp to the CCB virtual web server. Even I can manage that! Is this ok with you?
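
A minimal sketch of what such a monthly job could run, assuming `scons -c` followed by `scons` is an adequate clean rebuild for this project. The remote path, the schedule, and the output file names are hypothetical placeholders, not values from this thread.

```python
import shlex
import subprocess

# Hypothetical destination on the web server; adjust to the real account and path.
REMOTE = "user@chromosome.ie:public_html/histone-catalogue/"


def monthly_job_commands(outputs, remote=REMOTE):
    """Return the commands for a clean rebuild followed by an upload."""
    return [
        ["scons", "-c"],            # remove previously built targets
        ["scons"],                  # rebuild the manuscript from scratch
        ["scp", *outputs, remote],  # copy the results to the web server
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


def crontab_line(script_path):
    """Crontab entry that runs the script at 03:00 on the 1st of each month."""
    return f"0 3 1 * * {shlex.quote(script_path)}"
```

Calling `run_all(monthly_job_commands(["manuscript.pdf"]))` from a small script, and installing the line from `crontab_line` with `crontab -e`, would be one way to wire this up; `manuscript.pdf` stands in for whatever the build actually produces.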

carandraug commented 7 years ago

Yes, please go ahead.

aflaus commented 7 years ago

I can get an external facing VM with ubuntu. I would like to admin this myself so I can understand and document things, but I will absolutely need your help to step through it. Can you give me a quick list of packages to install?

As you are aware, I know a decent amount about Wordpress and am comfortable with python. I know enough PHP to get by. Do you recommend a framework or CMS for publishing?

aflaus commented 7 years ago

VM with ubuntu requested, should be created before Christmas.

carandraug commented 7 years ago

> I can get an external facing VM with ubuntu. I would like to admin this myself so I can understand and document things, but I will absolutely need your help to step through it. Can you give me a quick list of packages to install?

The README.md file has a section for dependencies. If you run scons -h you will get a more detailed list. Then, as you run scons to actually build the manuscript, it will check for all the dependencies and give you an error message with which ones are missing.

> As you are aware, I know a decent amount about Wordpress and am comfortable with python. I know enough PHP to get by. Do you recommend a framework or CMS for publishing?

Not really, I only have minimal experience with those things. Everyone seems to be moving towards static page generators, though I have no personal experience with any of those either.

aflaus commented 7 years ago

Advice: I can see how to run a static page generator like Jekyll (or Pelican in python) fairly easily. I could then simply use the VM inside the NUIG network to generate both the static html (from markdown) and run the manuscript generator, then publish it to GitHub Pages for the project. Is that what you would recommend?

carandraug commented 7 years ago

Something like that, yes, though if the plan is to use chromosome.ie you shouldn't need GitHub Pages at all.

What are you planning to have as content of the webpages? If the content doesn't change, you can even write the html only once and forget all the other complication. I thought your suggestion was to do that on chromosome.ie and push only the catalogue build.

carandraug commented 7 years ago

> I can get an external facing VM with ubuntu. I would like to admin this myself so I can understand and document things, but I will absolutely need your help to step through it. Can you give me a quick list of packages to install?

The README.md file has a section for dependencies. If you run scons -h you will get a more detailed list. Then, as you run scons to actually build the manuscript, it will check for all the dependencies and give you an error message with which ones are missing.

How is this going? I want to make sure people can easily build this themselves. If you have any issue with the instructions, can you open a new issue to address it?

aflaus commented 7 years ago

They provisioned a VM this afternoon. However, I cannot ssh to it. There seems to be a problem with the user/pass.

I'm still undecided about publishing mechanism. Simplest is to ftp to chromosome.ie and author the page there. Holding markdown in GitHub project and publishing via Jekyll and GitHub Pages seems more elegant and independent, but also more work (incl domain). Opinions?

carandraug commented 7 years ago

> I'm still undecided about publishing mechanism. Simplest is to ftp to chromosome.ie and author the page there. Holding markdown in GitHub project and publishing via Jekyll and GitHub Pages seems more elegant and independent, but also more work (incl domain). Opinions?

I am still unclear about the purpose of the website and what its content will be. The original proposal of this issue was to have a continuous build system where one can see the build status over time, the stdout and stderr of each build, and download the results from each one. It would even support builds with different options (at the moment, only different organisms), and email someone when the build fails.

See, for example, the buildbot for buildbot itself. It has a configurable start page. You can see a list of builders, which in our case would be builds for different organisms (and other future combinations if we ever get more options), with immediate download access to the latest build. Choosing one of the builders, you can scroll to the bottom to get a list of previous builds from that builder. This would be how to download older catalogue versions.

I accept that none of us has the time to set that up, but if you're going down the road of setting up a continuous build for the website content, then it might be more productive to spend that effort on setting up buildbot instead.

So what will be the website content? Would a single static webpage with an explanation of the project do? That page could have a link to a directory of dated tarballs for download. I thought that was the plan once we scratched buildbot. That sounds good to me.

aflaus commented 7 years ago

In reality the average biochemist or cell biologist is not going to build their own manuscript. I was wanting to make the current built manuscript directly available (ideally also archived versions) online, plus the sequence files and tables of IDs because these are very handy.

You make a very good point that we must not undermine the aim to create an intelligible manuscript by making yet another manually curated, unsustainable resource website.

I have begun asking around to find an IT-competent summer student to develop an adaptation around the idea of a shorter summary "organism note" that is more generic and automated. This tests the manuscript v database argument even further. It would be nicely implemented with your buildbot strategy.

carandraug commented 7 years ago

Then it seems like you only need a cronjob that makes a whole new build and pushes the results, which are all tarballs and pdfs, to the web server. And on the web server you only need a single histone-catalogue.html page with links to the directories where those files end up. It might be a good idea to have a symlink to the last build so you can have links to it on the webpage instead of links to the directory only.
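
A minimal sketch of the dated-directory-plus-symlink idea, assuming a POSIX filesystem on the web server. The directory layout, the `latest` link name, and the function name are illustrative choices, not something the project already defines.

```python
import os
import shutil
from datetime import date
from pathlib import Path


def publish_build(web_root, build_files, today=None):
    """Copy build outputs into a dated directory and repoint the 'latest' symlink."""
    today = today or date.today()
    web_root = Path(web_root)
    dated = web_root / today.isoformat()  # e.g. <web_root>/2017-02-01
    dated.mkdir(parents=True, exist_ok=True)
    for src in build_files:
        shutil.copy2(src, dated / Path(src).name)

    # Create the new link under a temporary name, then rename it over the old
    # one, so that 'latest' always points at a complete build.
    tmp_link = web_root / "latest.tmp"
    tmp_link.unlink(missing_ok=True)
    tmp_link.symlink_to(dated.name)  # relative target, resolved inside web_root
    os.replace(tmp_link, web_root / "latest")
    return dated
```

The static page can then link to `latest/` for the current build and to the parent directory for the archive of dated builds.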

Please note that it is important to also store the stderr of the build, because if there's an issue with a gene it may be skipped in the analysis and, depending on the nature of the issue, may not even appear on the list of anomalies. It will, however, be noted in the stderr.
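
A minimal sketch of keeping stderr next to the build results, assuming the build is started from a Python wrapper. The log file names and the function name are hypothetical.

```python
import subprocess
from pathlib import Path


def run_and_log(command, log_dir):
    """Run a build command, saving stdout and stderr to files in log_dir."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    with open(log_dir / "build.stdout.log", "wb") as out, \
         open(log_dir / "build.stderr.log", "wb") as err:
        completed = subprocess.run(command, stdout=out, stderr=err)
    # The caller decides what to do with a failed build; the logs are kept either way.
    return completed.returncode
```

For example, `run_and_log(["scons"], dated_dir)` would leave `build.stderr.log` in the same directory as the tarballs and pdfs, where `dated_dir` is whichever directory that build is published to.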