BioContainers / specs

BioContainers specifications
http://biocontainers.pro
Apache License 2.0

BioDocker Future Ideas and Implementations #59

Closed ypriverol closed 3 years ago

ypriverol commented 8 years ago

Hi all: As you may know, the BioDocker paper is now under review at MCP, and we hope it will be back soon with good news. In the meantime, I would like to open this issue to discuss a couple of ideas that I would like to move forward with you. These ideas summarize previous discussions with @prvst @bgruening @pcm32 @rajido and other members of the project:

  1. Rebrand BioDocker as BioContainers. In the near future we would like to explore other technologies and not only docker containers; this will broaden the range of technologies and sub-projects within BioDocker. My idea is to keep all the domains and repositories for BioDocker, but if you agree we can plan this migration. In the future "BioContainers" will provide containers in different flavors and technologies like docker or rkt (https://medium.com/@adriaandejonge/moving-from-docker-to-rkt-310dc9aec938#.feg01yiqo), with bioconda packages, etc. BioDocker will be more of a community umbrella that provides the discussion environment, infrastructure and coordination for all of these packages. If we agree on this, I can take the lead in planning the migration.
  2. Deploy BioDocker containers in multiple places. We have been suffering from the performance of docker hub. Also, other repositories such as https://quay.io/ are getting really popular, and some initiatives like Phenomenal (@pcm32) are interested in hosting our containers. We have a previous issue, #57. I would like to see someone take the lead on this, e.g. on how to deploy automatically in multiple places. Off the top of my head, @bgruening can help with this.
  3. Container testing: We have discussed the testing of our containers a lot: how to test them, which guidelines we can define to test our containers automatically, continuous testing, etc. @sauloal was exploring some ideas before, but we still don't have a clear idea or solution. I was talking with @pcm32 and he would like to open a couple of issues and reopen the discussion.
  4. External contribution: We need to see how to promote BioDocker and make it easy for other developers to embrace it and contribute to it. At the time of writing we have 35 docker containers. We cannot scale the current way of generating containers because it is basically a personal effort of @prvst and myself. I would like to open a discussion about how to make the biodocker project more scalable. Probably @rajido has some ideas about how they did it in the BioJS project.

Regards, and I hope you can contribute to these ideas.

pcm32 commented 8 years ago

Thanks for starting this @ypriverol

I agree on point 1.

Regarding point 2, I'll soon start testing an object store backed by the EBI for our current private (for pushing) docker registry. If this works well, and I then manage to have multiple load-balanced appliances running the registry v2 (well, v2.5) against that storage, I would presume that we would at least be technically able to provide registry services for biodocker (@ypriverol spoke about hundreds of images and a large number of pulls). Of course, whether this actually happens (EBI hosting a registry for biodocker through us) would depend on people above my pay grade, but at least the technical solution would be ready.

I have opened a separate thread regarding point 3.

bgruening commented 8 years ago

@ypriverol thanks for putting this together. Here are a few more details about the recent work in this area: https://github.com/bioconda/bioconda-recipes/issues/2297

As I said, I'm happy to move this entire project over to the biodocker project.

Thanks again @ypriverol for putting this together and for driving this community!

rajido commented 7 years ago

Sorry for the late reply. I just wanted to share with you a few ideas on promoting Biodocker and increasing external contribution (@ypriverol's point number 4).

Training materials

I think good training materials about how to create and use biodocker containers could help trainers to promote BioDocker. Training materials can be easily promoted in www.mygoblet.org and www.tess.elixir-uk.org. Let's create training materials.

Biodocker for training purposes

Having biodocker containers for training purposes could help training teams to deliver training better. Most training courses need specific tools, scripts, datasets, tutorials, ... preinstalled. Let's engage with training coordinators to provide them with Biodocker containers for training purposes, especially for popular courses that are run several times in several places.

Use cases

Who is the target user of Biodocker? At the moment it seems to be the kind of technical person who wants to share and help others to easily deploy tools. This is great, but we should look for additional and better-selling use cases. I see that Biodocker could be a good friend of journals. Journals are becoming very supportive of reproducible research, and I think Biodocker provides a very good solution: encapsulation of the research outcomes of an experiment, including software, workflows, data analysis, etc. Let's engage with publishers.

Recognition

The current biodocker registry does not highlight people's contributions at all. Look at the Biojs registry and you will see that people come first: people with their names and their pictures associated with the tools they create. Let's make a registry that recognises people and their contributions.

Engagement in events

We need to engage with the community in hackathons, workshops, technical tracks, ... especially with the community interested in this topic. We have an opportunity in ... http://www.igst.it/nettab/2016/programme/hackathon/ ... we need Biodocker people there!

lgatto commented 7 years ago

I think @rajido touches on essential points. This will help us, who are already (somehow) involved, and will broaden the scope of the project. Putting effort into community building and into use cases that make the project relevant to non-bioinformaticians is essential.

ypriverol commented 7 years ago

Hi @BioContainers/contributors: I would like to give a quick update on this issue, taking into account some discussions and requests from users and the ELIXIR community. The last update was in Sep 2016, so we want to comment on the following topics:

  1. We have two registries right now (quay.io and dockerhub), and the only way to know where to find the containers is by going to the registry/website. It would be great to have a command line tool that can be used to find our containers (something like the web interface, but from the command line). Any volunteers?

  2. Over the next months, we need to agree on a single way to define the version in the labeling of the container. If in the future we have OpenMS 2.1 in both Dockerhub and Quay.io, the two should be seen as replicates of the same image and not as two different containers, which is impossible unless the version is expressed the same way on both sides. For example, if you look at the registry now for OpenMS 2.0, the versions in Quay.io and dockerhub are completely different:

    • DockerHub : LABEL version="2"
    • Quay.io : 2.1.0--py35_boost1.63_0
  3. Implementation errors to be solved:

    • [ ] Related to the previous topic: I agree with @bgruening that we should tag every quay.io image with latest, which is important for the users. The latest version will always be the most recent push to quay.io.
    • [ ] Empty images/containers (curation to remove the empty ones). This needs to be solved in @bgruening's auto-mulled. Ideally this should be fixed by improving the Conda recipe. Volunteers?
    • [ ] Refine the description script: take the URL from the field called home in the about section. Store the information in variables in travis and define a timeline. (i) Check in the bioconda API which packages were updated most recently. (ii) https://github.com/galaxyproject/galaxy-lib/blob/master/galaxy/tools/deps/mulled/mulled_build_channel.py (@ypriverol)
  4. BioContainers and BioConda joint statement. @bgruening and I have agreed to make a public statement to put both communities in line and to talk about both of them together everywhere. This will increase the number of people involved in @BioContainers, and will also enable us to optimize the mechanism we use to create containers automatically.

  5. Create containers for singularity and discuss where we can deploy them, with them or with us. Apart from having the dockerhub containers and the quay.io slim containers, we need to move forward with the creation of singularity containers. @bgruening thinks that the best way of doing this is by creating the singularity images from the bioconda packages, as we did before with other images. We need to talk with the singularity project to see where we can register these containers. Probably @gmkurtzer can help us with this.
    (Note from Bjoern: we were able to convert 2000 BioContainers to Singularity ones thanks to the nice converter tool from singularity.)

  6. Training: Training, as highlighted by @lgatto and @rajido, should be the main priority for the team. Many people are now familiar with containers and how to use them, so we need to give priority to this topic. So far I see some ways of moving this forward (let's call it a roadmap):

@rajido, would it be possible to run a hackathon on BioContainers where we bring together all the BioContainers members to discuss future ideas and develop some of these tools?
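
To make points 1 and 2 of the list above a bit more concrete, here is a minimal sketch of how a lookup tool could decide whether the copies of a tool in the two registries are replicates of the same release. Everything in it is an assumption for illustration only: the function names, the in-memory `EXAMPLE_INDEX` (a real tool would have to fetch the listings from each registry), the rule that the part after `--` is a build string to be ignored, and the rule that a short version like `2` is padded to `2.0.0`. The two version strings are the OpenMS ones quoted above.

```python
import re

# Hypothetical, hand-written listing: registry -> tool -> version strings.
# A real command line tool would fetch this from each registry instead.
EXAMPLE_INDEX = {
    "dockerhub": {"openms": ["2"]},
    "quay.io": {"openms": ["2.1.0--py35_boost1.63_0"]},
}


def normalize_version(raw, width=3):
    """Reduce a version label to a tuple of integers, e.g. (2, 1, 0).

    Assumptions: anything after "--" is a build string and is dropped,
    and missing components are padded with zeros up to `width`.
    """
    upstream = raw.split("--", 1)[0].lstrip("v")
    parts = []
    for piece in upstream.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            raise ValueError(f"cannot parse component {piece!r} in {raw!r}")
        parts.append(int(match.group()))
    parts += [0] * (width - len(parts))
    return tuple(parts)


def find_containers(tool, index):
    """Return (registry, raw_version, normalized_version) for every hit."""
    hits = []
    for registry, repos in sorted(index.items()):
        for version in repos.get(tool.lower(), []):
            hits.append((registry, version, normalize_version(version)))
    return hits


def are_replicates(hits):
    """True when all hits share the same normalized version."""
    return len({normalized for _, _, normalized in hits}) == 1


if __name__ == "__main__":
    hits = find_containers("OpenMS", EXAMPLE_INDEX)
    for registry, raw, normalized in hits:
        print(f"{registry}: {raw} -> {normalized}")
    print("replicates:", are_replicates(hits))
```

With the two labels above, the sketch reports that the images are not replicates, which is exactly the situation a common versioning convention should prevent. The padding rule is the debatable part: whether `2` should mean `2.0.0` or "any 2.x" is something we would need to agree on.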