bioconda / bioconda-recipes

Conda recipes for the bioconda channel.
https://bioconda.github.io
MIT License

Join the team (post here to be added to the bioconda team) #1

Closed johanneskoester closed 5 years ago

johanneskoester commented 8 years ago

Everybody is welcome to contribute a package. Simply reply to this issue and you will be added to the bioconda team.

Edit: After you post here, we will email you an invitation through github to join the bioconda team. Click the link in that invitation in order to be added.

Edit: To have the Bioconda logo display in your profile, navigate to https://github.com/orgs/bioconda/people and find yourself. Click on 'Private' and select 'Public'.

hyeshik commented 8 years ago

Johannes, this is a great idea! I have often felt the need for an OS-independent package catalogue for computational biology that works within user space. After a few minutes of trying Conda, I agree that it's a wonderful platform for such a system. I would like to join the team and contribute the packages that I use. I have experience as a package maintainer (a so-called ports committer) for the FreeBSD operating system, for about 5 years in the early 2000s.

johanneskoester commented 8 years ago

Great! I have sent you an invitation.

daler commented 8 years ago

Hi Johannes - I'd like to help with this, too. I like the idea of having one obvious place to find and contribute bio-related conda packages. I already have a handful of conda packages that could be useful here (https://conda.binstar.org/daler).

johanneskoester commented 8 years ago

Hi Ryan! Great to hear that! I have sent you an invitation. Feel free to add your packages!

johanneskoester commented 8 years ago

Btw Hyeshik, please also create an anaconda.org account so I can add you to the bioconda team over there, as well.

daler commented 8 years ago

FYI, https://github.com/chapmanb/bcbio-conda has a lot of conda packages already. I'm not sure how much of that is tied specifically to versions needed in the bcbio-* packages though.

johanneskoester commented 8 years ago

Good to know! Maybe we can invite Brad to migrate their stuff once we have reached a critical mass with Bioconda.

hyeshik commented 8 years ago

@johanneskoester Thank you! I just created an account on anaconda.org.

johanneskoester commented 8 years ago

Great. Added you!

hyeshik commented 8 years ago

@johanneskoester It seems that I don't have permission to write to the recipes GitHub repo. Can I get it? Thank you!

johanneskoester commented 8 years ago

Of course. Sorry for that, does it work now?

hyeshik commented 8 years ago

It works. Thank you!

dkoppstein commented 8 years ago

@johanneskoester, I am enthusiastic about this idea as well, especially since I've been using wonky Makefiles to deal with software environments for my company =) I'd love to contribute in any way possible, although I don't have much experience with conda apart from being an end-user. My anaconda.org account is dkoppstein.

johanneskoester commented 8 years ago

Hi, glad to hear that you want to join us, thanks! Conda packaging is really easy. Basically, it is just some metadata plus a shell script with the commands you would use if you installed the tool manually.
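
To make the "metadata plus a shell script" idea concrete, here is a minimal sketch that writes such a recipe to disk. The tool name `mytool`, its version, the example.org URLs, and the `zlib` dependency are all hypothetical placeholders, and the exact set of metadata keys can differ between conda-build versions, so treat this as an illustration rather than a reference recipe.

```python
from pathlib import Path
import tempfile

# Hypothetical metadata: name, version, URLs and dependencies are placeholders.
META_YAML = """\
package:
  name: mytool
  version: "1.0.0"

source:
  url: https://example.org/mytool-1.0.0.tar.gz

requirements:
  build:
    - zlib
  run:
    - zlib

test:
  commands:
    - mytool --help

about:
  home: https://example.org/mytool
  license: MIT
  summary: Placeholder description of a hypothetical tool
"""

# The same commands one would type for a manual install; during a conda
# build, $PREFIX points at the environment the package is installed into.
BUILD_SH = """\
#!/bin/bash
./configure --prefix="$PREFIX"
make
make install
"""


def write_recipe(recipe_dir):
    """Write meta.yaml and build.sh into recipe_dir and return the file names."""
    recipe_dir = Path(recipe_dir)
    recipe_dir.mkdir(parents=True, exist_ok=True)
    (recipe_dir / "meta.yaml").write_text(META_YAML)
    (recipe_dir / "build.sh").write_text(BUILD_SH)
    return sorted(p.name for p in recipe_dir.iterdir())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(write_recipe(Path(tmp) / "mytool"))
```

A directory like this is what `conda build` takes as its argument.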

chapmanb commented 8 years ago

Johannes; This is a great initiative, thank you for putting it together. I'd love to contribute as well. We have a lot of packages prepared for bcbio dependencies and could happily move over to more community driven packaging:

https://anaconda.org/bcbio

It may also be worth getting in touch with the CGAT folks, who have a wide variety of conda tools as well:

https://anaconda.org/cgat

What Linux platform do you target builds for? In bcbio we need to support people running on older platforms, so we build everything in a CentOS5 docker container. We have a ready-to-run script that does this, only re-building new recipes that are missing from anaconda.org:

https://github.com/chapmanb/bcbio-conda#readme
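
The "only re-build what is missing" idea can be sketched in a few lines. This is not the bcbio-conda script itself, just an illustration of the selection step; the function name and the package lists are hypothetical, and a real script would obtain the uploaded list by querying the channel.

```python
def recipes_to_build(recipes, uploaded):
    """Return the (name, version) pairs that have a recipe but no uploaded build."""
    return sorted(set(recipes) - set(uploaded))


# Illustrative data only, not the actual contents of any channel.
recipes = [("bedtools", "2.24.0"), ("samtools", "1.2"), ("mytool", "1.0.0")]
uploaded = [("samtools", "1.2")]

if __name__ == "__main__":
    for name, version in recipes_to_build(recipes, uploaded):
        print(f"needs build: {name} {version}")
```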

Thanks again for organizing this.

johanneskoester commented 8 years ago

Hi Brad, I'm very glad you want to join. We had already thought about contacting you. Great work with bcbio. So, feel free to move stuff over! Regarding the builds, that is a very good question. So far, my impression was that if we target e.g. linux-64 and include all dependencies for build and run as conda packages, the resulting builds should be independent of the underlying Linux platform. For example, conda ships its own libstdc++ etc., right? Am I missing something here?

Best, Johannes

chapmanb commented 8 years ago

Johannes; Thanks so much for including me. Regarding builds, unfortunately it is not that isolated. You'll compile against system packages which will cause failures on different systems. glibc is the most common cause of these. For instance, the current bedtools build fails to run on CentOS6:

```
~/test/anaconda/bin/bedtools
/cm/shared/apps/bcbio/20141204-devel/data/anaconda/bin/bedtools: /lib64/libc.so.6: version `GLIBC_2.15' not found (required by /cm/shared/apps/bcbio/20141204-devel/data/anaconda/bin/bedtools)
/cm/shared/apps/bcbio/20141204-devel/data/anaconda/bin/bedtools: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /cm/shared/apps/bcbio/20141204-devel/data/anaconda/bin/bedtools)
```

This is the same issue you're seeing in #2.

The anaconda folks build their packages on CentOS5 which avoids most of these issues (although they still do get system-specific things -- the curl package doesn't work on non-RedHat systems due to certificate differences). With a CentOS5 based automated build you can avoid the most common compatibility issues and I haven't had any problems with portability of packages so far. I could replicate the setup we have in bcbio-conda here if that would help.
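
As a rough illustration of the glibc mismatch described above, the following sketch pulls the required GLIBC versions out of loader errors like the ones quoted and compares them with the glibc of a target system. The function names are made up for this example, and the error text is an abbreviated form of the bedtools output.

```python
import re

# Matches loader messages such as: version `GLIBC_2.15' not found
GLIBC_RE = re.compile(r"version `GLIBC_(\d+(?:\.\d+)+)' not found")


def required_glibc(error_text):
    """Return the highest GLIBC version named in the errors as a tuple, or None."""
    versions = [
        tuple(int(part) for part in match.split("."))
        for match in GLIBC_RE.findall(error_text)
    ]
    return max(versions) if versions else None


def runs_on(required, system_glibc):
    """A binary can load if the system glibc is at least as new as required."""
    return required is None or system_glibc >= required


# Abbreviated form of the two error lines from the bedtools run.
ERROR_TEXT = """\
bedtools: /lib64/libc.so.6: version `GLIBC_2.15' not found (required by bedtools)
bedtools: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by bedtools)
"""

if __name__ == "__main__":
    needed = required_glibc(ERROR_TEXT)
    # CentOS 6 ships glibc 2.12, which is older than what this build requires.
    print(needed, runs_on(needed, (2, 12)))
```

Building on an old distribution keeps the required version low, which is why a binary built against the glibc 2.5 of CentOS 5 loads on nearly every newer system. On Linux, `platform.libc_ver()` from the Python standard library reports the local C library version.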

johanneskoester commented 8 years ago

Hi Brad, thanks for the insight! But what about packages like glibc? If you manage to build against that, wouldn't bedtools work on all platforms, regardless of the system glibc?

johanneskoester commented 8 years ago

Oh, I have just seen that this package just seems to copy the system libs. Strange... Ok, in this case, it would be great if you could replicate your CentOS setup for us, thank you!

johanneskoester commented 8 years ago

Alternatively, we could try to get the binstar builds up and running. I experimented a bit with them, but I never managed to trigger a build. I can only submit them, but they don't seem to be assigned to a build queue.

dkoppstein commented 8 years ago

@johanneskoester, do you envision this being Linux-64 only or also Mac/Windows? If the latter, perhaps it makes sense to try to appropriate funds for organization status so we can use those types of build nodes (assuming we can get binstar to work).

FWIW I'm totally fine with it being Linux-64 only.

johanneskoester commented 8 years ago

I think linux is most important. I have uploaded some MacOS X builds because I need them for a Snakemake tutorial I am giving. But in general, I would say we should start with Linux and see where the journey takes us.

I would propose the following plan:

  1. Build for linux-64/CentOS 5 with Brad's setup.
  2. Once the automated binstar builds work, build for linux-64 via the binstar queue.
  3. Depending on user requests, decide at some point if there is a market for extending builds to MacOS X.

I think I won't go for Windows, because many tools don't support it at all. I also don't want to encourage people to use Win for bioinformatics ;-).

chapmanb commented 8 years ago

Johannes and David; Thanks for the thoughts on the build ideas. I pushed scripts which we can use for building and uploading packages on Linux and MacOSX. For Linux, it uses a CentOS 5 docker container, which will hopefully provide widely compatible binaries. For MacOSX, we should just build directly on a Mac machine, but this will require marking tools that don't build on Mac, or where the recipe uses Linux binaries.

I would love to eventually use the anaconda.org builds, but I explored this a year ago and didn't have any luck getting it set up, even after offering to pay for build boxes. I got the feeling it's still under development on their side, but the situation could have changed in the interim.

Let me know if you have any problems running the scripts, I'm happy to improve the docs or scripts as needed. Thanks again.

tomkinsc commented 8 years ago

I would like to join!

sauloal commented 8 years ago

Hey,

I think reproducibility is a hot topic now. We are also working on a similar project: https://github.com/BioDocker/biodocker

Apparently there's a lot of duplicated work here (and we are not the only ones).

It would be interesting to partner and for sure we could install your conda packages inside our docker images instead of downloading from source, reducing the download size.

Also, could you please tell me what the advantages of conda over docker are? I've personally never used it, but the multiplatform nature and no "mount/port forward/etc" crap seem very nice.

regards

daler commented 8 years ago

The biggest disadvantage of Docker for me personally is that it is not allowed on our HPC cluster! In contrast, conda installs everything in my home directory without the need for any elevated privileges.

Conda is mostly for installing executables and libraries. You can't do things like run an isolated mysql server using conda. But that's exactly the sort of thing docker is good at.

While conceptually there's a lot of overlap between the projects, it looks like there's not much domain overlap yet: biodocker has lots of proteomic packages while bioconda has lots of sequencing packages. One way to minimize duplicated work while taking advantage of both projects would be to 1) port existing dockerfiles into conda packages under bioconda and then 2) pull conda packages into docker containers built under biodocker.

sauloal commented 8 years ago

@daler I see. Thank you very much. I was also struggling with how to run the programs on our HPC. This is great.

BTW, the initial creators are indeed from the proteomics field, but I'm from genomics and I've started porting several programs: https://github.com/BioDocker/sandbox

I really like the idea of creating packages here and installing them inside docker. It makes the docker images lighter, and the same package can be used inside docker, outside docker, and on HPC. Three in one.

I like this idea very much. We'll keep talking.

percyfal commented 8 years ago

Hi Johannes,

great initiative, I'd be pleased to chip in. I recently started using conda and I haven't gotten round to learning how to build packages, but looking at your recipes should help.

Cheers,

Per

johanneskoester commented 8 years ago

Hi @percyfal, @tomkinsc and @sauloal, Glad to hear that you are interested in the project! I have invited you to the github team. I can also provide access to the corresponding anaconda team if you give me your anaconda.org usernames. The latter is not urgent, we should soon have automatic builds so that direct interaction with anaconda might be rarely needed.

@sauloal: indeed, BioDocker sounds like a perfect complement for bioconda! I would be extremely happy to cooperate and advertise the two together!

xguse commented 8 years ago

I have been doing this for my group's stuff for a while! Glad to see an organized effort somewhere!

I would love to contribute the packages I have built that are not already represented! Please add me to the team!

Gus

percyfal commented 8 years ago

Great, thanks for adding me @johanneskoester. My username on anaconda is percyfal.

johanneskoester commented 8 years ago

Welcome Gus! Do you have the same username on anaconda.org? Then I can add you to the team there as well.

xguse commented 8 years ago

@johanneskoester : no I am gusdunn on anaconda.org. Thanks again!

Gus

johanneskoester commented 8 years ago

ok, added!

roryk commented 8 years ago

Thanks for setting this up everyone, can I help out? I'm roryk on anaconda.org and here.

johanneskoester commented 8 years ago

Welcome! I have added you to the teams.

kyleabeauchamp commented 8 years ago

The docs seem to suggest that people need to join the team to contribute, but I wonder if it makes sense to encourage people to file pull requests as well? AFAIK that seems like the easiest path to making a contribution.

johanneskoester commented 8 years ago

Good idea, I have added a sentence about that (in case somebody does not want to be a permanent team member).

sebastian-luna-valero commented 8 years ago

Dear @johanneskoester ,

Sorry for my late response. I would be very happy to contribute to bioconda on behalf of CGAT.

Here is our anaconda channel: https://anaconda.org/cgat

Many thanks and congratulations for this great initiative!

Best regards, Sebastian.

johanneskoester commented 8 years ago

Great, I will add you to the team! Welcome!

ostrokach commented 8 years ago

This looks like a great idea and should avoid a lot of duplicated effort. Can I join in as well? My github and anaconda usernames are both ostrokach.

BTW, I think something similar was discussed on the conda email list a while back. It didn't get too far, but there are a few teams there that might be interested in helping out (e.g. ioos, omnia, tacaswell).

brentp commented 8 years ago

Hi, could you add me to the team? I'm brentp on anaconda.org as well. Thanks for creating this project.

johanneskoester commented 8 years ago

Welcome!

guillermo-carrasco commented 8 years ago

Hi,

Could you please add me to the team? I'd like to contribute to the community by uploading bcbio-monitor to begin with.

Thanks!

johanneskoester commented 8 years ago

Sure, glad to have a new contributor!

guillermo-carrasco commented 8 years ago

Thank you @johanneskoester !

moonso commented 8 years ago

Hello, I would also like to join the team!

Thank you

robinandeer commented 8 years ago

Hi, @guillermo-carrasco recommended that I add chanjo and some exciting future tools to the recipes.

Can I join the team? :ocean:

johanneskoester commented 8 years ago

Welcome, you two! I have sent an invitation!

moonso commented 8 years ago

:+1: