Closed pettyalex closed 2 weeks ago
Could you please mark this PR as draft? The dockerfile doesn't build successfully yet (according to the GH Actions log) and I think it will require some edits prior to review from our team.
We would love to have additional tests for these features built into the dockerfile, preferably in the test stage of the dockerfile.
And my last thought: it may be good to also update the samtools and htslib dockerfiles, as I imagine they are missing these features too (I have not checked though, don't quote me). This can be done as part of this PR or separately.
@pettyalex Thank you for raising this issue and making a pull request. GCS/S3 and libdeflate support are important features that we missed while building version 1.20. As a general principle, we avoid overwriting images we created before because we don't want to break people's pipelines and validations. Another common practice here is "one tool, one PR": it is very easy to miss something in a crowded pull request. Besides the tests at the end, I personally check the build logs to catch silent errors.
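For readers who want to repeat that log check, here is a minimal sketch. The helper name `scan_build_log` and the pattern list are assumptions for illustration, not a StaPH-B convention; adjust the patterns to whatever the build actually prints.

```shell
# Hypothetical helper: print suspicious lines from a saved build log.
# The patterns are a guess at what a silently degraded build might emit.
scan_build_log() {
    # -n: show line numbers, -i: ignore case, -E: extended regex
    grep -n -i -E 'error|warning|not found|disabled' "$1"
}
```

One way to use it is to save the log first, for example with `docker build --progress=plain --target test . 2>&1 | tee build.log`, and then run `scan_build_log build.log`. The function exits non-zero when nothing matches, which is the desired outcome here.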
So, I will request a few changes from you:
I have revisited the bcftools GitHub and rechecked installation notes. I think this is a good chance to enable other features too. Please see my changes below.
Lastly and optionally, you can add your email and name to maintainer labels.
Any further tests, recommendations, and feedback will be appreciated. Thank you,
# for easy upgrade later. ARG variables only persist during build time
ARG BCFTOOLS_VER="1.20"
FROM ubuntu:jammy as builder
# re-instantiate variable
ARG BCFTOOLS_VER
# install dependencies, cleanup apt garbage
RUN apt-get update && apt-get install --no-install-recommends -y \
wget \
ca-certificates \
perl \
bzip2 \
autoconf \
automake \
make \
gcc \
zlib1g-dev \
libbz2-dev \
liblzma-dev \
libcurl4-gnutls-dev \
libssl-dev \
libperl-dev \
libgsl0-dev \
libdeflate-dev \
procps && \
rm -rf /var/lib/apt/lists/* && apt-get autoclean
# download, compile, and install bcftools
RUN wget https://github.com/samtools/bcftools/releases/download/${BCFTOOLS_VER}/bcftools-${BCFTOOLS_VER}.tar.bz2 && \
tar -xjf bcftools-${BCFTOOLS_VER}.tar.bz2 && \
rm -v bcftools-${BCFTOOLS_VER}.tar.bz2 && \
cd bcftools-${BCFTOOLS_VER} && \
./configure --enable-libgsl --enable-perl-filters &&\
make && \
make install && \
make test
### start of app stage ###
FROM ubuntu:jammy as app
# re-instantiate variable
ARG BCFTOOLS_VER
# putting the labels in
LABEL base.image="ubuntu:jammy"
LABEL dockerfile.version="1"
LABEL software="bcftools"
LABEL software.version="${BCFTOOLS_VER}"
LABEL description="Variant calling and manipulating files in the Variant Call Format (VCF) and its binary counterpart BCF"
LABEL website="https://github.com/samtools/bcftools"
LABEL license="https://github.com/samtools/bcftools/blob/develop/LICENSE"
LABEL maintainer="Erin Young"
LABEL maintainer.email="eriny@utah.gov"
LABEL maintainer2="Curtis Kapsak"
LABEL maintainer2.email="kapsakcj@gmail.com"
# install dependencies required for running bcftools
# https://github.com/samtools/bcftools/blob/develop/INSTALL#L29
RUN apt-get update && apt-get install --no-install-recommends -y \
perl \
zlib1g \
gsl-bin \
bzip2 \
liblzma5 \
libcurl4-gnutls-dev \
libdeflate0 \
procps \
&& apt-get autoclean && rm -rf /var/lib/apt/lists/*
# copy in bcftools executables from builder stage
COPY --from=builder /usr/local/bin/* /usr/local/bin/
# copy in bcftools plugins from builder stage
COPY --from=builder /usr/local/libexec/bcftools/* /usr/local/libexec/bcftools/
# set locale settings for singularity compatibility
ENV LC_ALL=C
# set final working directory
WORKDIR /data
# default command is to pull up help options
CMD ["bcftools", "--help"]
### start of test stage ###
FROM app as test
# running --help and listing plugins
RUN bcftools --help && bcftools plugin -lv
# install wget for downloading test files
RUN apt-get update && apt-get install -y wget vcftools
RUN echo "downloading test SC2 BAM and FASTA and running bcftools mpileup and bcftools call test commands..." && \
wget -q https://raw.githubusercontent.com/artic-network/artic-ncov2019/master/primer_schemes/nCoV-2019/V4/SARS-CoV-2.reference.fasta && \
wget -q https://raw.githubusercontent.com/StaPH-B/docker-builds/master/tests/SARS-CoV-2/SRR13957123.primertrim.sorted.bam && \
bcftools mpileup -A -d 200 -B -Q 0 -f SARS-CoV-2.reference.fasta SRR13957123.primertrim.sorted.bam | \
bcftools call -mv -Ov -o SRR13957123.vcf
RUN echo "testing plugins..." && \
bcftools +counts SRR13957123.vcf
RUN echo "testing polysomy..." && \
wget https://samtools.github.io/bcftools/howtos/cnv-calling/usage-example.tgz &&\
tar -xvf usage-example.tgz &&\
zcat test.fcr.gz | ./fcr-to-vcf -b bcftools -a map.tab.gz -o outdir/ &&\
bcftools cnv -o cnv/ outdir/test.vcf.gz &&\
bcftools polysomy -o psmy/ outdir/test.vcf.gz &&\
head psmy/dist.dat
RUN echo "reading test data from Google Cloud to validate GCS support" && \
bcftools head -h 20 gs://genomics-public-data/references/hg38/v0/1000G_phase1.snps.high_confidence.hg38.vcf.gz
RUN echo "reading test data from S3 to validate AWS support" && \
bcftools head -h 20 s3://human-pangenomics/T2T/CHM13/assemblies/variants/GATK_CHM13v2.0_Resource_Bundle/resources-broad-hg38-v0-1000G_phase1.snps.high_confidence.hg38.t2t-chm13-v2.0.vcf.gz
Thank you for the feedback!
About libcurl4-gnutls-dev vs libcurl3-gnutls: https://askubuntu.com/questions/469360/what-is-the-difference-between-libcurl3-and-libcurl4
Libcurl3 is ABI compatible with libcurl4, so the name of the compiled library has not been incremented. That means that libcurl3-gnutls is the correct runtime library for libcurl4-gnutls-dev, and if you look in the libcurl4-gnutls-dev package it indeed contains libcurl3-gnutls
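One way to confirm which libcurl runtime the compiled binary actually needs is to look at its dynamic dependencies. The sketch below assumes `ldd`-style output on stdin; the function name `curl_sonames` is hypothetical.

```shell
# Hypothetical helper: read ldd-style output on stdin and print the
# shared-object names of any libcurl variants it lists.
curl_sonames() {
    # awk ignores leading whitespace, so $1 is the soname column
    awk '/libcurl/ { print $1 }'
}
```

Inside the image this could be run as `ldd "$(command -v bcftools)" | curl_sonames`, and `apt-cache depends libcurl4-gnutls-dev` should show which runtime package the dev package pulls in.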
@pettyalex Thank you very much for the changes. This looks great!
I need one minor change as you see in the checklist. You will need to add `<li>[1.20.c](./bcftools/1.20.c/)</li>` to main README.md line 120 as below. If you enable "Allow edits from maintainers", I can make any more cosmetic changes if necessary.
I will merge and deploy this image. Thanks!
Before:
| [bcftools](https://hub.docker.com/r/staphb/bcftools/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bcftools)](https://hub.docker.com/r/staphb/bcftools) | <ul><li>[1.10.2](./bcftools/1.10.2/)</li><li>[1.11](./bcftools/1.11/)</li><li>[1.12](./bcftools/1.12/)</li><li>[1.13](./bcftools/1.13/)</li><li>[1.14](./bcftools/1.14/)</li><li>[1.15](./bcftools/1.15/)</li><li>[1.16](./bcftools/1.16/)</li><li>[1.17](./bcftools/1.17/)</li><li>[1.18](bcftools/1.18/)</li><li>[1.19](./bcftools/1.19/)</li><li>[1.20](./bcftools/1.20/)</li></ul> | https://github.com/samtools/bcftools |
After:
| [bcftools](https://hub.docker.com/r/staphb/bcftools/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bcftools)](https://hub.docker.com/r/staphb/bcftools) | <ul><li>[1.10.2](./bcftools/1.10.2/)</li><li>[1.11](./bcftools/1.11/)</li><li>[1.12](./bcftools/1.12/)</li><li>[1.13](./bcftools/1.13/)</li><li>[1.14](./bcftools/1.14/)</li><li>[1.15](./bcftools/1.15/)</li><li>[1.16](./bcftools/1.16/)</li><li>[1.17](./bcftools/1.17/)</li><li>[1.18](bcftools/1.18/)</li><li>[1.19](./bcftools/1.19/)</li><li>[1.20](./bcftools/1.20/)</li><li>[1.20.c](./bcftools/1.20.c/)</li></ul> | https://github.com/samtools/bcftools |
@pettyalex Thank you for your contribution! You can check the image deployment from here: https://github.com/StaPH-B/docker-builds/actions/runs/10493960570. The image will be available on both Dockerhub and Quay.io
Enable AWS S3, GCS, and libdeflate support for bcftools by running ./configure before compiling
This fixes https://github.com/StaPH-B/docker-builds/issues/1018
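As a rough sketch of what the configure step can look like when the features are requested explicitly: the last four options below come from HTSlib's configure script, and it is an assumption here that the bcftools release tarball forwards them to its bundled HTSlib. The function name is hypothetical. Requesting features explicitly, rather than relying on auto-detection, is meant to turn a missing development package into a configure error instead of a silently reduced build.

```shell
# Hypothetical wrapper; call it from inside the unpacked bcftools source tree.
# Assumption: the bundled HTSlib configure receives the libcurl/S3/GCS/libdeflate options.
configure_bcftools_strict() {
    ./configure \
        --enable-libgsl \
        --enable-perl-filters \
        --enable-libcurl \
        --enable-s3 \
        --enable-gcs \
        --with-libdeflate
}
```

The block only defines the function; nothing runs until `configure_bcftools_strict` is called, after which `make` and `make install` would follow as in the Dockerfile.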
If you want to merge this, I don't see a way to mark another build number for an already published package, but I'd be glad to update that if it exists.
I'd also be glad to add tests that test reading from AWS S3 or GCS storage directly to validate that these features are working.
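Such a test could go one step beyond checking the exit code and confirm that what came back is really a VCF header. This is a minimal sketch; the function name `looks_like_vcf_header` is hypothetical, and the check is deliberately loose.

```shell
# Hypothetical check: succeed only if the first line on stdin is a
# VCF fileformat line, e.g. "##fileformat=VCFv4.2".
looks_like_vcf_header() {
    head -n 1 | grep -q '^##fileformat=VCF'
}
```

In a test stage it could be chained after a remote read, for example `bcftools head <remote-url> | looks_like_vcf_header`, with a `gs://` or `s3://` URL pointing at a public VCF.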
Pull Request (PR) checklist:
docker build --tag samtools:1.15test --target test docker-builds/samtools/1.15
spades/3.12.0/Dockerfile
shigatyper/2.0.1/test.sh
spades/3.12.0/README.md