BioContainers / containers

Bioinformatics containers
http://biocontainers.pro
Apache License 2.0
674 stars 246 forks source link

CADD scripts with bundled required environments #546

Closed fa2k closed 9 months ago

fa2k commented 9 months ago

Submitting a Container

Attempt to address #528.

Warning: I'm creating a container with the same name as the existing cadd-scripts from bioconda (https://biocontainers.pro/tools/cadd-scripts). The hope that I can replace it with an improved container image. I hope this is fine, or we could maybe rename this to cadd-scripts-bundle or some other name. See the issue and first checklist item for the rationale.

Check BioContainers' Dockerfile specifications

Checklist

  1. Misc

    • [ ] My tool doesn't exist in BioConda
    • cadd-scripts exists in BioConda, but has problems functioning as a container image. (1) it requires the conda command, which is not included in the current quay.io/biocontainers/cadd-scripts:1.6--hdfd78af_1. (2) It requires write-access to the container filesystem when executed.
    • [x] The image can be built
  2. Metadata

    • [x] LABEL base_image
    • [x] LABEL version
    • [x] LABEL software.version
    • [x] LABEL about.summary
    • [x] LABEL about.home
    • [x] LABEL about.license
    • [x] MAINTAINER
  3. Extra (optionals)

    • [ ] I have written tests in test-cmds.txt
    • [x] LABEL extra.identifier
    • [x] LABEL about.documentation
    • [x] LABEL about.license_file
    • [x] LABEL about.tags
fa2k commented 9 months ago

One question may be: Why not "fix" the bioconda package upstream instead, and let the container images be generated from the bioconda package? My reason is that the existing cadd-scripts works great as a bioconda package, because the conda command (required by the snakemake pipeline in the CADD scripts) is available. I think that bundling another conda command with the bioconda package could cause some havoc.

Edit to add: I'm not affiliated with CADD in any way, just a user.

mboudet commented 9 months ago

Seems ok, to me, but there should be a name difference between the official 'cadd-script' and this image, to avoid confusion I think.

(This means changing the LABEL software=, and the directory name to match)

fa2k commented 9 months ago

Thanks for the review! I have another question: Is there a place I can add some usage documenation for the image itself?

The user has to mount their CADD reference data into /opt/conda/share/cadd-scripts-1.6-1/data/annotations to use it (and that's not really obvious).

mboudet commented 9 months ago

Hmm, you should be able to add a README.md in the 'cadd-scripts-with-envs' folder, but I'm not sure if this will be enough visibility-wise?

biocontainers-bot commented 9 months ago

about.license field is not in spdx list: https://spdx.org/licenses/, if it is a typo error, please fix it. If this is not a standard license, please specify Custom License and use about.license_file label to specify license location (in container or url).

biocontainers-bot commented 9 months ago

No biotools label defined, please check if tool is not already defined in biotools (https://bio.tools) and add extra.identifiers.biotools label if it exists. If it is not defined, you can ignore this comment.

biocontainers-bot commented 9 months ago

No test-cmds.txt (test file) present, skipping tests

fa2k commented 9 months ago

Thanks for the suggestions about README and micromamba clean! I've now fixed the issues, but it doesn't build on ARM64 arch. I'm troubleshooting, so it can wait a bit.

fa2k commented 9 months ago

I'm troubleshooting, so it can wait a bit.

I could get tabix installed, but mamba is unable to solve CADD's environment in the arm64 arch - so I think this container will only be supported on x86_64.

mboudet commented 9 months ago

It should be fine to only build x86_64. I believe as a whole, biocontainers does not support arm64 build anyway (for now) (https://github.com/BioContainers/containers/issues/425)

biocontainers-bot commented 9 months ago

No biotools label defined, please check if tool is not already defined in biotools (https://bio.tools) and add extra.identifiers.biotools label if it exists. If it is not defined, you can ignore this comment.

biocontainers-bot commented 9 months ago

No test-cmds.txt (test file) present, skipping tests

mboudet commented 9 months ago

Should be fine to merge on my end.